Awni Hannun
|
1a28b69ee2
|
only add to residency set once (#2049)
|
2025-04-06 17:38:25 -07:00 |
|
Cheng
|
ba09f01ce8
|
Remove test of converting negative float to uint (#2048)
|
2025-04-06 06:21:46 -07:00 |
|
Cheng
|
6cf48872b7
|
wait_for_one should wait for task to finish (#2047)
|
2025-04-05 20:05:16 -07:00 |
|
Angelos Katharopoulos
|
7b3b8fa000
|
Fix ci release (#2045)
|
2025-04-04 20:25:01 -07:00 |
|
Awni Hannun
|
ec5e2aae61
|
nit in doc (#2044)
|
2025-04-04 12:04:17 -07:00 |
|
Awni Hannun
|
86389bf970
|
patch bump (#2043)
|
2025-04-03 13:15:18 -07:00 |
|
Jagrit Digani
|
3290bfa690
|
Add new sdpa function overload (#2035)
* Add new sdpa function overload
* Address comments
* Remove std::variant from cpp sdpa function
|
2025-04-03 11:58:28 -07:00 |
|
Jagrit Digani
|
8777fd104f
|
Depthwise Conv2D optimization (#2036)
- Add new specialized kernel for depthwise 2D convolutions with small kernels (kernel size <= 7) and small strides (strides <= 2)
- Add related tests
|
2025-04-03 09:42:04 -07:00 |
|
Awni Hannun
|
c41f7565ed
|
fix softmax / logsumexp (#2042)
|
2025-04-03 08:32:59 -07:00 |
|
Awni Hannun
|
9ba81e3da4
|
tune quant dispatch (#2031)
|
2025-04-02 20:05:54 -07:00 |
|
Awni Hannun
|
c23888acd7
|
Fix build warning (#2033)
|
2025-04-01 14:42:27 -07:00 |
|
Awni Hannun
|
f98ce25ab9
|
fix residency set for real (#2032)
|
2025-04-01 12:59:48 -07:00 |
|
Awni Hannun
|
de5f38fd48
|
Custom logsumexp (#2028)
* initial custom logsumexp
* more tests
* comments + fix
|
2025-03-31 07:36:55 -07:00 |
|
Angelos Katharopoulos
|
ec2854b13a
|
Swap -inf for finite_minimum value (#2029)
|
2025-03-30 21:55:04 -07:00 |
|
Stephen Panaro
|
90823d2938
|
Add missing funcs to docs (#2021)
|
2025-03-30 18:29:33 -07:00 |
|
Jesper Stemann Andersen
|
5f5770e3a2
|
Fix CPU sign for unsigned ints (#2024)
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
|
2025-03-30 17:56:59 -07:00 |
|
Awni Hannun
|
28f39e9038
|
Log for complex numbers in Metal (#2025)
* Log for complex numbers in Metal
* fix log2
|
2025-03-30 17:04:38 -07:00 |
|
Awni Hannun
|
b2d2b37888
|
fix residency set clearing (#2027)
|
2025-03-30 16:27:26 -07:00 |
|
Awni Hannun
|
fe597e141c
|
add pinv to doc (#2020)
|
2025-03-30 15:54:18 -07:00 |
|
Yi Wang
|
72ca1539e0
|
Remove unused variable in /setup.py (#2026)
This is a follow-up to https://github.com/ml-explore/mlx/pull/2011
|
2025-03-30 12:52:33 -07:00 |
|
Awni Hannun
|
13b26775f1
|
use minimum deployment target (#2016)
|
2025-03-28 14:31:53 -07:00 |
|
Awni Hannun
|
05d7118561
|
causal vector sdpa (#2018)
* causal vector sdpa
* get rid of memory threshold
|
2025-03-28 12:36:13 -07:00 |
|
Awni Hannun
|
98b901ad66
|
enable complex gemm (#2017)
|
2025-03-28 10:45:13 -07:00 |
|
Awni Hannun
|
5580b47291
|
iinfo and scalar overflow detection (#2009)
|
2025-03-27 19:54:56 -07:00 |
|
Awni Hannun
|
bc62932984
|
sdpa specialization for head dim 256 (#2007)
|
2025-03-27 19:31:25 -07:00 |
|
Awni Hannun
|
a6b5d6e759
|
revise cmake minimum for doctest (#2014)
|
2025-03-27 19:30:58 -07:00 |
|
Yi Wang
|
a8931306e1
|
Remove unused variable in CMakeBuild (#2011)
Fix https://github.com/ml-explore/mlx/issues/2010
|
2025-03-27 16:00:51 -07:00 |
|
Yi Wang
|
fecdb8717e
|
Polish CONTRIBUTING.md (#2005)
|
2025-03-25 19:06:34 -07:00 |
|
Awni Hannun
|
916fd273ea
|
wire cache (#2006)
|
2025-03-25 18:54:01 -07:00 |
|
Yi Wang
|
0da8506552
|
Update docs for extensions (#2004)
|
2025-03-25 18:35:03 -07:00 |
|
Cheng
|
eda7a7b43e
|
Do not join threads during process exit on Windows (#1738)
|
2025-03-25 06:33:08 -07:00 |
|
Chunyang Wen
|
022eabb734
|
Remove unused import (#1987)
|
2025-03-24 20:19:32 -07:00 |
|
Awni Hannun
|
aba899cef8
|
patch bump (#2000)
|
2025-03-24 12:47:05 -07:00 |
|
Jagrit Digani
|
6a40e1c176
|
Fix looping limit in causal attention (#1999)
|
2025-03-24 12:28:00 -07:00 |
|
Jesper Stemann Andersen
|
9307b2ab8b
|
Fixed 32-bit platform support for distributed/ring implementation (#1996)
Replaced unsigned long integer literals with size_t literals in ring implementation, e.g., 1UL with size_t(1).
|
2025-03-24 08:08:40 -07:00 |
|
Jesper Stemann Andersen
|
522d8d3917
|
Added missing netinet/in.h include that fixes build on FreeBSD (#1997)
Defines IPPROTO_TCP.
|
2025-03-24 08:07:34 -07:00 |
|
Awni Hannun
|
a84cc0123f
|
promote mask when needed (#1998)
|
2025-03-23 19:58:28 -07:00 |
|
Andrey Velichkevich
|
f018e248cd
|
fix(backend): Include algorithm library in Allocator (#1992)
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
|
2025-03-22 21:27:51 -07:00 |
|
Awni Hannun
|
cfd7237a80
|
fix docs (#1991)
|
2025-03-21 19:58:53 -07:00 |
|
Angelos Katharopoulos
|
4eef8102c9
|
Distributed layers (#1270)
|
2025-03-21 13:52:17 -07:00 |
|
Angelos Katharopoulos
|
69e4dd506b
|
Add a ring all gather (#1985)
|
2025-03-21 13:36:51 -07:00 |
|
Angelos Katharopoulos
|
25814a9458
|
Disable mpi on version mismatch (#1989)
|
2025-03-21 13:36:26 -07:00 |
|
Awni Hannun
|
2a980a76ce
|
Add stats and limit to common allocator and enable tests (#1988)
* add stats to common allocator and enable tests
* linux memory and default
* fix
|
2025-03-21 12:28:36 -07:00 |
|
Angelos Katharopoulos
|
d343782c8b
|
Cross platform libmpi loading (#1975)
|
2025-03-21 11:23:10 -07:00 |
|
Awni Hannun
|
4e1994e9d7
|
move memory APIs into top level mlx.core (#1982)
|
2025-03-21 07:25:12 -07:00 |
|
jiyzhang
|
65a38c452b
|
update the formula of smooth_l1_loss (#1986)
|
2025-03-21 06:25:23 -07:00 |
|
Awni Hannun
|
7b7e2352cd
|
fix malloc or wait deadlock (#1976)
|
2025-03-20 16:48:43 -07:00 |
|
Awni Hannun
|
1177d28395
|
patch bump (#1981)
|
2025-03-20 15:12:22 -07:00 |
|
Awni Hannun
|
005e7efa64
|
fix mask in sdpa (#1980)
* fix mask in sdpa
* fix attention mask
* Re-enable routing for array mask
---------
Co-authored-by: Jagrit Digani <digani@apple.com>
|
2025-03-20 14:53:12 -07:00 |
|
Jagrit Digani
|
b42d13ec84
|
Update attention tests to show diff, disable array masks (#1978)
|
2025-03-20 14:25:38 -07:00 |
|