Awni Hannun
736a340478
reduce binary size ( #1952 )
2025-03-11 06:30:44 -07:00
Awni Hannun
117e1355a2
fix copy for large arrays ( #1953 )
2025-03-10 15:04:25 -07:00
Awni Hannun
3c3e558c60
Support transposed head/seq for kv ( #1950 )
...
* support transposed head/seq for kv
* fix flaky test
* nit
2025-03-10 10:53:45 -07:00
Chunyang Wen
cffceda6ee
Add type hint for _extra_repr ( #1948 )
2025-03-10 06:05:36 -07:00
Chunyang Wen
048805ad2c
Remove unused modules ( #1949 )
2025-03-10 06:05:26 -07:00
Chunyang Wen
d14c9fe7ea
Add file info when raising errors in save ( #1943 )
2025-03-08 14:51:04 -08:00
Chunyang Wen
5db90ce822
Fix obscured warning ( #1944 )
2025-03-08 14:50:39 -08:00
Chunyang Wen
d699cc1330
Fix unreachable warning ( #1939 )
...
* Fix unreachable warning
* Update error message
2025-03-07 17:23:04 -08:00
Awni Hannun
c4230747a1
redesign for faster cpu/gpu synch ( #1869 )
...
* redesign for faster cpu/gpu synch
* load + more async CPU
* use command encoder API and move more ops to use it
* make fence back-end generic + CPU only fence
* faster build
* fix async eval
* fixes + handle temporaries
* fix / improve cpu conv
* remove unused status, fix siblings
* fix extensions
* fix
* fix no cpu build
* format
* comments
* fix perf regression, remove unnecessary abort
* fix events, task limit cpu
* fix waiting
* fix donation / temporaries in normalization
2025-03-06 19:23:38 -08:00
Awni Hannun
5245f12a46
always use json ( #1938 )
2025-03-06 15:35:56 -08:00
Chunyang Wen
a198b2787e
Remove unused modules ( #1936 )
2025-03-06 14:20:27 -08:00
Chunyang Wen
04edad8c59
Add doc string for path ( #1937 )
2025-03-06 14:20:09 -08:00
David Wisdom
392b3060b0
Fix typo in randint docstring ( #1932 )
...
This commit fixes a typo in the docstring for mlx.core.random.randint() by changing "roadcastable" to "broadcastable".
2025-03-05 21:48:00 -08:00
Chunyang Wen
85b34d59bc
Clean unused sys ( #1929 )
2025-03-05 13:48:03 -08:00
Awni Hannun
f599c11bc8
bump ( #1931 )
2025-03-05 13:16:53 -08:00
Angelos Katharopoulos
0792ff02ff
Only fail when 10 consecutive socket errors occur ( #1928 )
2025-03-05 13:16:19 -08:00
Alex Barron
fd0d63ba5b
Affine quant always in fp32 ( #1925 )
...
* do affine quant in fp32
* static cast
2025-03-04 17:50:19 -08:00
Abe Leininger
3835a428c5
Adds nuclear norm support ( #1894 )
...
* adjust norm unit test tolerance
2025-03-04 13:26:02 -08:00
Angelos Katharopoulos
9680f72cca
Add a multi optimizer ( #1916 )
2025-03-04 13:16:35 -08:00
Angelos Katharopoulos
a0737273d3
Allow debugging in distributed mode ( #1920 )
2025-03-04 13:01:10 -08:00
Awni Hannun
e613d0eaf0
SDPA support for small batch (over sequence) queries ( #1922 )
...
* batch query sdpa
* batch sdpa for query
2025-03-04 10:59:04 -08:00
Awni Hannun
6bcd6bcf70
fix donation in scan ( #1917 )
2025-03-03 11:30:59 -08:00
Awni Hannun
ba12e4999a
Use a heap for small sizes ( #1911 )
...
* use a heap for small sizes
* check if VM
2025-03-03 06:50:57 -08:00
Awni Hannun
4e7cd31d12
Fix slice data size ( #1913 )
...
* fix slice data size
* add test
2025-03-02 21:50:42 -08:00
Angelos Katharopoulos
5e6c130d93
RMS norm without scaling ( #1915 )
2025-02-28 20:26:57 -08:00
Angelos Katharopoulos
5d68082881
Ring docs ( #1829 )
2025-02-28 11:34:21 -08:00
Angelos Katharopoulos
607181644f
Add mlx.distributed_config script ( #1902 )
2025-02-28 11:16:39 -08:00
Jagrit Digani
89d327075f
Enabling fused attention for head dim 128 ( #1899 )
...
* Share KV smem
* Fix bfloat error
* Unroll O = S @ V loop
* Perf upgrade
* Remove commented out function
* Add -Wno-c++17-extensions flag to metal flags
* Add -Wno-c++17-extensions flag to metal extension flags
2025-02-26 10:02:06 -08:00
Angelos Katharopoulos
6bf00ef631
Fix ring of 2 and allow scalars in API ( #1906 )
2025-02-25 17:03:01 -08:00
Awni Hannun
7d042f17fe
Double for lapack ( #1904 )
...
* double for lapack ops
* add double support for lapack ops
2025-02-25 11:39:36 -08:00
Awni Hannun
28b8079e30
fix double type promotion ( #1901 )
2025-02-25 06:00:53 -08:00
Awni Hannun
7face5d9fd
fix cpu compile ( #1897 )
2025-02-24 14:10:30 -08:00
Awni Hannun
a44dc4bdb0
fix leaking objc ( #1898 )
2025-02-24 13:57:59 -08:00
Awni Hannun
2d0f384b6f
fix simd erf_inv ( #1896 )
2025-02-24 13:57:47 -08:00
Awni Hannun
8ff84b5c43
fix version and expose command queue getter ( #1892 )
2025-02-20 15:25:15 -08:00
Angelos Katharopoulos
10b271d963
Ring update ( #1885 )
2025-02-20 14:32:31 -08:00
Jesper Stemann Andersen
0ebc8a3d25
Fixed issue where Clang on FreeBSD failed to compile mlx/backend/cpu/quantized.cpp ( #1890 )
2025-02-20 12:02:12 -08:00
Awni Hannun
bbda0fdbdb
Allow non-square lu ( #1889 )
2025-02-20 08:13:23 -08:00
Jesper Stemann Andersen
c86422bdd4
Added mlx::core::version() returning std::string(MLX_VERSION) ( #1819 )
...
* Added version.h providing mlx::core::version() returning std::string(MLX_VERSION)
Also, added MLX_VERSION_MAJOR, MLX_VERSION_MINOR, MLX_VERSION_PATCH, MLX_VERSION_NUMERIC, and accompanying functions.
* Added version.h to mlx.h
* Changed version int functions to be constexpr
* Formatting
* Added handling of MLX_VERSION where only the prefix has major.minor.patch format
* Changed version function to be constexpr
2025-02-19 20:30:19 -08:00
Awni Hannun
c707b2b0a6
Limit compile buffers ( #1887 )
...
* limit compile buffers
* maybe not flaky test
2025-02-19 20:28:13 -08:00
Angelos Katharopoulos
78ba24c37d
Raise an exception in the rope op if input is integer ( #1884 )
2025-02-19 14:43:39 -08:00
Angelos Katharopoulos
1a2cb72030
Ensure linspace always contains start and stop ( #1883 )
2025-02-19 13:53:20 -08:00
Abe Leininger
344a29506e
Enforce triangular matrix form in tri_inv ( #1876 )
...
* fix tri_inv bug
* Revert "fix tri_inv bug"
This reverts commit b74b290201.
* Make sure that tri_inv returns a triangular matrix
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2025-02-19 12:42:33 -08:00
Angelos Katharopoulos
71de73a668
Fix convs by reverting #1803 ( #1882 )
2025-02-18 14:36:34 -08:00
Alex Barron
4c1dfa58b7
xor op on arrays ( #1875 )
2025-02-17 00:24:53 -08:00
Awni Hannun
5274c3c43f
compiler warnings are errors ( #1870 )
2025-02-17 00:07:49 -08:00
Angelos Katharopoulos
1762793989
Remove unused uniform ( #1867 )
2025-02-14 15:51:41 -08:00
Awni Hannun
6cec78d8f2
bump ( #1866 )
2025-02-14 13:09:34 -08:00
Jagrit Digani
2dc307f2e6
Winograd Update for Small batches ( #1803 )
...
* Build in padding to Winograd kernels
* Add new fused Winograd kernel
* Enable weight flipping in Winograd kernels
2025-02-14 13:08:13 -08:00
Awni Hannun
7aea5b1895
Allow dynamic ops per buffer based on dispatches and memory ( #1864 )
...
* Allow dynamic ops per buffer based on dispatches and memory
* add initial arch values
2025-02-13 19:18:22 -08:00