Awni Hannun | 5580b47291 | iinfo and scalar overflow detection (#2009) | 2025-03-27 19:54:56 -07:00
Awni Hannun | a84cc0123f | promote mask when needed (#1998) | 2025-03-23 19:58:28 -07:00
Angelos Katharopoulos | 4eef8102c9 | Distributed layers (#1270) | 2025-03-21 13:52:17 -07:00
Angelos Katharopoulos | 69e4dd506b | Add a ring all gather (#1985) | 2025-03-21 13:36:51 -07:00
Awni Hannun | 2a980a76ce | Add stats and limit to common allocator and enable tests (#1988) | 2025-03-21 12:28:36 -07:00
    * add stats to common allocator and enable tests
    * linux memory and default
    * fix
Awni Hannun | 4e1994e9d7 | move memory APIs into top level mlx.core (#1982) | 2025-03-21 07:25:12 -07:00
Awni Hannun | 7b7e2352cd | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
Awni Hannun | 005e7efa64 | fix mask in sdpa (#1980) | 2025-03-20 14:53:12 -07:00
    * fix mask in sdpa
    * fix attention mask
    * Re-enable routing for array mask
    ---------
    Co-authored-by: Jagrit Digani <digani@apple.com>
Jagrit Digani | b42d13ec84 | Update attention tests to show diff, disable array masks (#1978) | 2025-03-20 14:25:38 -07:00
Jagrit Digani | 9adcd1a650 | Support fused masking in Attention (#1924) | 2025-03-20 11:01:32 -07:00
    * Update API to allow mask='causal' in fast::sdpa
    * Add fallback
    * Update steel::AttnParams
    * Fix typo
    * WIP, basic causal
    * Update tests
    * Update benchmarking
    * Update masking loop limits
    * Add bool masking and update tests
    * Update additive mask
    * Update benchmarks
    * Update benchmarks
    * Update tests
    * Update for bfloat error
    * Update early exit
    * Add random seed to tests
Awni Hannun | 3c164fca8c | Fix multistream GPU deadlock (#1969) | 2025-03-20 07:19:47 -07:00
    * fix multistream GPU deadlock
    * comments
Awni Hannun | c6ea2ba329 | Use same accumulation precision in gemv as gemm (#1962) | 2025-03-16 07:13:24 -07:00
    * use same accumulation precision in gemv as gemm
    * faster
    * fix compile
Awni Hannun | 2770a10240 | fix grad with inplace updates (#1961) | 2025-03-13 19:13:09 -07:00
Awni Hannun | 32da94507a | fix vmap for flatten (#1955) | 2025-03-11 10:42:22 -07:00
Awni Hannun | 3c3e558c60 | Support transposed head/seq for kv (#1950) | 2025-03-10 10:53:45 -07:00
    * support transposed head/seq for kv
    * fix flaky test
    * nit
Abe Leininger | 3835a428c5 | Adds nuclear norm support (#1894) | 2025-03-04 13:26:02 -08:00
    * adjust norm unit test tolerance
Angelos Katharopoulos | 9680f72cca | Add a multi optimizer (#1916) | 2025-03-04 13:16:35 -08:00
Awni Hannun | e613d0eaf0 | SDPA support for small batch (over sequence) queries (#1922) | 2025-03-04 10:59:04 -08:00
    * batch query sdpa
    * batch sdpa for query
Awni Hannun | 6bcd6bcf70 | fix donation in scan (#1917) | 2025-03-03 11:30:59 -08:00
Awni Hannun | 4e7cd31d12 | Fix slice data size (#1913) | 2025-03-02 21:50:42 -08:00
    * fix slice data size
    * add test
Angelos Katharopoulos | 5e6c130d93 | RMS norm without scaling (#1915) | 2025-02-28 20:26:57 -08:00
Awni Hannun | 7d042f17fe | Double for lapack (#1904) | 2025-02-25 11:39:36 -08:00
    * double for lapack ops
    * add double support for lapack ops
Awni Hannun | 28b8079e30 | fix double type promotion (#1901) | 2025-02-25 06:00:53 -08:00
Awni Hannun | 7face5d9fd | fix cpu compile (#1897) | 2025-02-24 14:10:30 -08:00
Awni Hannun | 2d0f384b6f | fix simd erf_inv (#1896) | 2025-02-24 13:57:47 -08:00
Angelos Katharopoulos | 10b271d963 | Ring update (#1885) | 2025-02-20 14:32:31 -08:00
Awni Hannun | bbda0fdbdb | Allow non-square lu (#1889) | 2025-02-20 08:13:23 -08:00
Awni Hannun | c707b2b0a6 | Limit compile buffers (#1887) | 2025-02-19 20:28:13 -08:00
    * limit compile buffers
    * maybe not flaky test
Angelos Katharopoulos | 78ba24c37d | Raise an exception in the rope op if input is integer (#1884) | 2025-02-19 14:43:39 -08:00
Angelos Katharopoulos | 1a2cb72030 | Ensure linspace always contains start and stop (#1883) | 2025-02-19 13:53:20 -08:00
Abe Leininger | 344a29506e | Enforce triangular matrix form in tri_inv (#1876) | 2025-02-19 12:42:33 -08:00
    * fix tri_inv bug
    * Revert "fix tri_inv bug"
      This reverts commit b74b290201.
    * Make sure that tri_inv returns a triangular matrix
    ---------
    Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
Angelos Katharopoulos | 71de73a668 | Fix convs by reverting #1803 (#1882) | 2025-02-18 14:36:34 -08:00
Alex Barron | 4c1dfa58b7 | xor op on arrays (#1875) | 2025-02-17 00:24:53 -08:00
Jagrit Digani | 2dc307f2e6 | Winograd Update for Small batches (#1803) | 2025-02-14 13:08:13 -08:00
    * Build in padding to Winograd kernels
    * Add new fused Winograd kernel
    * Enable weight flipping in Winograd kernels
Alex Barron | 7f2d1024f3 | add f8_e4m3 loading (#1859) | 2025-02-13 17:10:03 -08:00
Awni Hannun | 428f589364 | Revert "More buffer donation in some cases (#1858)" (#1863) | 2025-02-13 14:21:44 -08:00
    This reverts commit d274ae77f2.
Alex Barron | 5cd97f7ffe | Bitwise Inverse (#1862) | 2025-02-13 08:44:14 -08:00
    * add bitwise inverse
    * add vmap + fix nojit
    * inverse -> invert
    * add to compile + remove unused
Awni Hannun | d274ae77f2 | More buffer donation in some cases (#1858) | 2025-02-12 19:41:37 -08:00
    * more donation
    * fix
    * add test
Alex Barron | 55c5ac7820 | fix int64 bug (#1860) | 2025-02-12 19:23:46 -08:00
Angelos Katharopoulos | 0145911bea | Fixes output donation for IO ops on the GPU (#1857) | 2025-02-12 10:52:30 -08:00
Awni Hannun | 0a5215693e | Fix grad copies (#1854) | 2025-02-11 15:26:42 -08:00
    * fix grad with copies
    * add test
    * add test
Awni Hannun | 2a45056ba8 | Cycle leak break (#1856) | 2025-02-11 14:45:02 -08:00
    * detect and break leaks in custom function
    * detect and break leaks in custom function
Abe Leininger | a5ededf1c3 | CPU LU factorization and linear solvers (#1451) | 2025-02-10 12:32:24 -08:00
    * linalg solve backend
    * nits
    * more nits + fix
    * luf primitive and lu, solve, and solve_triangular backends
    * changes / nits
    ---------
    Co-authored-by: Awni Hannun <awni@apple.com>
Angelos Katharopoulos | 9eb7d7362f | Fix Split::vmap (#1845) | 2025-02-08 09:22:13 -08:00
Awni Hannun | 1c0c118f7c | Fp64 on the CPU (#1843) | 2025-02-07 15:52:22 -08:00
    * add fp64 data type
    * clean build
    * update docs
    * fix bug
Awni Hannun | af1b725fda | Fix a couple of slicing bugs (#1827) | 2025-02-05 19:50:08 -08:00
    * fix a few bugs
    * fix conv grad
    * speedup test
    * comment
Awni Hannun | 9174606d4c | fix sort (#1835) | 2025-02-05 17:16:27 -08:00
Awni Hannun | ca305afdbe | loading empty list is ok when strict = false (#1834) | 2025-02-05 16:19:27 -08:00
Angelos Katharopoulos | f5cc1eea72 | Allow different value dimensions in sdpa_vector (#1811) | 2025-01-31 20:58:59 -08:00
Awni Hannun | b7c9f1d38f | scatter axis + gather axis primitives (#1813) | 2025-01-31 20:48:08 -08:00
    * scatter axis + gather axis primitives
    * add transforms
    * comment