Awni Hannun 
							
						 
					 
					
						
						
							
						
						2d0f384b6f 
					 
					
						
						
							
							fix simd erf_inv ( #1896 )  
						
						
						
						
					 
					
						2025-02-24 13:57:47 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						10b271d963 
					 
					
						
						
							
							Ring update ( #1885 )  
						
						
						
						
					 
					
						2025-02-20 14:32:31 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bbda0fdbdb 
					 
					
						
						
							
							Allow non-square lu ( #1889 )  
						
						
						
						
					 
					
						2025-02-20 08:13:23 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c707b2b0a6 
					 
					
						
						
							
							Limit compile buffers ( #1887 )  
						
						
						
						
						* limit compile buffers
* maybe not flaky test 
						
						
					 
					
						2025-02-19 20:28:13 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						78ba24c37d 
					 
					
						
						
							
							Raise an exception in the rope op if input is integer ( #1884 )  
						
						
						
						
					 
					
						2025-02-19 14:43:39 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						1a2cb72030 
					 
					
						
						
							
							Ensure linspace always contains start and stop ( #1883 )  
						
						
						
						
					 
					
						2025-02-19 13:53:20 -08:00 
						 
				 
			
				
					
						
							
							
								Abe Leininger 
							
						 
					 
					
						
						
							
						
						344a29506e 
					 
					
						
						
							
							Enforce triangular matrix form in tri_inv ( #1876 )  
						
						
						
						
						* fix tri_inv bug
* Revert "fix tri_inv bug"
This reverts commit b74b290201a_katharopoulos@apple.com > 
						
						
					 
					
						2025-02-19 12:42:33 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						71de73a668 
					 
					
						
						
							
							Fix convs by reverting  #1803  ( #1882 )  
						
						
						
						
					 
					
						2025-02-18 14:36:34 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						4c1dfa58b7 
					 
					
						
						
							
							xor op on arrays ( #1875 )  
						
						
						
						
					 
					
						2025-02-17 00:24:53 -08:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						2dc307f2e6 
					 
					
						
						
							
							Winograd Update for Small batches  ( #1803 )  
						
						
						
						
						* Build in padding to Winograd kernels
* Add new fused Winograd kernel
* Enable weight flipping in Winograd kernels 
						
						
					 
					
						2025-02-14 13:08:13 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						7f2d1024f3 
					 
					
						
						
							
							add f8_e4m3 loading ( #1859 )  
						
						
						
						
					 
					
						2025-02-13 17:10:03 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						428f589364 
					 
					
						
						
							
							Revert "More buffer donation in some cases ( #1858 )" ( #1863 )  
						
						
						
						
						This reverts commit d274ae77f2 
						
						
					 
					
						2025-02-13 14:21:44 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						5cd97f7ffe 
					 
					
						
						
							
							Bitwise Inverse ( #1862 )  
						
						
						
						
						* add bitwise inverse
* add vmap + fix nojit
* inverse -> invert
* add to compile + remove unused 
						
						
					 
					
						2025-02-13 08:44:14 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d274ae77f2 
					 
					
						
						
							
							More buffer donation in some cases ( #1858 )  
						
						
						
						
						* more donation
* fix
* add test 
						
						
					 
					
						2025-02-12 19:41:37 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						55c5ac7820 
					 
					
						
						
							
							fix int64 bug ( #1860 )  
						
						
						
						
					 
					
						2025-02-12 19:23:46 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						0145911bea 
					 
					
						
						
							
							Fixes output donation for IO ops on the GPU ( #1857 )  
						
						
						
						
					 
					
						2025-02-12 10:52:30 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0a5215693e 
					 
					
						
						
							
							Fix grad copies ( #1854 )  
						
						
						
						
						* fix grad with copies
* add test
* add test 
						
						
					 
					
						2025-02-11 15:26:42 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						2a45056ba8 
					 
					
						
						
							
							Cycle leak break ( #1856 )  
						
						
						
						
						* detect and break leaks in custom function
* detect and break leaks in custom function 
						
						
					 
					
						2025-02-11 14:45:02 -08:00 
						 
				 
			
				
					
						
							
							
								Abe Leininger 
							
						 
					 
					
						
						
							
						
						a5ededf1c3 
					 
					
						
						
							
							CPU LU factorization and linear solvers ( #1451 )  
						
						
						
						
						* linalg solve backend
* nits
* more nits + fix
* luf primitive and lu, solve, and solve_triangular backends
* changes / nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-02-10 12:32:24 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						9eb7d7362f 
					 
					
						
						
							
							Fix Split::vmap ( #1845 )  
						
						
						
						
					 
					
						2025-02-08 09:22:13 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1c0c118f7c 
					 
					
						
						
							
							Fp64 on the CPU ( #1843 )  
						
						
						
						
						* add fp64 data type
* clean build
* update docs
* fix bug 
						
						
					 
					
						2025-02-07 15:52:22 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						af1b725fda 
					 
					
						
						
							
							Fix a couple of slicing bugs ( #1827 )  
						
						
						
						
						* fix a few bugs
* fix conv grad
* speedup test
* comment 
						
						
					 
					
						2025-02-05 19:50:08 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9174606d4c 
					 
					
						
						
							
							fix sort ( #1835 )  
						
						
						
						
					 
					
						2025-02-05 17:16:27 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ca305afdbe 
					 
					
						
						
							
							loading empty list is ok when strict = false ( #1834 )  
						
						
						
						
					 
					
						2025-02-05 16:19:27 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						f5cc1eea72 
					 
					
						
						
							
							Allow different value dimensions in sdpa_vector ( #1811 )  
						
						
						
						
					 
					
						2025-01-31 20:58:59 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b7c9f1d38f 
					 
					
						
						
							
							scatter axis + gather axis primitives ( #1813 )  
						
						
						
						
						* scatter axis + gather axis primitives
* add transforms
* comment 
						
						
					 
					
						2025-01-31 20:48:08 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ded914f442 
					 
					
						
						
							
							Small distributed launch helper ( #1810 )  
						
						
						
						
					 
					
						2025-01-29 17:55:04 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4758c8baa1 
					 
					
						
						
							
							Start to cleanup/unify accelerate and common back-ends (Part 1/N) ( #1777 )  
						
						
						
						
						* start to cleanup/unify accelerate and common back-ends
* more progress
* simplify
* add half type and allow infs in simd exp
* unify softmax + quantized, more dispatches to simd quantized mm
* add sin/cos, use simd in vector-scalar ops
* faster CPU vectorize quant
* faster erf/erfinv 
						
						
					 
					
						2025-01-29 14:34:49 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1017ac4a9e 
					 
					
						
						
							
							add dilation for conv 3d layers + test for 3d conv w/ dilation ( #1802 )  
						
						
						
						
					 
					
						2025-01-28 06:17:07 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ccb61d7aae 
					 
					
						
						
							
							Ring distributed backend ( #1784 )  
						
						
						
						
					 
					
						2025-01-27 22:15:01 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						121d9a0702 
					 
					
						
						
							
							Fix rope fallback to not upcast ( #1797 )  
						
						
						
						
						* fix rope fallback to not upcast
* Update mlx/fast.cpp
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
					 
					
						2025-01-26 19:07:21 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						72146fc4cd 
					 
					
						
						
							
							Einsum ellipsis ( #1788 )  
						
						
						
						
					 
					
						2025-01-25 01:28:03 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e6a7ab9675 
					 
					
						
						
							
							non square qr ( #1783 )  
						
						
						
						
					 
					
						2025-01-21 14:07:47 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						90532b1f37 
					 
					
						
						
							
							recompile when shapeless is different ( #1776 )  
						
						
						
						
					 
					
						2025-01-20 21:07:10 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0c259961ac 
					 
					
						
						
							
							matmul jvps ( #1772 )  
						
						
						
						
					 
					
						2025-01-17 10:36:26 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						33421c1dd3 
					 
					
						
						
							
							Limit grad recursion depth by not recursing through non-grad inputs ( #1764 )  
						
						
						
						
						* limit grad recursion depth
* add grad of module test 
						
						
					 
					
						2025-01-14 14:33:18 -08:00 
						 
				 
			
				
					
						
							
							
								Nripesh Niketan 
							
						 
					 
					
						
						
							
						
						5cc5201914 
					 
					
						
						
							
							feat: Add orthogonal initializer and corresponding tests ( #1651 )  
						
						
						
						
						* feat: Add orthogonal initializer and corresponding tests
* lint
* Add acknowledgements
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-01-13 07:29:20 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						657f466402 
					 
					
						
						
							
							use sdpa and exportable functions in transformer multi head attention ( #1760 )  
						
						
						
						
					 
					
						2025-01-09 13:11:55 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						c7b0300af5 
					 
					
						
						
							
							Fix batched qmv bug ( #1758 )  
						
						
						
						
					 
					
						2025-01-09 11:45:57 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1ccaf80575 
					 
					
						
						
							
							Dynamic broadcasting for shapeless compile/export ( #1722 )  
						
						
						
						
						* working towards dynamic broadcast
* shapeless broadcast
* fix build + nits
* use broadcast arrays in quantize matmul
* some cleanup / consistency
* mend
* some comments
* add vjp, jvp for broadcast axes 
						
						
					 
					
						2025-01-09 11:04:24 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d1766f2c70 
					 
					
						
						
							
							Add boolean mask support in vector SDPA ( #1757 )  
						
						
						
						
					 
					
						2025-01-07 20:24:53 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						516ded618b 
					 
					
						
						
							
							Dynamic slicing ( #1741 )  
						
						
						
						
						* dynamic slice and slice update
* python bindings + tests + fix set item
* fix compile issue
* comment
* fix jit 
						
						
					 
					
						2025-01-07 14:02:16 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d5ec172c95 
					 
					
						
						
							
							Allow boolean mask in sdpa ( #1753 )  
						
						
						
						
						* allow boolean mask in sdpa
* more permissive donation in ternary 
						
						
					 
					
						2025-01-06 16:57:07 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						058d6ce683 
					 
					
						
						
							
							mpi send use input as output ( #1750 )  
						
						
						
						
						* mpi send use input as output
* move earlier 
						
						
					 
					
						2025-01-06 06:08:43 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						259025100e 
					 
					
						
						
							
							Fix nd ternary on GPU ( #1746 )  
						
						
						
						
					 
					
						2025-01-03 11:52:17 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						6fa0501387 
					 
					
						
						
							
							Fix concatenate/slice_update vjp + reduce binary size ( #1735 )  
						
						
						
						
						* fix concatenate vjp + reduce binary size
* also cast in slice update 
						
						
					 
					
						2025-01-02 16:36:33 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ae69cb15e9 
					 
					
						
						
							
							shapeless compile in docs and partially shapeless reshape ( #1742 )  
						
						
						
						
					 
					
						2025-01-02 16:24:42 -08:00 
						 
				 
			
				
					
						
							
							
								Venkata Naga Aditya Datta Chivukula 
							
						 
					 
					
						
						
							
						
						491fa95b1f 
					 
					
						
						
							
							Added Kronecker Product ( #1728 )  
						
						
						
						
					 
					
						2025-01-02 16:00:34 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4ba0c24a8f 
					 
					
						
						
							
							Export / import functions to / from a file ( #1642 )  
						
						
						
						
						* export and import functions
* refactor + works for few primitives
* nit
* allow primitives with state
* nit
* nit
* simplify serialize / deserialize
* fix for constants
* python bindings
* maybe fix serialize failure case
* add example
* more primitives, training kind of works
* same result for python and c++
* some fixes
* fix export
* template it up
* some simplification
* rebase
* allow kwargs and multiple functions
* exporter
* more primitives for exporting
* deal with endianness
* handle invalid stream
* add docstring 
						
						
					 
					
						2024-12-24 11:19:13 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ebfe64b92d 
					 
					
						
						
							
							shapeless slice update and broadcast when possible ( #1727 )  
						
						
						
						
					 
					
						2024-12-23 11:25:15 -08:00