Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ded914f442 
					 
					
						
						
							
							Small distributed launch helper ( #1810 )  
						
						
						
						
					 
					
						2025-01-29 17:55:04 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4758c8baa1 
					 
					
						
						
							
							Start to cleanup/unify accelerate and common back-ends (Part 1/N) ( #1777 )  
						
						
						
						
						* start to cleanup/unify accelerate and common back-ends
* more progress
* simplify
* add half type and allow infs in simd exp
* unify softmax + quantized, more dispatches to simd quantized mm
* add sin/cos, use simd in vector-scalar ops
* faster CPU vectorize quant
* faster erf/erfinv 
						
						
					 
					
						2025-01-29 14:34:49 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1017ac4a9e 
					 
					
						
						
							
							add dilation for conv 3d layers + test for 3d conv w/ dilation ( #1802 )  
						
						
						
						
					 
					
						2025-01-28 06:17:07 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ccb61d7aae 
					 
					
						
						
							
							Ring distributed backend ( #1784 )  
						
						
						
						
					 
					
						2025-01-27 22:15:01 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						121d9a0702 
					 
					
						
						
							
							Fix rope fallback to not upcast ( #1797 )  
						
						
						
						
						* fix rope fallback to not upcast
* Update mlx/fast.cpp
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
					 
					
						2025-01-26 19:07:21 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						72146fc4cd 
					 
					
						
						
							
							Einsum ellipsis ( #1788 )  
						
						
						
						
					 
					
						2025-01-25 01:28:03 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e6a7ab9675 
					 
					
						
						
							
							non square qr ( #1783 )  
						
						
						
						
					 
					
						2025-01-21 14:07:47 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						90532b1f37 
					 
					
						
						
							
							recompile when shapeless is different ( #1776 )  
						
						
						
						
					 
					
						2025-01-20 21:07:10 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0c259961ac 
					 
					
						
						
							
							matmul jvps ( #1772 )  
						
						
						
						
					 
					
						2025-01-17 10:36:26 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						33421c1dd3 
					 
					
						
						
							
							Limit grad recursion depth by not recursing through non-grad inputs ( #1764 )  
						
						
						
						
						* limit grad recursion depth
* add grad of module test 
						
						
					 
					
						2025-01-14 14:33:18 -08:00 
						 
				 
			
				
					
						
							
							
								Nripesh Niketan 
							
						 
					 
					
						
						
							
						
						5cc5201914 
					 
					
						
						
							
							feat: Add orthogonal initializer and corresponding tests ( #1651 )  
						
						
						
						
						* feat: Add orthogonal initializer and corresponding tests
* lint
* Add acknowledgements
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-01-13 07:29:20 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						657f466402 
					 
					
						
						
							
							use sdpa and exportable functions in transformer multi head attention ( #1760 )  
						
						
						
						
					 
					
						2025-01-09 13:11:55 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						c7b0300af5 
					 
					
						
						
							
							Fix batched qmv bug ( #1758 )  
						
						
						
						
					 
					
						2025-01-09 11:45:57 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1ccaf80575 
					 
					
						
						
							
							Dynamic broadcasting for shapeless compile/export ( #1722 )  
						
						
						
						
						* working towards dynamic broadcast
* shapeless broadcast
* fix build + nits
* use broadcast arrays in quantize matmul
* some cleanup / consistency
* mend
* some comments
* add vjp, jvp for broadcast axes 
						
						
					 
					
						2025-01-09 11:04:24 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d1766f2c70 
					 
					
						
						
							
							Add boolean mask support in vector SDPA ( #1757 )  
						
						
						
						
					 
					
						2025-01-07 20:24:53 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						516ded618b 
					 
					
						
						
							
							Dynamic slicing ( #1741 )  
						
						
						
						
						* dynamic slice and slice update
* python bindings + tests + fix set item
* fix compile issue
* comment
* fix jit 
						
						
					 
					
						2025-01-07 14:02:16 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d5ec172c95 
					 
					
						
						
							
							Allow boolean mask in sdpa ( #1753 )  
						
						
						
						
						* allow boolean mask in sdpa
* more permissive donation in ternary 
						
						
					 
					
						2025-01-06 16:57:07 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						058d6ce683 
					 
					
						
						
							
							mpi send use input as output ( #1750 )  
						
						
						
						
						* mpi send use input as output
* move earlier 
						
						
					 
					
						2025-01-06 06:08:43 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						259025100e 
					 
					
						
						
							
							Fix nd ternary on GPU ( #1746 )  
						
						
						
						
					 
					
						2025-01-03 11:52:17 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						6fa0501387 
					 
					
						
						
							
							Fix concatenate/slice_update vjp + reduce binary size ( #1735 )  
						
						
						
						
						* fix concatenate vjp + reduce binary size
* also cast in slice update 
						
						
					 
					
						2025-01-02 16:36:33 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ae69cb15e9 
					 
					
						
						
							
							shapeless compile in docs and partially shapeless reshape ( #1742 )  
						
						
						
						
					 
					
						2025-01-02 16:24:42 -08:00 
						 
				 
			
				
					
						
							
							
								Venkata Naga Aditya Datta Chivukula 
							
						 
					 
					
						
						
							
						
						491fa95b1f 
					 
					
						
						
							
							Added Kronecker Product ( #1728 )  
						
						
						
						
					 
					
						2025-01-02 16:00:34 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4ba0c24a8f 
					 
					
						
						
							
							Export / import functions to / from a file ( #1642 )  
						
						
						
						
						* export and import functions
* refactor + works for few primitives
* nit
* allow primitives with state
* nit
* nit
* simplify serialize / deserialize
* fix for constants
* python bindings
* maybe fix serialize failure case
* add example
* more primitives, training kind of works
* same result for python and c++
* some fixes
* fix export
* template it up
* some simplification
* rebase
* allow kwargs and multiple functions
* exporter
* more primitives for exporting
* deal with endianness
* handle invalid stream
* add docstring 
						
						
					 
					
						2024-12-24 11:19:13 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ebfe64b92d 
					 
					
						
						
							
							shapeless slice update and broadcast when possible ( #1727 )  
						
						
						
						
					 
					
						2024-12-23 11:25:15 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0308e9af71 
					 
					
						
						
							
							Allow offset to be an mx.array for mx.fast.rope ( #1724 )  
						
						
						
						
						* allow offset for rope
* comment 
						
						
					 
					
						2024-12-19 15:51:44 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c3628eea49 
					 
					
						
						
							
							Add mx.finfo and use it when making causal mask ( #1726 )  
						
						
						
						
						* finfo
* fixes
* docs 
						
						
					 
					
						2024-12-19 14:52:41 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d03c01dfbc 
					 
					
						
						
							
							fix unflatten vjp ( #1708 )  
						
						
						
						
					 
					
						2024-12-16 18:37:57 -08:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						af5a614aad 
					 
					
						
						
							
							Eval before cleanup so model file is unlocked ( #1702 )  
						
						
						
						
					 
					
						2024-12-14 21:41:49 -08:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						dfccd17ab9 
					 
					
						
						
							
							Use psutil to get memory info on Windows ( #1700 )  
						
						
						
						
					 
					
						2024-12-13 19:50:13 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9111999af3 
					 
					
						
						
							
							Fix small sort with metal validation ( #1695 )  
						
						
						
						
					 
					
						2024-12-12 09:21:45 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						6bd28d246e 
					 
					
						
						
							
							Allow no copy negative strides in as_strided and slice ( #1688 )  
						
						
						
						
						* allow no copy negative strides in as_strided and slice
* fix jit
* fix jit 
						
						
					 
					
						2024-12-12 08:59:45 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4e1e9520e1 
					 
					
						
						
							
							Flatten and unflatten ( #1692 )  
						
						
						
						
						* flatten and unflatten
* fix grad
* fix shape infer
* use squeeze + unsqueeze in get_item 
						
						
					 
					
						2024-12-11 21:51:37 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f76a49e555 
					 
					
						
						
							
							ExpandDims primitive ( #1687 )  
						
						
						
						
						* add squeeze primitive
* simplify squeeze, use in gather
* fix
* fix
* fix
* fix
* fix no cpu
* use squeeze in matmul and friends
* expand dims primitive
* comment 
						
						
					 
					
						2024-12-10 16:39:07 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						29a620cab2 
					 
					
						
						
							
							No reshapes in quantized embedding ( #1682 )  
						
						
						
						
						* no reshapes in quantized embedding
* fix inadvertent cast
* add tol 
						
						
					 
					
						2024-12-09 18:57:38 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						35b412c099 
					 
					
						
						
							
							Fix compile hasher for string constants. ( #1677 )  
						
						
						
						
						* fix hash
* add test
* nit 
						
						
					 
					
						2024-12-09 09:26:18 -08:00 
						 
				 
			
				
					
						
							
							
								mt_caret 
							
						 
					 
					
						
						
							
						
						fd3377dd1f 
					 
					
						
						
							
							Support bias correction in Adam and AdamW optimizers ( #1640 )  
						
						
						
						
					 
					
						2024-12-06 12:13:34 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bc2a29f033 
					 
					
						
						
							
							fix ( #1654 )  
						
						
						
						
					 
					
						2024-12-06 10:48:58 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						c79f6a4a8c 
					 
					
						
						
							
							3 and 6 bit quantization ( #1613 )  
						
						
						
						
						* Support 3 and 6 bit quantization 
						
						
					 
					
						2024-11-22 10:22:13 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0c5eea226b 
					 
					
						
						
							
							Reduce specializations ( #1607 )  
						
						
						
						
						* start of reduce specializations
* fix all reduce
* fix many dims
* fix
* non-jit tests clear
* cleanup instantiations
* cpu merges
* change dim specializations
* optimize
* fix jit
* fix jit
* use higher precision for integer sum+prod
* fixes 
						
						
					 
					
						2024-11-21 19:53:00 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						d8c824c594 
					 
					
						
						
							
							Formatting fixes ( #1606 )  
						
						
						
						
					 
					
						2024-11-20 15:30:36 -08:00 
						 
				 
			
				
					
						
							
							
								Saanidhya 
							
						 
					 
					
						
						
							
						
						cb431dfc9f 
					 
					
						
						
							
							Adds 3D pooling ( #1526 )  
						
						
						
						
					 
					
						2024-11-19 16:45:24 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						61d787726a 
					 
					
						
						
							
							Fix view scalar bug segfault ( #1603 )  
						
						
						
						
						* fix view scalar bug
* fix view scalar bug
* one more fix 
						
						
					 
					
						2024-11-19 10:54:05 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						5e89aace9b 
					 
					
						
						
							
							Fix concatenate vmap ( #1600 )  
						
						
						
						
					 
					
						2024-11-19 10:44:04 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bf481e8e5d 
					 
					
						
						
							
							Fix sibling leak ( #1590 )  
						
						
						
						
						* add test
* fix + test
* fix fix 
						
						
					 
					
						2024-11-18 19:17:01 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9bd03dd9b4 
					 
					
						
						
							
							More buffer donation with no-ops ( #1591 )  
						
						
						
						
						* more donation
* fix test
* fix build 
						
						
					 
					
						2024-11-18 08:35:41 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8c34c9dac4 
					 
					
						
						
							
							throw for invalid case and remove test ( #1575 )  
						
						
						
						
					 
					
						2024-11-08 12:04:03 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						91c0277356 
					 
					
						
						
							
							fix per-example mask + docs in sdpa ( #1574 )  
						
						
						
						
					 
					
						2024-11-08 11:51:15 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						59247c2b62 
					 
					
						
						
							
							add groups in conv2d ( #1569 )  
						
						
						
						
					 
					
						2024-11-07 13:57:53 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						54f05e7195 
					 
					
						
						
							
							Fix gather vmap ( #1563 )  
						
						
						
						
						* fix gather
* fix 
						
						
					 
					
						2024-11-05 11:29:20 -08:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						26be608470 
					 
					
						
						
							
							Add split_k qvm for long context ( #1564 )  
						
						
						
						
						* Add splitk qvm
* configurable splitk
* tuning
* remove extra instantiation
* remove refactor
* separate test
* cpu tolerance 
						
						
					 
					
						2024-11-05 11:25:19 -08:00