Awni Hannun 
							
						 
					 
					
						
						
							
						
						380aeb58ae 
					 
					
						
						
							
							enable admm low-precision cpu ( #2661 )  
						
						 
						
						
						
						
					 
					
						2025-10-10 09:50:54 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Cheng 
							
						 
					 
					
						
						
							
						
						6a3acf2301 
					 
					
						
						
							
							[CUDA] Set bias as input when using bias epilogue ( #2584 )  
						
						 
						
						
						
						
					 
					
						2025-09-11 15:31:09 +09:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Cheng 
							
						 
					 
					
						
						
							
						
						52b8384d10 
					 
					
						
						
							
							Fix flaky addmm tests ( #2581 )  
						
						 
						
						
						
						
					 
					
						2025-09-10 14:22:22 +09:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Cheng 
							
						 
					 
					
						
						
							
						
						44cc5da4bc 
					 
					
						
						
							
							[CUDA] Fix alpha not respected when using bias epilogue ( #2578 )  
						
						 
						
						
						
						
					 
					
						2025-09-10 09:08:01 +09:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d32519c8ee 
					 
					
						
						
							
							fix gemv regression ( #2445 )  
						
						 
						
						
						
						
					 
					
						2025-07-30 14:23:01 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Gökdeniz Gülmez 
							
						 
					 
					
						
						
							
						
						deee214a95 
					 
					
						
						
							
							Adding support for the Muon Optimizer ( #1914 )  
						
						 
						
						
						
						
						* initial commit with working optimizer
* update ACKNOWLEDGMENTS.md
* nits and adding it to test
* nits
* G.astype(mx.bfloat16) to G.astype(G.dtype)
* G.ndim >= 2 to assert G.ndim == 2
* remove comments
* replace with  mx.addmm
* remove comments
* format
* nits
* match muon
* fix addmm
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-07-18 12:25:28 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						4a9b29a875 
					 
					
						
						
							
							MoE backward improvements ( #2335 )  
						
						 
						
						
						
						
					 
					
						2025-07-07 17:59:53 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4fda5fbdf9 
					 
					
						
						
							
							add python testing for cuda with ability to skip list of tests ( #2295 )  
						
						 
						
						
						
						
					 
					
						2025-06-15 10:56:48 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8402a2acf4 
					 
					
						
						
							
							Fix complex power and print ( #2286 )  
						
						 
						
						
						
						
						* fix complex power and print
* fix complex matmul shape 
						
						
					 
					
						2025-06-13 11:13:00 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Suryash Malviya 
							
						 
					 
					
						
						
							
						
						0408ba0a76 
					 
					
						
						
							
							Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm  ( #2220 )  
						
						 
						
						
						
						
						* Implementing Complex Matmul using Karatsuba Algorithm
* Implemented Karatsuba's Algorithm for complex matmul and pre-commit them
* fix
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-06-02 15:58:46 -07:00  
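
For readers unfamiliar with the trick named in the commit above, here is a minimal sketch of a Karatsuba-style (three-multiplication) complex matrix product. It is not the code from that commit: the helper names (`matmul`, `add`, `sub`, `complex_matmul_3m`) and the list-of-lists representation are assumptions made purely for illustration. The idea is that a complex product normally needs four real matrix products, but three suffice if one of them is taken over the sums of the real and imaginary parts.

```python
def matmul(a, b):
    """Naive real matrix product of two list-of-lists matrices."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two equally shaped matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def complex_matmul_3m(ar, ai, br, bi):
    """Compute (ar + i*ai) @ (br + i*bi) using three real matmuls.

    Hypothetical helper for illustration only.
    """
    p1 = matmul(ar, br)                    # real part times real part
    p2 = matmul(ai, bi)                    # imaginary part times imaginary part
    p3 = matmul(add(ar, ai), add(br, bi))  # product of the sums
    real = sub(p1, p2)                     # ar@br - ai@bi
    imag = sub(sub(p3, p1), p2)            # ar@bi + ai@br
    return real, imag


# Illustrative inputs: real and imaginary parts of two 2x2 complex matrices.
ar = [[1, 2], [3, 4]]
ai = [[5, 6], [7, 8]]
br = [[2, 0], [1, 3]]
bi = [[1, 1], [0, 2]]
real, imag = complex_matmul_3m(ar, ai, br, bi)
```

The saving is one matrix product in exchange for a few extra element-wise additions, which tends to pay off when the matrix products dominate the cost.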
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8f3d208dce 
					 
					
						
						
							
							Close a couple edge case bugs: hadamard and addmm on empty inputs ( #2177 )  
						
						 
						
						
						
						
						* handle hadamard and addmm on empty inputs
* fix 
						
						
					 
					
						2025-05-12 10:48:57 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						99eefd2ec0 
					 
					
						
						
							
							Gather mm new kernel and small refactoring ( #2040 )  
						
						 
						
						
						
						
					 
					
						2025-04-14 16:37:36 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f2c85308c1 
					 
					
						
						
							
							add a half simd gemm fallback ( #2046 )  
						
						 
						
						
						
						
						* add a half simd gemm fallback
* nit 
						
						
					 
					
						2025-04-07 09:31:29 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						98b901ad66 
					 
					
						
						
							
							enable complex gemm ( #2017 )  
						
						 
						
						
						
						
					 
					
						2025-03-28 10:45:13 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c6ea2ba329 
					 
					
						
						
							
							Use same accumulation precision in gemv as gemm ( #1962 )  
						
						 
						
						
						
						
						* use same accumulation precision in gemv as gemm
* faster
* fix compile 
						
						
					 
					
						2025-03-16 07:13:24 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						2d6cd47713 
					 
					
						
						
							
							Masked gemv ( #1211 )  
						
						 
						
						
						
						
					 
					
						2024-06-14 09:52:26 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Nikhil Mehta 
							
						 
					 
					
						
						
							
						
						0b7d71fd2f 
					 
					
						
						
							
							Add softmin, hardshrink, hardtanh ( #1180 )  
						
						 
						
						
						
						
						---------
Co-authored-by: Nikhil Mehta <nikmehta@tesla.com>
						
						
					 
					
						2024-06-04 15:48:18 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						9f0df51f8d 
					 
					
						
						
							
							Fix matvec vector stride bug ( #1168 )  
						
						 
						
						
						
						
					 
					
						2024-05-29 12:18:28 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7e26fd8032 
					 
					
						
						
							
							Option to JIT steel gemm / conv ( #1139 )  
						
						 
						
						
						
						
					 
					
						2024-05-23 18:07:34 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						eab2685c67 
					 
					
						
						
							
							Float mask update ( #1152 )  
						
						 
						
						
						
						
						* Float mask update
* Update CPU impl 
						
						
					 
					
						2024-05-23 17:20:44 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d568c7ee36 
					 
					
						
						
							
							Rename block sparse ( #1149 )  
						
						 
						
						
						
						
						* block_sparse_mm to gather_mm
* rename
* nit
* nit 
						
						
					 
					
						2024-05-22 07:48:34 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						f390957685 
					 
					
						
						
							
							Block sparse mm ( #1058 )  
						
						 
						
						
						
						
					 
					
						2024-05-02 14:03:58 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						85c8a91a27 
					 
					
						
						
							
							Fix mask broadcasting bug and add relevant test ( #1003 )  
						
						 
						
						
						
						
					 
					
						2024-04-17 17:33:48 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						b18468bf81 
					 
					
						
						
							
							Masked mm ( #978 )  
						
						 
						
						
						
						
						* Add block masked matmul op and primitive 
						
						
					 
					
						2024-04-16 14:45:39 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						5ad133f8bb 
					 
					
						
						
							
							No copy gems ( #801 )  
						
						 
						
						
						
						
						* Enable collapsing batch dims in gemm
* Update gemm to only make copies when neither of the last 2 axes are contiguous
* Update addmm to support gemv shapes
* Update addmm to support irregular batch strides
* Update tests 
						
						
					 
					
						2024-03-12 13:13:41 -07:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d40a04f8dc 
					 
					
						
						
							
							minor fixes ( #631 )  
						
						 
						
						
						
						
						* minor fixes
* var with ddof >= nelements 
						
						
					 
					
						2024-02-05 13:27:49 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						09b9275027 
					 
					
						
						
							
							Make shape a tuple ( #591 )  
						
						 
						
						
						
						
						* shape tuple
* also remove simplify from docs
* rebase 
						
						
					 
					
						2024-01-30 13:11:01 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Juarez Bochi 
							
						 
					 
					
						
						
							
						
						ddf50113c5 
					 
					
						
						
							
							GGUF: Load and save metadata ( #446 )  
						
						 
						
						
						
						
						* gguf metadata
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-01-19 14:06:05 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						78102a47ad 
					 
					
						
						
							
							Update GEMM ( #424 )  
						
						 
						
						
						
						
						* Organize and collect metal subroutine templates and elements in `metal/kernels/steel/`
* Update gemm elements for better performance 
* Add split-K specialization for gemm
* Add `addmm` primitive, op and bindings for fused matmul and bias addition 
* Update tests and benchmarks as needed 
						
						
					 
					
						2024-01-17 12:42:39 -08:00  
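
The commit above describes `addmm` as a fused matmul and bias addition. As a rough reference for what such an op computes, here is a pure-Python sketch that assumes the usual BLAS-style semantics `alpha * (a @ b) + beta * c`. The function name `addmm_reference`, the argument order, and the default values for `alpha` and `beta` are assumptions for illustration; they are not taken from the commit.

```python
def matmul(a, b):
    """Naive matrix product of two list-of-lists matrices."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def addmm_reference(c, a, b, alpha=1.0, beta=1.0):
    """Unfused reference: alpha * (a @ b) + beta * c (assumed semantics)."""
    prod = matmul(a, b)
    return [
        [alpha * prod[i][j] + beta * c[i][j] for j in range(len(prod[0]))]
        for i in range(len(prod))
    ]


# Illustrative inputs only.
a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
out = addmm_reference(c, a, b, alpha=2.0, beta=0.5)
```

A fused kernel would produce the same values without materializing the intermediate product, so an unfused reference like this one is mainly useful as a test oracle.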
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b9e415d19c 
					 
					
						
						
							
							bump pre commit and fix format ( #373 )  
						
						 
						
						
						
						
					 
					
						2024-01-04 16:28:52 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Josh Soref 
							
						 
					 
					
						
						
							
						
						44c1ce5e6a 
					 
					
						
						
							
							Spelling ( #342 )  
						
						 
						
						
						
						
						* spelling: accumulates
* spelling: across
* spelling: additional
* spelling: against
* spelling: among
* spelling: array
* spelling: at least
* spelling: available
* spelling: axes
* spelling: basically
* spelling: bfloat
* spelling: bounds
* spelling: broadcast
* spelling: buffer
* spelling: class
* spelling: coefficients
* spelling: collision
* spelling: combinations
* spelling: committing
* spelling: computation
* spelling: consider
* spelling: constructing
* spelling: conversions
* spelling: correctly
* spelling: corresponding
* spelling: declaration
* spelling: default
* spelling: dependency
* spelling: destination
* spelling: destructor
* spelling: dimensions
* spelling: divided
* spelling: element-wise
* spelling: elements
* spelling: endianness
* spelling: equivalent
* spelling: explicitly
* spelling: github
* spelling: indices
* spelling: irregularly
* spelling: memory
* spelling: metallib
* spelling: negative
* spelling: notable
* spelling: optional
* spelling: otherwise
* spelling: overridden
* spelling: partially
* spelling: partition
* spelling: perform
* spelling: perturbations
* spelling: positively
* spelling: primitive
* spelling: repeat
* spelling: repeats
* spelling: respect
* spelling: respectively
* spelling: result
* spelling: rounding
* spelling: separate
* spelling: skipping
* spelling: structure
* spelling: the
* spelling: transpose
* spelling: unnecessary
* spelling: unneeded
* spelling: unsupported
---------
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
						
						
					 
					
						2024-01-01 21:08:17 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Zach Schillaci 
							
						 
					 
					
						
						
							
						
						5b9be57ac3 
					 
					
						
						
							
							Add isort pre-commit and run ( #68 )  
						
						 
						
						
						
						
					 
					
						2023-12-08 11:31:47 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						d518b3b6a5 
					 
					
						
						
							
							Fix gemv broadcasting bug ( #6 )  
						
						 
						
						
						
						
						* Fix broadcasting bug in gemv
* Add relevant tests in test_blas.py 
						
						
					 
					
						2023-12-05 14:15:43 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						46a39e5b1f 
					 
					
						
						
							
							copyright + ack  
						
						 
						
						
						
						
					 
					
						2023-11-30 11:12:53 -08:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						e6306cfee9 
					 
					
						
						
							
							jagrit's commit files  
						
						 
						
						
						
						
					 
					
						2023-11-29 10:52:08 -08:00