Awni Hannun 
							
						 
					 
					
						
						
							
						
						19ec023256 
					 
					
						
						
							
							vmap matmul and addmm ( #836 )  
						
						
						
						
							
						
					 
					
						2024-03-14 14:38:22 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						63ab0ab580 
					 
					
						
						
							
							version ( #835 )  
						
						
						
						
							
 
						
					 
					
						2024-03-14 12:20:40 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						8dfc376c00 
					 
					
						
						
							
							Strided reduce specialization for small reductions ( #826 )  
						
						
						
						
						* Add small column / general reduction specialization 
						
						
							
						
					 
					
						2024-03-14 09:16:53 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						1efee9db09 
					 
					
						
						
							
							Add types and order in kernel name ( #831 )  
						
						
						
						
							
						
					 
					
						2024-03-13 20:34:06 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						43abc402d8 
					 
					
						
						
							
							route to fallback ( #828 )  
						
						
						
						
							
						
					 
					
						2024-03-13 19:56:04 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						3f8b1668c4 
					 
					
						
						
							
							Make reshape faster for row_contiguous cases ( #829 )  
						
						
						
						
							
						
					 
					
						2024-03-13 16:22:03 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						76c919b4ec 
					 
					
						
						
							
							NumberOfElements for shapeless compile and vmap fixes ( #802 )  
						
						
						
						
							
						
					 
					
						2024-03-13 10:34:14 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						29d0c10ee5 
					 
					
						
						
							
							Reshape improvement ( #818 )  
						
						
						
						
							
						
					 
					
						2024-03-12 17:54:31 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						5ad133f8bb 
					 
					
						
						
							
							No copy gemms ( #801 )  
						
						
						
						
						* Enable collapsing batch dims in gemm
* Update gemm to only make copies when neither of the last 2 axes are contiguous
* Update addmm to support gemv shapes
* Update addmm to support irregular batch strides
* Update tests 
						
						
							
						
					 
					
						2024-03-12 13:13:41 -07:00 
						 
				 
			
				
					
						
							
							
								nicolov 
							
						 
					 
					
						
						
							
						
						d0c544a868 
					 
					
						
						
							
							Add SVD primitive ( #809 )  
						
						
						
						
						Add SVD op using Accelerate's LAPACK following
https://developer.apple.com/documentation/accelerate/compressing_an_image_using_linear_algebra
Co-authored-by: Nicolo Valigi <nvaligi@apple.com>
						
						
							
						
					 
					
						2024-03-12 12:30:11 -07:00 
						 
				 
			
				
					
						
							
							
								Daniel Falbel 
							
						 
					 
					
						
						
							
						
						ffb19df3c0 
					 
					
						
						
							
							Fix docstring for correctly rendering ( #820 )  
						
						
						
						
							
						
					 
					
						2024-03-12 11:46:44 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8b7532b9ab 
					 
					
						
						
							
							fix scatter ( #821 )  
						
						
						
						
							
						
					 
					
						2024-03-12 11:42:07 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						366478c560 
					 
					
						
						
							
							fix modules with dict ( #819 )  
						
						
						
						
							
						
					 
					
						2024-03-12 08:54:06 -07:00 
						 
				 
			
				
					
						
							
							
								Justin Deschenaux 
							
						 
					 
					
						
						
							
						
						8e5600022a 
					 
					
						
						
							
							Implement RNN, GRU, LSTM ( #268 )  
						
						
						
						
						* RNN base implementation
* Address comments+format
* nits in docs
* add tests for prb
* fix test
* add a couple tests
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-03-11 21:14:44 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0e95b64942 
					 
					
						
						
							
							Fix bug in tape order during simplify ( #816 )  
						
						
						
						
						* fix bug in tape order during simplify
* properly fix compile
* last bug 
						
						
							
						
					 
					
						2024-03-11 17:29:05 -07:00 
						 
				 
			
				
					
						
							
							
								nicolov 
							
						 
					 
					
						
						
							
						
						0ae22b915b 
					 
					
						
						
							
							Remove code duplication in reduce ops ( #793 )  
						
						
						
						
						* Remove code duplication in reduce ops
* Remove the unnecessary lambda
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2024-03-11 10:57:07 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7c441600fe 
					 
					
						
						
							
							Compile stride bug ( #812 )  
						
						
						
						
						* fix compile stride bug
* revert sdpa fix
* fix cpu
* fix bug with simplifying outputs 
						
						
							
						
					 
					
						2024-03-11 06:31:31 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						a4d290adb9 
					 
					
						
						
							
							Remove depth traversal ( #813 )  
						
						
						
						
						* no depth traversal
* counter outside loop 
						
						
							
						
					 
					
						2024-03-09 20:21:32 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						28301807c2 
					 
					
						
						
							
							Version bump and os error ( #807 )  
						
						
						
						
							
 
						
					 
					
						2024-03-07 13:57:58 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						74ed0974b3 
					 
					
						
						
							
							Support 13.0+ with xcode 14.3 ( #806 )  
						
						
						
						
						* Support 13.0+ with xcode 14.3
* revert revert 
						
						
							
						
					 
					
						2024-03-07 13:27:57 -08:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						ec8a4864fa 
					 
					
						
						
							
							Fix SDPA kernel bug on Mac OS 13.3 SDK ( #805 )  
						
						
						
						
						* Move sdpa kernel to allocate tgp mem statically and allow macOS 13.3 SDK builds
* Style 
						
						
							
						
					 
					
						2024-03-07 10:18:09 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b7588fd5d7 
					 
					
						
						
							
							fix inplace to not make a shallow copy ( #804 )  
						
						
						
						
							
						
					 
					
						2024-03-07 09:34:11 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f512b905c7 
					 
					
						
						
							
							Minimum xcode / sdk ( #800 )  
						
						
						
						
						* minimum xcode /sdk
* try multiple xcode versions in CI
* update python
* metal validation for python tests 
						
						
							
						
					 
					
						2024-03-07 08:19:43 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						afd5274049 
					 
					
						
						
							
							route to fallback for bfloat ( #794 )  
						
						
						
						
							
						
					 
					
						2024-03-06 15:39:12 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1074674e32 
					 
					
						
						
							
							Add a maximum graph depth ( #797 )  
						
						
						
						
						* add a maximum graph depth
* remember how to use C++ 
						
						
							
						
					 
					
						2024-03-06 15:39:00 -08:00 
						 
				 
			
				
					
						
							
							
								AlexCheema 
							
						 
					 
					
						
						
							
						
						7762e07fde 
					 
					
						
						
							
							Update function_transforms.rst ( #796 )  
						
						
						
						
						Fix typo in function_transforms.rst 
						
						
							
						
					 
					
						2024-03-06 12:03:37 -08:00 
						 
				 
			
				
					
						
							
							
								Luca Arnaboldi 
							
						 
					 
					
						
						
							
						
						cbefd9129e 
					 
					
						
						
							
							Implementation of pickle, copy and deepcopy for Python arrays ( #300  &  #367 ). ( #713 )  
						
						
						
						
						* Implemented pickling and copy for Python arrays (#300 & #367)
* Fixing typos
* Pickle with NumPy arrays
* Pickle: workaround for bfloat16
* Revert "Pickle: workaround for bfloat16"
This reverts commit 25afe6bc09
* Update python/tests/test_array.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/array.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/array.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* clang-format applied
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
						
						
							
						
					 
					
						2024-03-06 08:02:41 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						e39bebe13e 
					 
					
						
						
							
							Fix reshaping of empty arrays ( #791 )  
						
						
						
						
							
						
					 
					
						2024-03-05 23:33:22 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						14b4e51a7c 
					 
					
						
						
							
							Improved quantized matrix vector product ( #786 )  
						
						
						
						
							
						
					 
					
						2024-03-05 17:32:19 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						cbcf44a4ca 
					 
					
						
						
							
							Some fixes in cache / thread safety ( #777 )  
						
						
						
						
						* some fixes in cache / thread safety
* speed up no cache case
* fix opt test
* optimizer docs
* optimizer docs
* fix adafactor
* fix adafactor 
						
						
							
						
					 
					
						2024-03-05 13:30:50 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						859ae15a54 
					 
					
						
						
							
							Fix test ( #785 )  
						
						
						
						
							
						
					 
					
						2024-03-04 23:02:27 -08:00 
						 
				 
			
				
					
						
							
							
								Brian Keene 
							
						 
					 
					
						
						
							
						
						0787724c44 
					 
					
						
						
							
							Fast Inference SDPA op ( #735 )  
						
						
						
						
						* Fast Inference SDPA op
Implements metal shaders for:
o = mx.fast_inference_sdpa(queries, keys, values, scale, mask)
Supports fp16, fp32 dtypes; assumes d_k = 128.
Generic op support / prompt encoding supported via mlx primitives.
Metal implementation is for the inference use case only.
The majority of the performance benefit appears to result from GQA & reduced
bandwidth requirements; there is approximate performance parity for the
MHA use case (from some measurements on M3 Max).
* Flush shared memory to zero before unprotected reads for (scores @ values)
* Move to fast:: namespace, address reviewer comments
... also attempt to revert formatter auto-change for files not relevant
to this change
* Shared memory flush to top of kernel
* Resolve compiler warnings
* Update python/src/fast.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/fast.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/fast.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/fast.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update docstring per PR feedback
* Softmax in higher precision, ...
* route to fallback for more use cases - batch size > 1, head_dim other
  than 128, etc.
* Address linux build failure
* Address other reviewer comments
* Remove extraneous eval_cpu function per review
---------
Co-authored-by: Atila Orhon <64497909+atiorh@users.noreply.github.com>
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: atila <atiorh@icloud.com>
						
						
							
						
					 
					
						2024-03-04 21:06:11 -08:00 
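For readers unfamiliar with the operation named in the "Fast Inference SDPA op ( #735 )" message, the following is a minimal pure-Python sketch of single-head scaled dot-product attention, softmax(scale * Q K^T + mask) V. It is not the Metal kernel from that commit and makes no claim about the library's actual signature or behavior. The function name `sdpa_reference`, the list-of-lists layout, and the additive-mask convention are assumptions made for illustration only.

```python
import math


def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sdpa_reference(queries, keys, values, scale, mask=None):
    """Hypothetical single-head reference, not the MLX implementation.

    queries: list of L_q vectors, each of length d
    keys:    list of L_k vectors, each of length d
    values:  list of L_k vectors, each of length d_v
    mask:    optional L_q x L_k list of additive terms (assumed convention)
    """
    d_v = len(values[0])
    out = []
    for i, q in enumerate(queries):
        # Scaled dot product of this query with every key.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        if mask is not None:
            scores = [s + m for s, m in zip(scores, mask[i])]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return out


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(sdpa_reference(q, k, v, scale=1.0 / math.sqrt(2)))
```

In grouped-query attention (GQA), several query heads share one key/value head, so fewer key and value tensors have to be read; a sketch like this would simply be called once per query head with the shared keys and values.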
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7b463ffb07 
					 
					
						
						
							
							Ios compile ( #784 )  
						
						
						
						
						* try to fix build for ios
* skip cpu compile
* fix namespace
* fix namespace
* Use CMake for platform specific cpu compile
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2024-03-04 20:02:26 -08:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						6686e61ca4 
					 
					
						
						
							
							Reduce update ( #783 )  
						
						
						
						
						* Split reduction files to reduce compile times
* Add small and medium axis size specializations for row reductions
* Add non-row-reduction options for small and med kernels 
						
						
							
						
					 
					
						2024-03-04 19:09:51 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c096a77b9b 
					 
					
						
						
							
							revision bump ( #778 )  
						
						
						
						
							
 
						
					 
					
						2024-03-04 13:41:53 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						5121f028d9 
					 
					
						
						
							
							nice tensordot for mlx c ( #782 )  
						
						
						
						
							
						
					 
					
						2024-03-04 09:51:02 -08:00 
						 
				 
			
				
					
						
							
							
								Piotr Rybiec 
							
						 
					 
					
						
						
							
						
						6a665ea6ed 
					 
					
						
						
							
							Dilation for convolutional layers ( #766 )  
						
						
						
						
						* add dilation parameter to Conv1d layer
* space here too
* add conv1d dilation test
* add dilation parameter for Conv2d layer
* conv2d dilation test 
						
						
							
						
					 
					
						2024-03-04 06:43:00 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bc06cb9ff6 
					 
					
						
						
							
							Pickle + dtype fix for numpy conversion ( #763 )  
						
						
						
						
						* pickle + dtype fix for numpy conversion
* fix getattribute on Module base
* remove unused function
* fix tests
* add topk to ops
* fix doc 
						
						
							
						
					 
					
						2024-03-02 06:09:29 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						8e281c76c3 
					 
					
						
						
							
							Fix the top-k op ( #768 )  
						
						
						
						
							
						
					 
					
						2024-03-01 22:08:43 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d5964a2710 
					 
					
						
						
							
							bindings for memory info ( #761 )  
						
						
						
						
						* bindings for memory info
* update api
* keep cache low if requested
* fix default
* nit in ops error 
						
						
							
						
					 
					
						2024-03-01 19:51:58 -08:00 
						 
				 
			
				
					
						
							
							
								Ikko Eltociear Ashimine 
							
						 
					 
					
						
						
							
						
						cf3eb87e52 
					 
					
						
						
							
							Fix typo in transforms.cpp ( #764 )  
						
						
						
						
						occuring -> occurring 
						
						
							
						
					 
					
						2024-02-29 22:23:46 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ab3a466711 
					 
					
						
						
							
							bump ( #760 )  
						
						
						
						
							
 
						
					 
					
						2024-02-29 11:58:54 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4494970f47 
					 
					
						
						
							
							avoid nested closures in module ( #759 )  
						
						
						
						
							
						
					 
					
						2024-02-29 09:39:52 -08:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						776c3d226d 
					 
					
						
						
							
							Convolution update  ( #651 )  
						
						
						
						
						* Init steel conv and update Conv primitive
* Update slow CPU implementation to support flipping and input dilation winograd conv routing
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-02-28 20:11:16 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f5f18b704f 
					 
					
						
						
							
							fix temporary bug ( #752 )  
						
						
						
						
							
						
					 
					
						2024-02-27 17:44:39 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						420ff2f331 
					 
					
						
						
							
							Add back compiled function signatures and docstrings ( #749 )  
						
						
						
						
						* try to add back compiled function signatures and docstrings
* add indentation to docstring 
						
						
							
						
					 
					
						2024-02-27 13:18:59 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						56ba3ec40e 
					 
					
						
						
							
							fix cpu compile on older OS ( #747 )  
						
						
						
						
							
						
					 
					
						2024-02-26 22:20:53 -08:00 
						 
				 
			
				
					
						
							
							
								Noah Kasmanoff 
							
						 
					 
					
						
						
							
						
						de3d2467a3 
					 
					
						
						
							
							Update: Fast GeLU Approximation ( #744 )  
						
						
						
						
						* add: fast gelu approx
* fix docs
* Update gelu_fast_approx function documentation
* Update python/mlx/nn/layers/activations.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* fix: test gelu
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
						
						
							
						
					 
					
						2024-02-26 21:08:50 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						fe1dabf272 
					 
					
						
						
							
							Fix compile with non standard types ( #745 )  
						
						
						
						
						* refactor tree utils
* fix compile + tree code refactor
* Add an extra test
* add a few missing activations to docs
* hash structure
* Encode the full argument structure
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2024-02-26 19:28:53 -08:00 
						 
				 
			
				
					
						
							
							
								Hinrik Snær Guðmundsson 
							
						 
					 
					
						
						
							
						
						08226ab491 
					 
					
						
						
							
							added atleast *args input support ( #710 )  
						
						
						
						
						* added atleast list(array) input support
* function overloading implemented
* Refactoring
* fixed formatting
* removed pos_only 
						
						
							
						
					 
					
						2024-02-26 11:17:59 -08:00