Piotr Rybiec 
							
						 
					 
					
						
						
							
						
						581b699ac9 
					 
					
						
						
							
							avgpool, not maxpool ( #1002 )  
						
						
						
						
							
						
					 
					
						2024-04-17 08:26:22 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8a0677d56d 
					 
					
						
						
							
							Shared events for synchronization + async eval ( #998 )  
						
						
						
						
						* more async eval
* fix rebase
* try correct async eval
* fix async
* more tests for async eval
* use shared events for synchronization
* comment + cleanup
* with autorelease pool
* fix no metal build
* fix compile
* fix patch
* don't eval if async eval'd
* don't use is_evaled
* comments
* more multi stream tests
* try and cleanup use of is_evaled
* use a status flag 
						
						
							
						
					 
					
						2024-04-17 06:16:02 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						b18468bf81 
					 
					
						
						
							
							Masked mm ( #978 )  
						
						
						
						
						* Add block masked matmul op and primitive 
						
						
							
						
					 
					
						2024-04-16 14:45:39 -07:00 
						 
				 
			
				
					
						
							
							
								Shiyu 
							
						 
					 
					
						
						
							
						
						107ba2891a 
					 
					
						
						
							
							gelu tanh approx ( #989 )  
						
						
						
						
						* gelu tanh approx
* gelu tanh approx
* replace gelu approx with tanh approach
* fix comments
* fix comment 
						
						
							
						
					 
					
						2024-04-15 19:49:00 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						cd9e184529 
					 
					
						
						
							
							Quantize embedding ( #994 )  
						
						
						
						
						* quantize embedding
* rename as_linear + comment
* consistency in docs
* fix test 
						
						
							
						
					 
					
						2024-04-15 16:42:10 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						2e7c02d5cd 
					 
					
						
						
							
							Metal FFT for powers of 2 up to 2048 ( #915 )  
						
						
						
						
						* add Metal FFT for powers of 2
* skip GPU test on linux
* fix contiguity bug
* address comments
* Update mlx/backend/metal/fft.cpp
* Update mlx/backend/metal/fft.cpp
* fix bug in synch
---------
Co-authored-by: Alex Barron <abarron22@apple.com>
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-04-11 21:40:06 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ae18326533 
					 
					
						
						
							
							No copy command encoder ( #986 )  
						
						
						
						
						* no copy command encoder
* up layer norm test tolerances 
						
						
							
						
					 
					
						2024-04-11 21:15:36 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Shepard 
							
						 
					 
					
						
						
							
						
						91eba8e485 
					 
					
						
						
							
							fix for grammatical typo in docs ( #988 )  
						
						
						
						
						thanks for mlx! 
						
						
							
						
					 
					
						2024-04-11 17:02:06 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d07e295c62 
					 
					
						
						
							
							bumpity bump ( #987 )  
						
						
						
						
							
 
						
					 
					
						2024-04-11 12:48:52 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						dce4bd74a4 
					 
					
						
						
							
							Add ArrayDesc destructor to avoid possible stack overflow ( #982 )  
						
						
						
						
							
						
					 
					
						2024-04-11 11:37:02 -07:00 
						 
				 
			
				
					
						
							
							
								Nripesh Niketan 
							
						 
					 
					
						
						
							
						
						ffff671273 
					 
					
						
						
							
							Update pre-commit hooks ( #984 )  
						
						
						
						
							
						
					 
					
						2024-04-11 07:27:53 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						12d4507ee3 
					 
					
						
						
							
							Explicit barriers with concurrent dispatch ( #977 )  
						
						
						
						
							
						
					 
					
						2024-04-10 21:45:31 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8580d997ff 
					 
					
						
						
							
							Try a stack-based DFS for eval ( #980 )  
						
						
						
						
						* rebase
* nit
* fix eval in vmap 
						
						
							
						
					 
					
						2024-04-10 17:05:13 -07:00 
						 
				 
			
				
					
						
							
							
								Shiyu 
							
						 
					 
					
						
						
							
						
						061cf9a4ce 
					 
					
						
						
							
							Upsample with bicubic interpolation ( #967 )  
						
						
						
						
							
						
					 
					
						2024-04-10 15:47:22 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						99abb9eff4 
					 
					
						
						
							
							Async eval ( #972 )  
						
						
						
						
							
						
					 
					
						2024-04-09 18:34:00 -07:00 
						 
				 
			
				
					
						
							
							
								Luca Arnaboldi 
							
						 
					 
					
						
						
							
						
						fffe072028 
					 
					
						
						
							
							Implementation of mlx.random.multivariate_normal ( #502 ) ( #877 )  
						
						
						
						
						* Implementation of mlx.random.multivariate_normal (#502)
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Updated typo in docstring
* Restricted multivariate_normal to float32
* Generic mean and variance shapes
* Review edits
* Update mlx/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/src/random.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Test for ndim of mean and cov
* nits
* smaller size for test
* fix broadcasted sampling
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-04-09 13:50:12 -07:00 
						 
				 
			
				
					
						
							
							
								Abe Leininger 
							
						 
					 
					
						
						
							
						
						a1a31eed27 
					 
					
						
						
							
							Add mx.meshgrid ( #961 )  
						
						
						
						
							
						
					 
					
						2024-04-09 11:43:08 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ae812350f9 
					 
					
						
						
							
							use string ( #976 )  
						
						
						
						
							
						
					 
					
						2024-04-09 11:22:00 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b63ef10a7f 
					 
					
						
						
							
							Extensions ( #962 )  
						
						
						
						
						* start to fix extensions
* mostly fixed extensions
* fix extension build
* couple more nits 
						
						
							
						
					 
					
						2024-04-09 08:50:36 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						42afe27e12 
					 
					
						
						
							
							std and expm1 ( #973 )  
						
						
						
						
						* std and expm1
* actually add expm1
* fix linux
* fix vjp
* relax tol for linux test
* Add it to the compilable primitives
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2024-04-08 14:26:01 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						76e63212ff 
					 
					
						
						
							
							Enable bfloat scan ( #974 )  
						
						
						
						
						* enable bfloat scan
* fix tests 
						
						
							
						
					 
					
						2024-04-08 12:29:19 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						aac2f9fb61 
					 
					
						
						
							
							Improve profiling with gpu tracing ( #969 )  
						
						
						
						
						* improve profiling with gpu tracing
* fix for linux
* nit
* doc fix
* fix example 
						
						
							
						
					 
					
						2024-04-07 21:47:43 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bddf23f175 
					 
					
						
						
							
							patch bump ( #956 )  
						
						
						
						
							
 
						
					 
					
						2024-04-04 11:56:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						039da779d1 
					 
					
						
						
							
							No quant reshape ( #957 )  
						
						
						
						
						* precise option on cpu
* remove print
* remove reshape in quant matmul
* no quant reshape 
						
						
							
						
					 
					
						2024-04-04 11:52:12 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d88d2124b5 
					 
					
						
						
							
							segfault layer norm grad ( #955 )  
						
						
						
						
							
						
					 
					
						2024-04-04 10:59:15 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e142aaf8a1 
					 
					
						
						
							
							Option for precise softmax ( #953 )  
						
						
						
						
						* precise softmax
* Add an equivalency check
* Make the threadgroup memory definition fixed
* precise cpu softmax
* precise option on cpu
* remove print
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2024-04-04 08:32:35 -07:00 
						 
				 
			
				
					
						
							
							
								AmirHossein_Razlighi 
							
						 
					 
					
						
						
							
						
						0caf35f4b8 
					 
					
						
						
							
							Better exceptions in case of invalid operations on mlx.core.array ( #910 ) ( #926 )  
						
						
						
						
						* Nicer exceptions for ops on non-arrays 
						
						
							
						
					 
					
						2024-04-02 21:11:24 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						3fc993f82d 
					 
					
						
						
							
							Properly handle negative axes in python vmap ( #944 )  
						
						
						
						
							
						
					 
					
						2024-04-02 18:07:23 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						741eb28443 
					 
					
						
						
							
							fix a couple bugs ( #952 )  
						
						
						
						
							
						
					 
					
						2024-04-02 12:07:41 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						1a87dc5ea8 
					 
					
						
						
							
							Fix compile fusion for multi-output edge cases ( #950 )  
						
						
						
						
						* Fix compile fusion for multi-output edge cases
* Add a test for multi-output compile 
						
						
							
						
					 
					
						2024-04-02 08:42:31 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						2427fa171e 
					 
					
						
						
							
							Fix cpu compile ( #934 )  
						
						
						
						
						* fix one cpu bug, test for another
* format hooks
* simplify contiguity check for cpu compile
* fix
* add back donation
* comment 
						
						
							
						
					 
					
						2024-04-01 17:37:12 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						639e06e1f3 
					 
					
						
						
							
							Indexing bug fix ( #947 )  
						
						
						
						
						* Fix axes accounting
* Add tests 
						
						
							
						
					 
					
						2024-04-01 12:18:50 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						02fedbf1da 
					 
					
						
						
							
							Fix array initialization from list ( #942 )  
						
						
						
						
						* Fix array initialization from list
* Change the error message in the test 
						
						
							
						
					 
					
						2024-04-01 06:27:52 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						110d9b149d 
					 
					
						
						
							
							Layer norm grad fix donation bug ( #941 )  
						
						
						
						
						* add layer norm grad test
* Fix donation bug in layernorm vjp
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-04-01 06:15:50 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						9cbff5ec1d 
					 
					
						
						
							
							Fix typo in qmm check ( #940 )  
						
						
						
						
							
						
					 
					
						2024-03-31 19:15:44 -07:00 
						 
				 
			
				
					
						
							
							
								Suvan Kumar 
							
						 
					 
					
						
						
							
						
						433c0206b0 
					 
					
						
						
							
							Update saving_and_loading.rst ( #929 )  
						
						
						
						
						Update saving / load docs. 
						
						
							
						
					 
					
						2024-03-30 14:30:06 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8915901966 
					 
					
						
						
							
							Donation bug ( #933 )  
						
						
						
						
						* donation
* buf
* fix bug in softmax
* comment
* remove print 
						
						
							
						
					 
					
						2024-03-30 10:08:54 -07:00 
						 
				 
			
				
					
						
							
							
								AmirHossein_Razlighi 
							
						 
					 
					
						
						
							
						
						f48bc496c7 
					 
					
						
						
							
							Comparing python objects (such as list/tuple) with mlx.core.array ( #920 )  
						
						
						
						
						* add implicit conversion of list to array for equality constraint
* add tests for array equality
* add test for tuple and array equality
* return False if __eq__ arg is list or tuple
* write tests for equality
* update the rule of comparison for __ge__/__gt__/__lt__/__le__
* add a helper function for detecting mlx.core.array
* return true in case of inequality
* debug minor issue regarding detecting mlx array
* add tests for inequality comparisons
* add name for contribution
* reformat files using pre-commit
* update tests for float
* update tests for inequality
* raise exception in case of invalid comparisons
* use isinstance instead of string comparison
* replace "is_convirtable_to_array" with previous logic
* remove throwing exceptions for other operations
* just a comment
* minor changes for efficiency
* optimize a utils function
* change the function name
* Update ACKNOWLEDGMENTS.md
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
						
						
							
						
					 
					
						2024-03-29 06:52:30 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						913b19329c 
					 
					
						
						
							
							Add missing && when forwarding args ( #925 )  
						
						
						
						
						Without the &&, args would be copied and perfect forwarding would not work. 
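The effect described in this commit message can be sketched in isolation. The snippet below is only an illustration, not the code touched by the commit: `Tracker`, `copies_seen`, `call_by_value`, and `call_forwarding` are hypothetical names introduced here. It compares a variadic wrapper that declares its parameters as `Args...` with one that declares them as `Args&&...`.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type (not from MLX) that records how many times
// it has been copy-constructed.
struct Tracker {
  int copies = 0;
  Tracker() = default;
  Tracker(const Tracker& other) : copies(other.copies + 1) {}
  Tracker(Tracker&& other) noexcept : copies(other.copies) {}
};

// Callee that only inspects its argument.
int copies_seen(const Tracker& t) { return t.copies; }

// Without &&: each argument is deduced as a plain value, so an lvalue
// argument is copied into the wrapper before it is forwarded.
template <typename... Args>
int call_by_value(Args... args) {
  return copies_seen(std::forward<Args>(args)...);
}

// With &&: Args&& is a forwarding reference, so the caller's object is
// passed through without an intermediate copy.
template <typename... Args>
int call_forwarding(Args&&... args) {
  return copies_seen(std::forward<Args>(args)...);
}

// Usage: the same lvalue goes through both wrappers.
void forwarding_example() {
  Tracker t;
  assert(call_by_value(t) == 1);    // one copy made by the wrapper
  assert(call_forwarding(t) == 0);  // no copy
}
```

Both wrappers compile and both call `std::forward`, which is why a missing `&&` is easy to overlook: the only observable difference is the extra copy.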
						
						
							
						
					 
					
						2024-03-29 06:48:29 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d8cb3128f6 
					 
					
						
						
							
							bump ( #924 )  
						
						
						
						
						* bump
* fix version 
						
						
							
 
						
					 
					
						2024-03-28 16:14:55 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						5f9ba3019f 
					 
					
						
						
							
							Fix qmm_t for unaligned cases ( #923 )  
						
						
						
						
							
						
					 
					
						2024-03-28 15:34:57 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						46caf0bef0 
					 
					
						
						
							
							Remove unnecessary string copies ( #891 )  
						
						
						
						
						1. Use string_view instead of string when there is no need for copy.
2. Otherwise move string when possible. 
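The two rules in this commit message can be illustrated with a minimal sketch, assuming C++17. The names `has_prefix` and `Named` are hypothetical and are not taken from the MLX sources.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Rule 1: a function that only reads the text takes std::string_view,
// so neither std::string nor string-literal arguments are copied.
bool has_prefix(std::string_view text, std::string_view prefix) {
  return text.substr(0, prefix.size()) == prefix;
}

// Rule 2: a type that must own the text takes std::string by value
// and moves it into place instead of copying it a second time.
class Named {
 public:
  explicit Named(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

// Usage: read-only access first, then hand ownership over with a move.
void string_example() {
  std::string op = "mlx.core.add";
  assert(has_prefix(op, "mlx."));
  Named n(std::move(op));
  assert(n.name() == "mlx.core.add");
}
```

A `std::string_view` does not own its characters, so it suits parameters that are consumed during the call and is not a safe choice for data that has to outlive the caller's string.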
						
						
							
						
					 
					
						2024-03-28 13:14:59 -07:00 
						 
				 
			
				
					
						
							
							
								Jack Mousseau 
							
						 
					 
					
						
						
							
						
						45f636e759 
					 
					
						
						
							
							Add Metal debug option and capture functions ( #707 )  
						
						
						
						
						* Add Metal debug option and capture functions
* Add brief Metal debugger documentation
* doc nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
							
						
					 
					
						2024-03-28 09:40:31 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						a7b404ff53 
					 
					
						
						
							
							Use uintptr_t instead of size_t to store function id ( #916 )  
						
						
						
						
						Also does some small cleanup of the compile cache code. 
						
						
							
						
					 
					
						2024-03-28 06:37:59 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						c4fd0e5ede 
					 
					
						
						
							
							Fixes #918 bug in compile_tests ( #919 )  
						
						
						
						
							
						
					 
					
						2024-03-27 22:37:37 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						bab5386306 
					 
					
						
						
							
							Make ops aware of rvalues: astype/as_strided/copy/full ( #895 )  
						
						
						
						
						When composing transforms, lots of temporary arrays are created
and passed to the next primitive; by making ops accept args by value
we can avoid many copies of temporary arrays. 
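A minimal sketch of why by-value parameters help with chained temporaries, assuming C++17. `Buffer`, `scale_cref`, and `scale_value` are hypothetical stand-ins; they are not MLX's array class or its ops.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for an array type that counts its copies.
struct Buffer {
  inline static int copies = 0;
  std::vector<float> data;
  explicit Buffer(std::vector<float> d) : data(std::move(d)) {}
  Buffer(const Buffer& other) : data(other.data) { ++copies; }
  Buffer(Buffer&&) noexcept = default;
};

// Taking a const reference: the op has to copy its input to get storage
// it may modify, even when the caller passed a temporary.
Buffer scale_cref(const Buffer& in, float s) {
  Buffer out(in);
  for (float& v : out.data) v *= s;
  return out;
}

// Taking the argument by value: a temporary becomes the parameter
// directly, and the parameter is moved out on return.
Buffer scale_value(Buffer in, float s) {
  for (float& v : in.data) v *= s;
  return in;
}

// Usage: chain two ops on a temporary and compare the copy counts.
void by_value_example() {
  Buffer::copies = 0;
  Buffer a = scale_cref(scale_cref(Buffer(std::vector<float>{1.f, 2.f}), 2.f), 3.f);
  assert(Buffer::copies == 2);  // one copy per op

  Buffer::copies = 0;
  Buffer b = scale_value(scale_value(Buffer(std::vector<float>{1.f, 2.f}), 2.f), 3.f);
  assert(Buffer::copies == 0);  // temporaries are only moved
  assert(a.data == b.data);
}
```

With by-value parameters, a caller that passes a named object it still needs pays for exactly one copy, while a caller that passes a temporary or uses `std::move` pays for none.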
						
						
							
						
					 
					
						2024-03-27 22:35:55 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						aca7584635 
					 
					
						
						
							
							Fix OOB read in qmv when non-divisible by blocksize ( #917 )  
						
						
						
						
							
						
					 
					
						2024-03-27 22:18:35 -07:00 
						 
				 
			
				
					
						
							
							
								AmirHossein_Razlighi 
							
						 
					 
					
						
						
							
						
						d611251502 
					 
					
						
						
							
							Support Chaining for some of functionalities of nn.Module ( #885 ) ( #897 )  
						
						
						
						
						* add chaining support for some of the functionalities of "nn.Module"
* reformat
* change the return types
* remove return types
* add return type with forward referencing
* add tests for chaining
* add name to contributors
* Update python/mlx/nn/layers/base.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Update python/mlx/nn/layers/base.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* update docstring
* update docstrings
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
						
						
							
						
					 
					
						2024-03-27 19:58:29 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						f30b659291 
					 
					
						
						
							
							Make MLX build on x64 macOS ( #901 )  
						
						
						
						
						The arm64 MacBook Pros are heavy and I usually carry my Intel one when mobile, so it would be
nice to be able to play with MLX on it.
To build for x64, the user must pass `MLX_ENABLE_X64_MAC` to cmake:
CMAKE_ARGS='-DMLX_ENABLE_X64_MAC=ON' python setup.py 
						
						
							
						
					 
					
						2024-03-27 06:14:29 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						90dfa43ff1 
					 
					
						
						
							
							Don't use make_unique to create shared_ptr ( #902 )  
						
						
						
						
						The code compiled because shared_ptr's constructor actually accepts
unique_ptr. 
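The behaviour this message refers to can be reproduced with a short sketch. `Node`, `via_unique`, and `via_shared` are hypothetical names used only for illustration and do not come from the MLX code base.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type.
struct Node {
  explicit Node(int v) : value(v) {}
  int value;
};

// Compiles because std::shared_ptr has a converting constructor that
// takes a std::unique_ptr rvalue. The object is allocated first and the
// control block is allocated separately during the conversion.
std::shared_ptr<Node> via_unique(int v) {
  return std::make_unique<Node>(v);
}

// Direct form: std::make_shared can typically place the object and its
// control block in a single allocation.
std::shared_ptr<Node> via_shared(int v) {
  return std::make_shared<Node>(v);
}

// Usage: both results behave the same from the caller's point of view.
void shared_ptr_example() {
  auto a = via_unique(1);
  auto b = via_shared(2);
  assert(a->value == 1 && a.use_count() == 1);
  assert(b->value == 2 && b.use_count() == 1);
}
```

Since both forms yield a valid `std::shared_ptr`, the compiler does not flag the first one; the difference is limited to how the allocation happens.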
						
						
							
						
					 
					
						2024-03-27 06:13:29 -07:00