Jagrit Digani 
							
						 
					 
					
						
						
							
						
						3290bfa690 
					 
					
						
						
							
							Add new sdpa function overload ( #2035 )  
						
						
						
						
						* Add new sdpa function overload
* Address comments
* Remove std::variant from cpp sdpa function
						
						
							
						
					 
					
						2025-04-03 11:58:28 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						8777fd104f 
					 
					
						
						
							
							Depthwise Conv2D optimization ( #2036 )  
						
						
						
						
						- Add a new specialized kernel for depthwise 2D convolutions with small kernels (kernel size <= 7) and small strides (stride <= 2)
- Add related tests 
						
						
							
						
					 
					
						2025-04-03 09:42:04 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c41f7565ed 
					 
					
						
						
							
							fix softmax / logsumexp ( #2042 )  
						
						
						
						
							
						
					 
					
						2025-04-03 08:32:59 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9ba81e3da4 
					 
					
						
						
							
							tune quant dispatch ( #2031 )  
						
						
						
						
							
						
					 
					
						2025-04-02 20:05:54 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c23888acd7 
					 
					
						
						
							
							Fix build warning ( #2033 )  
						
						
						
						
							
						
					 
					
						2025-04-01 14:42:27 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f98ce25ab9 
					 
					
						
						
							
							fix residency set for real ( #2032 )  
						
						
						
						
							
						
					 
					
						2025-04-01 12:59:48 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						de5f38fd48 
					 
					
						
						
							
							Custom logsumexp ( #2028 )  
						
						
						
						
						* initial custom logsumexp
* more tests
* comments + fix 
						
						
							
						
					 
					
						2025-03-31 07:36:55 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ec2854b13a 
					 
					
						
						
							
							Swap -inf for finite_minimum value ( #2029 )  
						
						
						
						
							
						
					 
					
						2025-03-30 21:55:04 -07:00 
						 
				 
			
				
					
						
							
							
								Stephen Panaro 
							
						 
					 
					
						
						
							
						
						90823d2938 
					 
					
						
						
							
							Add missing funcs to docs ( #2021 )  
						
						
						
						
							
						
					 
					
						2025-03-30 18:29:33 -07:00 
						 
				 
			
				
					
						
							
							
								Jesper Stemann Andersen 
							
						 
					 
					
						
						
							
						
						5f5770e3a2 
					 
					
						
						
							
							Fix CPU sign for unsigned ints ( #2024 )  
						
						
						
						
						Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
							
						
					 
					
						2025-03-30 17:56:59 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						28f39e9038 
					 
					
						
						
							
							Log for complex numbers in Metal ( #2025 )  
						
						
						
						
						* Log for complex numbers in Metal
* fix log2 
						
						
							
						
					 
					
						2025-03-30 17:04:38 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b2d2b37888 
					 
					
						
						
							
							fix residency set clearing ( #2027 )  
						
						
						
						
							
						
					 
					
						2025-03-30 16:27:26 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						fe597e141c 
					 
					
						
						
							
							add pinv to doc ( #2020 )  
						
						
						
						
							
						
					 
					
						2025-03-30 15:54:18 -07:00 
						 
				 
			
				
					
						
							
							
								Yi Wang 
							
						 
					 
					
						
						
							
						
						72ca1539e0 
					 
					
						
						
							
							Remove unused variable in  /setup.py ( #2026 )  
						
						
						
						
						This is a follow up of https://github.com/ml-explore/mlx/pull/2011  
						
						
							
						
					 
					
						2025-03-30 12:52:33 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						13b26775f1 
					 
					
						
						
							
							use minimum deployment target ( #2016 )  
						
						
						
						
							
						
					 
					
						2025-03-28 14:31:53 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						05d7118561 
					 
					
						
						
							
							causal vector sdpa ( #2018 )  
						
						
						
						
						* causal vector sdpa
* get rid of memory threshold 
						
						
							
						
					 
					
						2025-03-28 12:36:13 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						98b901ad66 
					 
					
						
						
							
							enable complex gemm ( #2017 )  
						
						
						
						
							
						
					 
					
						2025-03-28 10:45:13 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						5580b47291 
					 
					
						
						
							
							iinfo and scalar overflow detection ( #2009 )  
						
						
						
						
							
						
					 
					
						2025-03-27 19:54:56 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bc62932984 
					 
					
						
						
							
							sdpa specialization for head dim 256 ( #2007 )  
						
						
						
						
							
						
					 
					
						2025-03-27 19:31:25 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						a6b5d6e759 
					 
					
						
						
							
							revise cmake minimum for doctest ( #2014 )  
						
						
						
						
							
						
					 
					
						2025-03-27 19:30:58 -07:00 
						 
				 
			
				
					
						
							
							
								Yi Wang 
							
						 
					 
					
						
						
							
						
						a8931306e1 
					 
					
						
						
							
							Remove unused variable in CMakeBuild ( #2011 )  
						
						
						
						
						Fix https://github.com/ml-explore/mlx/issues/2010  
						
						
							
						
					 
					
						2025-03-27 16:00:51 -07:00 
						 
				 
			
				
					
						
							
							
								Yi Wang 
							
						 
					 
					
						
						
							
						
						fecdb8717e 
					 
					
						
						
							
							Polish CONTRIBUTING.md ( #2005 )  
						
						
						
						
							
						
					 
					
						2025-03-25 19:06:34 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						916fd273ea 
					 
					
						
						
							
							wire cache ( #2006 )  
						
						
						
						
							
						
					 
					
						2025-03-25 18:54:01 -07:00 
						 
				 
			
				
					
						
							
							
								Yi Wang 
							
						 
					 
					
						
						
							
						
						0da8506552 
					 
					
						
						
							
							Update docs for extensions ( #2004 )  
						
						
						
						
							
						
					 
					
						2025-03-25 18:35:03 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						eda7a7b43e 
					 
					
						
						
							
							Do not join threads during process exit on Windows ( #1738 )  
						
						
						
						
							
						
					 
					
						2025-03-25 06:33:08 -07:00 
						 
				 
			
				
					
						
							
							
								Chunyang Wen 
							
						 
					 
					
						
						
							
						
						022eabb734 
					 
					
						
						
							
							Remove unused import ( #1987 )  
						
						
						
						
							
						
					 
					
						2025-03-24 20:19:32 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						aba899cef8 
					 
					
						
						
							
							patch bump ( #2000 )  
						
						
						
						
							
 
						
					 
					
						2025-03-24 12:47:05 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						6a40e1c176 
					 
					
						
						
							
							Fix looping limit in causal attention ( #1999 )  
						
						
						
						
							
						
					 
					
						2025-03-24 12:28:00 -07:00 
						 
				 
			
				
					
						
							
							
								Jesper Stemann Andersen 
							
						 
					 
					
						
						
							
						
						9307b2ab8b 
					 
					
						
						
							
							Fixed 32-bit platform support for distributed/ring implementation ( #1996 )  
						
						
						
						
						Replaced unsigned long integer literals with size_t literals in ring implementation, e.g., 1UL with size_t(1). 
						
						
							
						
					 
					
						2025-03-24 08:08:40 -07:00 
						 
				 
			
				
					
						
							
							
								Jesper Stemann Andersen 
							
						 
					 
					
						
						
							
						
						522d8d3917 
					 
					
						
						
							
							Added missing netinet/in.h include that fixes build on FreeBSD ( #1997 )  
						
						
						
						
						Defines IPPROTO_TCP. 
						
						
							
						
					 
					
						2025-03-24 08:07:34 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						a84cc0123f 
					 
					
						
						
							
							promote mask when needed ( #1998 )  
						
						
						
						
							
						
					 
					
						2025-03-23 19:58:28 -07:00 
						 
				 
			
				
					
						
							
							
								Andrey Velichkevich 
							
						 
					 
					
						
						
							
						
						f018e248cd 
					 
					
						
						
							
							fix(backend): Include algorithm library in Allocator ( #1992 )  
						
						
						
						
						Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
						
						
							
						
					 
					
						2025-03-22 21:27:51 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						cfd7237a80 
					 
					
						
						
							
							fix docs ( #1991 )  
						
						
						
						
							
						
					 
					
						2025-03-21 19:58:53 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						4eef8102c9 
					 
					
						
						
							
							Distributed layers ( #1270 )  
						
						
						
						
							
						
					 
					
						2025-03-21 13:52:17 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						69e4dd506b 
					 
					
						
						
							
							Add a ring all gather ( #1985 )  
						
						
						
						
							
						
					 
					
						2025-03-21 13:36:51 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						25814a9458 
					 
					
						
						
							
							Disable mpi on version mismatch ( #1989 )  
						
						
						
						
							
						
					 
					
						2025-03-21 13:36:26 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						2a980a76ce 
					 
					
						
						
							
							Add stats and limit to common allocator and enable tests ( #1988 )  
						
						
						
						
						* add stats to common allocator and enable tests
* linux memory and default
* fix 
						
						
							
						
					 
					
						2025-03-21 12:28:36 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						d343782c8b 
					 
					
						
						
							
							Cross platform libmpi loading ( #1975 )  
						
						
						
						
							
						
					 
					
						2025-03-21 11:23:10 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4e1994e9d7 
					 
					
						
						
							
							move memory APIs into top level mlx.core ( #1982 )  
						
						
						
						
							
						
					 
					
						2025-03-21 07:25:12 -07:00 
						 
				 
			
				
					
						
							
							
								jiyzhang 
							
						 
					 
					
						
						
							
						
						65a38c452b 
					 
					
						
						
							
							update the formula of smooth_l1_loss ( #1986 )  
						
						
						
						
							
						
					 
					
						2025-03-21 06:25:23 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7b7e2352cd 
					 
					
						
						
							
							fix malloc or wait deadlock ( #1976 )  
						
						
						
						
							
						
					 
					
						2025-03-20 16:48:43 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1177d28395 
					 
					
						
						
							
							patch bump ( #1981 )  
						
						
						
						
							
 
						
					 
					
						2025-03-20 15:12:22 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						005e7efa64 
					 
					
						
						
							
							fix mask in sdpa ( #1980 )  
						
						
						
						
						* fix mask in sdpa
* fix attention mask
* Re-enable routing for array mask
---------
Co-authored-by: Jagrit Digani <digani@apple.com>
						
						
							
						
					 
					
						2025-03-20 14:53:12 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						b42d13ec84 
					 
					
						
						
							
							Update attention tests to show diff, disable array masks ( #1978 )  
						
						
						
						
							
						
					 
					
						2025-03-20 14:25:38 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						9adcd1a650 
					 
					
						
						
							
							Support fused masking in Attention ( #1924 )  
						
						
						
						
						* Update API to allow mask='causal' in fast::sdpa
* Add fallback
* Update steel::AttnParams
* Fix typo
* WIP, basic causal
* Update tests
* Update benchmarking
* Update masking loop limits
* Add bool masking and update tests
* Update additive mask
* Update benchmarks
* Update benchmarks
* Update tests
* Update for bfloat error
* Update early exit
* Add random seed to tests 
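
For readers unfamiliar with what a causal mask means in attention, the following is a minimal pure-Python reference sketch of the semantics only. It is not the fused kernel from this commit; the function name `sdpa_causal_reference` is hypothetical, and aligning the last query with the last key when the sequence lengths differ is an assumption.

```python
import math


def sdpa_causal_reference(q, k, v, scale):
    """Single-head causal attention over lists of vectors (illustrative only)."""
    # Assumption: when there are more keys than queries, the last query
    # lines up with the last key.
    offset = len(k) - len(q)
    out = []
    for i, qi in enumerate(q):
        # Query i may only see keys at positions 0 .. i + offset.
        limit = i + offset + 1
        scores = [
            scale * sum(a * b for a, b in zip(qi, k[j])) for j in range(limit)
        ]
        # Numerically stable softmax over the visible keys.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * v[j][d] for j, w in enumerate(weights)) / total
                for d in range(len(v[0]))
            ]
        )
    return out
```

Because future keys are never scored, changing a later value vector leaves earlier output rows untouched.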
						
						
							
						
					 
					
						2025-03-20 11:01:32 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						3c164fca8c 
					 
					
						
						
							
							Fix multistream GPU deadlock ( #1969 )  
						
						
						
						
						* fix multistream GPU deadlock
* comments 
						
						
							
						
					 
					
						2025-03-20 07:19:47 -07:00 
						 
				 
			
				
					
						
							
							
								jiyzhang 
							
						 
					 
					
						
						
							
						
						95e335db7b 
					 
					
						
						
							
							Update smooth_l1_loss in losses.py ( #1974 )  
						
						
						
						
						According to the definition of smooth_l1_loss, the line
diff = predictions - targets
should be updated to
diff = mx.abs(predictions - targets)
After the modification, the result is consistent with PyTorch's smooth_l1_loss.
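
To illustrate why the absolute value matters, here is a minimal scalar sketch in plain Python. It assumes the usual smooth L1 definition with a `beta` threshold; the function names are hypothetical, and the signed variant is a guess at how a branch test on the raw difference would behave, not a copy of the original losses.py code.

```python
def smooth_l1(prediction, target, beta=1.0):
    """Smooth L1 on a single value, branching on the absolute difference."""
    diff = abs(prediction - target)
    if diff < beta:
        return 0.5 * diff * diff / beta  # quadratic region near zero
    return diff - 0.5 * beta  # linear region for large errors


def smooth_l1_signed(prediction, target, beta=1.0):
    """Assumed pre-fix behavior: the branch test uses the signed difference."""
    diff = prediction - target
    if diff < beta:  # any negative diff lands here, however large
        return 0.5 * diff * diff / beta
    return abs(diff) - 0.5 * beta
```

With a prediction of 0.0 and a target of 3.0, the signed variant stays in the quadratic branch and over-penalizes the error, while the corrected version is symmetric in the sign of the difference.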
						
						
							
						
					 
					
						2025-03-19 20:19:02 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f90206ad74 
					 
					
						
						
							
							Guard nullptr dereference ( #1972 )  
						
						
						
						
						* guard nullptr dereference
* comment 
						
						
							
						
					 
					
						2025-03-19 16:24:10 -07:00 
						 
				 
			
				
					
						
							
							
								Chunyang Wen 
							
						 
					 
					
						
						
							
						
						3779150750 
					 
					
						
						
							
							refactor: all use schedule ( #1973 )  
						
						
						
						
							
						
					 
					
						2025-03-19 11:24:04 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						0a9777aa5c 
					 
					
						
						
							
							Do not define MLX_VERSION globally ( #1966 )  
						
						
						
						
							
						
					 
					
						2025-03-18 07:12:40 -07:00