Author | Commit | Message | Date
Awni Hannun | c6a20b427a | Improve metal elementwise kernels (#2247) | 2025-06-06 11:37:40 -07:00
    * improve metal elementwise kernels
    * compile and copy
    * fix jit
Awni Hannun | a5ac9244c4 | fix linux linking error (#2248) | 2025-06-06 10:41:51 -07:00
Awni Hannun | c763fe1be0 | default strict mode for module update and update_modules (#2239) | 2025-06-05 15:27:02 -07:00
Cheng | 52dc8c8cd5 | Add profiler annotations in common primitives for CUDA backend (#2244) | 2025-06-04 19:55:12 -07:00
Angelos Katharopoulos | aede70e81d | Perf regression fix (#2243) [v0.26.1] | 2025-06-03 17:55:12 -07:00
Cheng | 85a8beb5e4 | Avoid atomic updates across CPU/GPU in CUDA event (#2231) | 2025-06-03 16:49:06 -07:00
Cheng | 0bb89e9e5f | Share more common code in Compiled (#2240) | 2025-06-03 16:48:50 -07:00
    * Share more common code in Compiled
    * Remove build_lib_name
Cheng | 5685ceb3c7 | Avoid invoking allocator::malloc when creating CUDA event (#2232) | 2025-06-03 16:48:40 -07:00
Suryash Malviya | 0408ba0a76 | Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm (#2220) [v0.26.0] | 2025-06-02 15:58:46 -07:00
    * Implementing Complex Matmul using Karatsuba Algorithm
    * Implemented Karatsuba's Algorithm for complex matmul and pre-commit them
    * fix
    ---------
    Co-authored-by: Awni Hannun <awni@apple.com>
Awni Hannun | cbad6c3093 | version (#2237) | 2025-06-02 15:58:33 -07:00
Cheng | 1b021f6984 | Fast primitives decide when to use the fallback (#2216) | 2025-06-02 13:26:37 -07:00
Cheng | 95b7551d65 | Do not check event.is_signaled() in eval_impl (#2230) | 2025-06-02 13:23:34 -07:00
Cheng | db5a7c6192 | Add memory cache to CUDA backend (#2221) | 2025-05-30 12:12:54 -07:00
    * Move BufferCache out of allocator
    * Add memory cache to cuda backend allocator
    * Simplify BufferCache assuming buf can not be null
Awni Hannun | 6ef2f67e7f | 5bit quants (#2226) | 2025-05-30 12:12:10 -07:00
    * 5bit quants
    * 5bit quants
Cheng | f76ee1ffd2 | Move some dims utils to common (#2223) | 2025-05-29 06:48:30 -07:00
Cheng | 54a71f270a | Remove unused defines (#2217) | 2025-05-23 06:14:58 -07:00
Awni Hannun | 55b4062dd8 | copyright in docs (#2214) | 2025-05-21 17:13:04 -07:00
Cheng | 79071bfba4 | Fix out-of-bounds default value in logsumexp/softmax (#2213) | 2025-05-21 07:25:16 -07:00
Cheng | 7774b87cbd | Remove redundant simd_sum in logsumexp (#2210) | 2025-05-21 07:25:03 -07:00
Cheng | 35c87741cf | Build for compute capability 70 instead of 75 (#2209) | 2025-05-20 19:42:48 -07:00
Jack Wind | 4cbe605214 | Feat: Allow per-target Metal debug flags (#2201) | 2025-05-20 10:22:26 -07:00
    * feat: allow per-target Metal debug flags
    * formatting fix
Clement Liaw | ab8883dd55 | include mlx::core::version() symbols in the mlx static library (#2207) | 2025-05-20 07:39:11 -07:00
Awni Hannun | eebe73001a | fix large arg reduce (#2206) | 2025-05-19 13:10:44 -07:00
Angelos Katharopoulos | 0359bf02c9 | Nearest upsample (#2202) | 2025-05-19 11:23:38 -07:00
Cheng | 237f9e58a8 | Fix BEFORE keyword in target_include_directories (#2204) | 2025-05-19 06:10:44 -07:00
Awni Hannun | 8576e6fe36 | fix conv2d bug + faster conv 1d (#2195) | 2025-05-18 06:05:11 -07:00
    * fix conv2d bug + faster conv 1d
    * revert sort + flaky test
Angelos Katharopoulos | 0654543dcc | Add complex eigh (#2191) | 2025-05-18 00:18:43 -07:00
Awni Hannun | 48ef3e74e2 | reduce vjp for all and any (#2193) | 2025-05-16 08:38:49 -07:00
Cheng | 7d4b378952 | Include cuda_bf16.h for bfloat16 overloads (#2192) | 2025-05-16 06:44:42 -07:00
    * Include cuda_bf16.h for bfloat16 overloads
    * Add NO_GPU_MULTI(Eig) in cuda backend
Jack Wind | 7ff5c41e06 | Add set_threadgroup_memory_length to CommandEncoder (#2183) | 2025-05-16 00:28:03 -07:00
Awni Hannun | 602f43e3d1 | fix conv grad (#2187) | 2025-05-15 19:20:36 -07:00
Awni Hannun | a2cadb8218 | real and imag properties (#2189) | 2025-05-15 18:17:50 -07:00
Awni Hannun | c1eb9d05d9 | non-symmetric eig and eigh (#2188) | 2025-05-15 13:01:44 -07:00
Angelos Katharopoulos | cf6c939e86 | Fix some complex vjps (#2178) | 2025-05-14 23:37:12 -07:00
Angelos Katharopoulos | 130df35e1b | Add random normal distribution for complex numbers (#2182) | 2025-05-13 22:43:45 -07:00
Cheng | 0751263dec | Fix typo in row_reduce_small (#2179) | 2025-05-13 20:19:54 -07:00
Cheng | eca2f3eb97 | Add remove_index utility (#2173) | 2025-05-13 17:09:56 -07:00
Angelos Katharopoulos | 3aa9cf3f9e | Fix put_along_axis for empty arrays (#2181) | 2025-05-13 14:27:53 -07:00
Awni Hannun | 8f3d208dce | Close a couple edge case bugs: hadamard and addmm on empty inputs (#2177) | 2025-05-12 10:48:57 -07:00
    * handle hadamard and addmm on empty inputs
    * fix
Ivan Fioravanti | caaa3f1f8c | Small typos in mx.metal deprecations (#2176) | 2025-05-11 06:03:47 -07:00
Awni Hannun | 659a51919f | patch bump (#2162) [v0.25.2] | 2025-05-09 14:35:14 -07:00
Awni Hannun | 6661387066 | Fix fft for integer overflow (#2161) | 2025-05-09 14:25:12 -07:00
ATurker | a7fae8a176 | fix: conv_general differences between gpu, cpu (#2070) | 2025-05-09 10:26:52 -07:00
    * fix general_conv padding
    * fix bugs
    * add test
    ---------
    Co-authored-by: Awni Hannun <awni@apple.com>
Cheng | 0cae0bdac8 | CUDA backend: backbone (#2075) | 2025-05-06 21:26:46 -07:00
Awni Hannun | 5a1a5d5ed1 | fix input coherent kernel launch (#2153) | 2025-05-05 17:30:50 -07:00
Cheng | 1683975acf | Move common gpu primitives to backend/gpu (#2145) | 2025-05-05 13:45:29 -07:00
Awni Hannun | af705590ac | fix batched vector sdpa (#2152) | 2025-05-05 13:13:03 -07:00
Awni Hannun | 825124af8f | fix bw for elementwise ops (#2151) | 2025-05-05 06:15:04 -07:00
    * fix bw for elementwise ops
    * add compile
    * fix
    * fix
    * fix
    * fix
Awni Hannun | 9c5e7da507 | fix compile merging (#2150) | 2025-05-02 15:08:50 -07:00
Angelos Katharopoulos | 481349495b | GPU Hadamard for large N (#1879) | 2025-05-01 17:19:17 -07:00