Awni Hannun 
							
						 
					 
					
						
						
							
						
						1fa0d20a30 
					 
					
						
						
							
							consistently handle all -inf in softmax ( #1470 )  
						
						
						
						
					 
					
						2024-10-08 09:54:02 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						3274c6a087 
					 
					
						
						
							
							Fix array is_available race cases ( #1468 )  
						
						
						
						
					 
					
						2024-10-07 19:13:50 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						9b12093739 
					 
					
						
						
							
							Add the roll op ( #1455 )  
						
						
						
						
					 
					
						2024-10-07 17:21:42 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f374b6ca4d 
					 
					
						
						
							
							Bump nanobind to 2.2 ( #1461 )  
						
						
						
						
						* bump nanobind
* extension version for tests 
						
						
					 
					
						2024-10-07 16:52:40 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0070e1db40 
					 
					
						
						
							
							Fix deep recursion with siblings ( #1462 )  
						
						
						
						
						* fix recursion with siblings
* fix
* add test
* increase tol 
						
						
					 
					
						2024-10-07 06:15:33 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e4534dac17 
					 
					
						
						
							
							Conv grad with groups + bugfix ( #1449 )  
						
						
						
						
						* fix bug in flipped conv with groups, start of grad for groups
* fix
* fix
* fix + test 
						
						
					 
					
						2024-10-06 07:08:53 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1bdc038bf9 
					 
					
						
						
							
							fix argpartition + faster {arg} sorts / partitions ( #1453 )  
						
						
						
						
					 
					
						2024-10-03 14:21:25 -07:00 
						 
				 
			
				
					
						
							
							
								Lucas Newman 
							
						 
					 
					
						
						
							
						
						4a64d4bff1 
					 
					
						
						
							
							Add support for grouped 1D convolutions to the nn API ( #1444 )  
						
						
						
						
						* Fix the weight shape for grouped convolutions from the nn API.
* Add tests.
* Pre-commit formatting.
* Add input validation.
* Use integer division instead of casting.
* docs
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-09-28 06:41:07 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						718aea3f1d 
					 
					
						
						
							
							allow take to work with integer index ( #1440 )  
						
						
						
						
					 
					
						2024-09-26 15:58:03 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						195b429d99 
					 
					
						
						
							
							Put along axis + fixes for partition grad ( #1430 )
						
						
						
						
						* put along axis, fixes for partition grad
* zeros for arg reduce 
						
						
					 
					
						2024-09-23 10:03:38 -07:00 
						 
				 
			
				
					
						
							
							
								Nripesh Niketan 
							
						 
					 
					
						
						
							
						
						6af5ca35b2 
					 
					
						
						
							
							feat: add cross_product ( #1252 )  
						
						
						
						
						* feat: add cross_product
* lint
* python binding
* refactor: Improve error message for cross_product function
* refactor: more close to numpy cross product
* refactor: improve error message for cross_product function
* finish
* fix acks
* allow old numpy
* doc
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-09-17 13:12:43 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						914409fef9 
					 
					
						
						
							
							Data parallel helper ( #1407 )  
						
						
						
						
					 
					
						2024-09-16 18:17:21 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d6492b0163 
					 
					
						
						
							
							fix clip ( #1415 )  
						
						
						
						
					 
					
						2024-09-14 16:09:09 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8b30acd7eb 
					 
					
						
						
							
							fix module attribute set, reset, set ( #1403 )  
						
						
						
						
					 
					
						2024-09-11 16:30:42 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						3ae6aabe9f 
					 
					
						
						
							
							throw for certain cases of non captured inputs in compile ( #1401 )  
						
						
						
						
					 
					
						2024-09-09 14:54:31 -07:00 
						 
				 
			
				
					
						
							
							
								Max-Heinrich Laves 
							
						 
					 
					
						
						
							
						
						efeb9c0f02 
					 
					
						
						
							
							Transposed Convolution ( #1245 )  
						
						
						
						
						* initial implementation for conv_transpose
ran pre-commit
implemented conv_transpose
updated conv_general docstring
updated conv_general docstring
updated code comments
removed commented run_conv_checks
updated acknowledgments
added missing entry to ops.rst
added op to nn.layers
resolved merge conflicts
* removed ConvolutionTranspose primitive as suggested by reviewer
removed ConvolutionTranspose primitive as suggested by reviewer
* remove transpose flag, add another test
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-09-06 19:52:38 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ba3e913c7a 
					 
					
						
						
							
							Simplifications for MLX C ( #1396 )  
						
						
						
						
						* simplifications for MLX C
* use vectors instead of map
* update examples 
						
						
					 
					
						2024-09-06 19:16:50 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7cca1727af 
					 
					
						
						
							
							Fix slice data size ( #1394 )  
						
						
						
						
						* fix slice data size and add tests
* fix contiguous flag
* simplify stride and perform copy for non-contiguous arrays
* fix cpu
* comment 
						
						
					 
					
						2024-09-04 19:10:43 -07:00 
						 
				 
			
				
					
						
							
							
								Bhargav Yagnik 
							
						 
					 
					
						
						
							
						
						11371fe251 
					 
					
						
						
							
							Test to prevent bugs like  #1386  ( #1391 )  
						
						
						
						
						* updated test_array for missing ops
* formatting changes 
						
						
					 
					
						2024-09-04 17:24:30 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						969337345f 
					 
					
						
						
							
							Fix reduce edge case ( #1389 )  
						
						
						
						
					 
					
						2024-09-01 21:37:51 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0d302cd25b 
					 
					
						
						
							
							Fix compile with byte sized constants ( #1381 )
						
						
						
						
					 
					
						2024-08-30 17:24:35 -07:00 
						 
				 
			
				
					
						
							
							
								Aditya Dhulipala 
							
						 
					 
					
						
						
							
						
						e6b223df5f 
					 
					
						
						
							
							Pinv ( #875 )  
						
						
						
						
					 
					
						2024-08-27 23:06:12 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						cdb59faea6 
					 
					
						
						
							
							Adds send/recv ops in distributed ( #1366 )  
						
						
						
						
					 
					
						2024-08-26 23:01:37 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						1d94ac3f90 
					 
					
						
						
							
							Add optional headers to `mx.fast.metal_kernel` ( #1358 )  
						
						
						
						
					 
					
						2024-08-26 21:45:45 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						d1183821a7 
					 
					
						
						
							
							int() and float() for mx.array ( #1360 )  
						
						
						
						
					 
					
						2024-08-25 20:41:44 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						8081df79be 
					 
					
						
						
							
							Fix boolean all reduce bug ( #1355 )  
						
						
						
						
					 
					
						2024-08-24 10:09:32 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						b57a52813b 
					 
					
						
						
							
							Further reduction tuning ( #1349 )  
						
						
						
						
						* More reduction tuning
* Forgotten pdb
* Small column long row specialization 
						
						
					 
					
						2024-08-23 10:35:25 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						da8deb2b62 
					 
					
						
						
							
							fix bug with multiple attributes ( #1348 )  
						
						
						
						
						Co-authored-by: Alex Barron <abarron22@apple.com>
						
						
					 
					
						2024-08-23 10:06:15 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						98b6ce3460 
					 
					
						
						
							
							Refactor reductions and fix scatter atomics for large sizes ( #1300 )  
						
						
						
						
						Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
					 
					
						2024-08-22 16:03:31 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						0fd2a1f4b0 
					 
					
						
						
							
							Custom Metal Kernels from Python ( #1325 )  
						
						
						
						
						* start
* simple kernels working
* restructure
* inverse example working
* docs + fixes
* missing file
* fix imports
* address comments
* add docs + fix test
* Review comments + refactor to a single function
* update docs
* remove hashing
* fix contig bug in test
* back to a class
* trailing whitespace
* fix tests
* match c++ and python apis
* add link + make args kw_only 
						
						
					 
					
						2024-08-22 13:46:29 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d40e76809f 
					 
					
						
						
							
							Fix rope ( #1340 )  
						
						
						
						
						* add test
* fix rope
* fix test 
						
						
					 
					
						2024-08-20 17:37:52 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						bb1b76d9dc 
					 
					
						
						
							
							RoPE with frequencies as optional input ( #1337 )  
						
						
						
						
						* start rope with freq input
* rope with frequencies
* nits
* fix bug
* fix bug + test
* cleanup
* optional base 
						
						
					 
					
						2024-08-19 18:30:50 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ae5b5cabfd 
					 
					
						
						
							
							Fix optimizer reloading from checkpoint ( #1329 )  
						
						
						
						
						* fix optimizer reloading from checkpoint
* comment 
						
						
					 
					
						2024-08-15 07:33:23 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						99bb7d3a58 
					 
					
						
						
							
							GPU mx.sign for complex64 ( #1326 )  
						
						
						
						
					 
					
						2024-08-14 07:54:53 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						eaaea02010 
					 
					
						
						
							
							Add isfinite ( #1318 )  
						
						
						
						
						* isfinite
* remove reduce test since fix is not complete 
						
						
					 
					
						2024-08-13 14:49:28 -07:00 
						 
				 
			
				
					
						
							
							
								Bhargav Yagnik 
							
						 
					 
					
						
						
							
						
						a098bc92e0 
					 
					
						
						
							
							Fix: Preserve input dtype in Dropout layer output ( #1323 )  
						
						
						
						
						* Fix: Preserve input dtype in Dropout layer output
- Modified Dropout implementation to ensure that the output dtype matches the input dtype.
- This resolves issue #1321.
* Update test cases in test_nn.py
- Revised test cases to align with updated dropout code
- Fixed assertion method: replaced self.assertTrue with self.assertEqual for accurate comparisons in test_nn.py -> test_rope, test_alibi and test_dropout.
* updated dropout.py 
						
						
					 
					
						2024-08-13 11:54:21 -07:00 
						 
				 
			
				
					
						
							
							
								Brian Keene 
							
						 
					 
					
						
						
							
						
						19fb69e2ed 
					 
					
						
						
							
							Add memory_efficient_threshold kwarg to sdpa kernel ( #1319 )  
						
						
						
						
						Allows opting in to the memory-efficient GPU shader at a prescribed sequence
length. Otherwise, uses aggregate MLX primitives for best latency.
						
						
					 
					
						2024-08-12 12:57:09 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						32668a7317 
					 
					
						
						
							
							CPU mx.linalg.cholesky_inverse and mx.linalg.tri_inv ( #1307 )  
						
						
						
						
						* add cholesky inv + tri inv
* always run tri_inv on cpu
* consistent naming 
						
						
					 
					
						2024-08-08 15:18:02 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						780c197f95 
					 
					
						
						
							
							Fix test tolerance and patch bump ( #1315 )  
						
						
						
						
					 
					
						2024-08-08 14:51:09 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						635ccd9e25 
					 
					
						
						
							
							Add "edge" mode to mx.pad ( #1309 )  
						
						
						
						
						* Add edge padding mode
* fix pad in pooling
* string arg instead of enum 
						
						
					 
					
						2024-08-06 11:23:10 -07:00 
						 
				 
			
				
					
						
							
							
								nicolov 
							
						 
					 
					
						
						
							
						
						8c9f0278b9 
					 
					
						
						
							
							Add vmap to scatter ( #1200 )  
						
						
						
						
						* Add vmap to scatter
* updates
* vmap updates + a few more tests
* bug fix
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-08-05 20:12:27 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						58d0e199e1 
					 
					
						
						
							
							add bfloat conv for winograd ( #1306 )
						
						
						
						
						* add bfloat conv for winograd
* accumulate in fp32
* accumulate in fp32
* accumulate in bf16 
						
						
					 
					
						2024-08-05 15:51:13 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						10b5835501 
					 
					
						
						
							
							fix creating array from bf16 tensors in jax / torch ( #1305 )  
						
						
						
						
					 
					
						2024-08-01 16:20:51 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						c52d1600f0 
					 
					
						
						
							
							Fused Affine Quantize/Dequantize ops ( #1282 )  
						
						
						
						
						* Add fast affine dequantize
* add full quantize kernel
* fused kernel with scale/bias computation
* fix docstring
* fix no jit error
* fix test
* test fix
* reduce fast api to only affine_quantize 
						
						
					 
					
						2024-07-29 15:11:38 -07:00 
						 
				 
			
				
					
						
							
							
								Atakan Tekparmak 
							
						 
					 
					
						
						
							
						
						6e06e3a904 
					 
					
						
						
							
							feat: Added "tanh" option to GELU approximation ( #1268 )  
						
						
						
						
					 
					
						2024-07-28 09:07:56 +02:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7b456fd2c0 
					 
					
						
						
							
							Array api ( #1289 )  
						
						
						
						
						* some updates for numpy 2.0 and array api
* some updates for numpy 2.0 and array api
* fix array api doc 
						
						
					 
					
						2024-07-26 10:40:49 -07:00 
						 
				 
			
				
					
						
							
							
								Anton Belov 
							
						 
					 
					
						
						
							
						
						5029894662 
					 
					
						
						
							
							[Issue  #1187 ] Add nan_to_num function initial attempt ( #1247 )  
						
						
						
						
						* initial attempt, working with wrong types
* not compiling; mx.float16 and mx.bfloat16 tests added
* fix nan to num
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-07-25 09:57:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						baf9fa5f42 
					 
					
						
						
							
							Einsum ( #1269 )  
						
						
						
						
						* einsum initial
* fix comma break
* sum axis was wrong
* small cleanups
* python binding
* changed bindings to resemble numpy
* remove todo comment
* comment changes
* add count of operands/inputs
* fail fast if operands list is empty
* ignore comma if no output
* einsum path matching numpy
* getting somewhere with path
* remove print
* it passes the first test
* moved einsum tests to separate file
* separated einsum path
* moved einsum naive
* remove space from equation
* fast fail if no operands passed
* update tests and remove printf
* small cleanup
* some more cleanups
* removed python helper file
* ack
* utilize std for finding min in vector
* duplicate def
* remove the tuple as it was unreadable
* moved einsum_naive back to ops
* remaining isn't needed
* avoid creating another set
* cleanup
* greedy path, start of naive einsum
* more einsum
* fix some bugs
* some more fixes, tests pass
* benchmark
* some simplify
* fix einsum and test
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
* add a bunch more tests and fix a bunch more bugs
* some docs nits
---------
Co-authored-by: dc-dc-dc <dgcruz983@gmail.com>
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
					 
					
						2024-07-25 09:36:44 -07:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						7f914365fd 
					 
					
						
						
							
							Fix GPU sort for large arrays ( #1285 )  
						
						
						
						
						* Fix GPU sort for large arrays 
						
						
					 
					
						2024-07-24 14:37:10 -07:00 
						 
				 
			
				
					
						
							
							
								Paul Paczuski 
							
						 
					 
					
						
						
							
						
						ebd7135b50 
					 
					
						
						
							
							Improve stability of BCE loss calculation for input probabilities close to or exactly 0 or 1 ( #1280 )  
						
						
						
						
						* Improve stability of BCE loss calculation
* Standardize comment
* Apply formatting with black via pre-commit
* Add usage recommendation to docstring
* Update python/mlx/nn/losses.py
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
						
						
					 
					
						2024-07-24 08:38:22 -07:00