Awni Hannun 
							
						 
					 
					
						
						
							
						
						da5912e4f2 
					 
					
						
						
							
							fix custom metal extension ( #2446 )  
						
						
						
						
					 
					
						2025-07-31 06:25:36 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						2204182bba 
					 
					
						
						
							
							Make CI faster ( #2440 )  
						
						
						
						
					 
					
						2025-07-30 02:26:36 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						970dbe8e25 
					 
					
						
						
							
							Use ccache in CI ( #2414 )  
						
						
						
						
						* Detect ccache
* Use ccache in CI
* Separate cache for different images
* Test both 12.2 and 12.9 for PRs 
						
						
					 
					
						2025-07-29 08:43:22 +09:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						6f5874a2f2 
					 
					
						
						
							
							[CUDA] Initial implementation of Convolution with cuDNN ( #2385 )  
						
						
						
						
						* Link with cuDNN
* Initial implementation
* Remove backend apis
* Fix recording cudnn conv
* More unused backend apis
* Fix C++ conv tests
* include cudnn as python dep
* Install libcudnn9-dev-cuda-12 in CI
* cudnn only accepts contiguous inputs
* Switch to backend apis
* Plan needs to be kept alive
* Turn off tf32
* Add cache
* Test the native cuda graph api
* Set cudnn stream before execution
* Make LRUCache more like a normal container
* Do error check for cublas handle
* Zero-initializing array
* Use tf32 for conv
* Skip TestConv.test_torch_conv_2D test
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-07-25 08:12:10 +09:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						70dc336785 
					 
					
						
						
							
							Test on cuda 12.2 and 12.9 ( #2413 )  
						
						
						
						
					 
					
						2025-07-24 06:06:15 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d1f4d291e8 
					 
					
						
						
							
							Fix uv install and add dev release ( #2411 )  
						
						
						
						
						* fix uv install and add dev release
* fix docstring
* pin cuda deps
* cuda release on cpu-only machine 
						
						
					 
					
						2025-07-23 16:54:19 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						63f663d9c6 
					 
					
						
						
							
							fix cuda manylinux version to match others ( #2388 )  
						
						
						
						
					 
					
						2025-07-18 21:02:16 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						84b4d96efa 
					 
					
						
						
							
							fix release build + patch bump ( #2387 )  
						
						
						
						
					 
					
						2025-07-18 14:47:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b2273733ea 
					 
					
						
						
							
							Test with CUDA 12.2 ( #2375 )  
						
						
						
						
						* Test with CUDA 12.0
* try older image
* fix cpu sort 
						
						
					 
					
						2025-07-16 13:00:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f409b229a4 
					 
					
						
						
							
							fix ring distributed test ( #2380 )  
						
						
						
						
					 
					
						2025-07-16 11:25:24 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f0a0b077a0 
					 
					
						
						
							
							Install linux with mlx[cuda] and mlx[cpu] ( #2356 )  
						
						
						
						
						* install linux with mlx[cuda] and mlx[cpu]
* temp for testing
* cleanup circle, fix cuda repair
* update circle
* update circle
* decouple python bindings from core libraries 
						
						
					 
					
						2025-07-14 17:17:33 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e569803d7c 
					 
					
						
						
							
							update linux build ( #2370 )  
						
						
						
						
					 
					
						2025-07-14 15:13:56 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						a4fcc893cd 
					 
					
						
						
							
							auto build linux release ( #2341 )  
						
						
						
						
					 
					
						2025-07-07 09:29:23 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						19facd4b20 
					 
					
						
						
							
							Build with all cpu cores by default ( #2336 )  
						
						
						
						
					 
					
						2025-07-07 06:06:45 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						76831ed83d 
					 
					
						
						
							
							Build CUDA release in Circle ( #2306 )  
						
						
						
						
						* cuda release
* add license 
						
						
					 
					
						2025-06-19 15:26:36 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4fda5fbdf9 
					 
					
						
						
							
							add python testing for cuda with ability to skip list of tests ( #2295 )  
						
						
						
						
					 
					
						2025-06-15 10:56:48 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c35f4d089a 
					 
					
						
						
							
							start cuda circle config ( #2256 )  
						
						
						
						
						* rebase
* fix metal kernel linking issue on cuda
* start cuda circle config 
						
						
					 
					
						2025-06-10 21:19:47 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7275ac7523 
					 
					
						
						
							
							Fix release build ( #2072 )  
						
						
						
						
					 
					
						2025-04-12 20:41:58 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9c6953bda7 
					 
					
						
						
							
							Fix stubgen ( #2065 )  
						
						
						
						
						* Fix stubgen
* add multi optim to docs 
						
						
					 
					
						2025-04-11 12:02:54 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						7b3b8fa000 
					 
					
						
						
							
							Fix ci release ( #2045 )  
						
						
						
						
					 
					
						2025-04-04 20:25:01 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						13b26775f1 
					 
					
						
						
							
							use minimum deployment target ( #2016 )  
						
						
						
						
					 
					
						2025-03-28 14:31:53 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						d343782c8b 
					 
					
						
						
							
							Cross platform libmpi loading ( #1975 )  
						
						
						
						
					 
					
						2025-03-21 11:23:10 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d2a94f9e6a 
					 
					
						
						
							
							Only compile warnings as errors for circle ( #1957 )  
						
						
						
						
					 
					
						2025-03-12 13:08:19 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ded914f442 
					 
					
						
						
							
							Small distributed launch helper ( #1810 )  
						
						
						
						
					 
					
						2025-01-29 17:55:04 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ccb61d7aae 
					 
					
						
						
							
							Ring distributed backend ( #1784 )  
						
						
						
						
					 
					
						2025-01-27 22:15:01 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f110357aaa 
					 
					
						
						
							
							Bump nanobind to 2.4 + fix ( #1710 )  
						
						
						
						
						* bump nanobind to 2.4 + fix
* fix 
						
						
					 
					
						2024-12-17 10:57:54 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9bd3a7102f 
					 
					
						
						
							
							add python 3.13 to circle ( #1553 )  
						
						
						
						
					 
					
						2024-11-01 20:55:35 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						343aa46b78 
					 
					
						
						
							
							No more 3.8 ( #1493 )  
						
						
						
						
					 
					
						2024-10-16 17:51:38 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b8ab89b413 
					 
					
						
						
							
							Docs in ci ( #1491 )  
						
						
						
						
						* docs in circle 
						
						
					 
					
						2024-10-15 17:40:00 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0eef4febfd 
					 
					
						
						
							
							bump mac tests to use py39 ( #1485 )  
						
						
						
						
					 
					
						2024-10-14 10:40:32 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b54a70ec2d 
					 
					
						
						
							
							Make push button linux distribution ( #1476 )  
						
						
						
						
						* try again
* try again
* try again
* try again
* try again
* try again
* try again
* try again
* .circleci/config.yml
* one more fix
* nit 
						
						
					 
					
						2024-10-14 06:21:44 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f374b6ca4d 
					 
					
						
						
							
							Bump nanobind to 2.2 ( #1461 )  
						
						
						
						
						* bump nanobind
* extension version for tests 
						
						
					 
					
						2024-10-07 16:52:40 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						fef3c4ec1d 
					 
					
						
						
							
							Fix mpi test in CI ( #1456 )  
						
						
						
						
						* Fix mpi test in CI
* Set bind to none 
						
						
					 
					
						2024-10-06 06:09:17 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						02efb310ca 
					 
					
						
						
							
							Xcode 160 ( #1384 )  
						
						
						
						
						* xcode 16.0 with debug tests
* limit nproc for builds
* vmap bug
* assert bug
* run python tests in debug mode
* fix view, bool copies preserve bits
* actual view fix 
						
						
					 
					
						2024-09-10 15:15:17 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						28be4de7c2 
					 
					
						
						
							
							Fix JIT reductions ( #1373 )  
						
						
						
						
					 
					
						2024-08-28 16:39:11 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f9e00efe31 
					 
					
						
						
							
							fix nanobind and stub gen in circle ( #1346 )  
						
						
						
						
					 
					
						2024-08-22 14:07:27 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						709ccc6800 
					 
					
						
						
							
							install mpi for release build ( #1199 )  
						
						
						
						
					 
					
						2024-06-10 10:09:32 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						50dfb664db 
					 
					
						
						
							
							Comms ( #1097 )  
						
						
						
						
						* Start the communications branch using MPI
* Add ops and primitives
* Add python bindings for distributed 
						
						
					 
					
						2024-05-23 17:04:02 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						226748b3e7 
					 
					
						
						
							
							JIT compile option for binary minimization ( #1091 )  
						
						
						
						
						* try cpp 20 for compile
* unary, binary, ternary in jit
* nits
* fix gather/scatter
* fix rebase
* reorg compile
* add ternary to compile
* jit copy
* jit compile flag
* fix build
* use linked function for ternary
* some nits
* docs + circle min size build
* docs + circle min size build
* fix extension
* fix no cpu build
* improve includes 
						
						
					 
					
						2024-05-22 12:57:13 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						23406c9e9e 
					 
					
						
						
							
							Choose the right MLX bf16 for extensions ( #1135 )  
						
						
						
						
						* default to custom bf
* choose right bf
* fix extensions
* fix circle conf 
						
						
					 
					
						2024-05-17 15:09:28 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8b76571896 
					 
					
						
						
							
							Fix extensions ( #1126 )  
						
						
						
						
						* fix extensions
* title
* enable circle
* fix nanobind tag
* fix bug in doc
* try to fix config
* typo 
						
						
					 
					
						2024-05-16 15:36:25 -07:00 
						 
				 
			
				
					
						
							
							
								Mike Drob 
							
						 
					 
					
						
						
							
						
						2263e4b279 
					 
					
						
						
							
							Experiment with medium machines for CI ( #1000 )  
						
						
						
						
					 
					
						2024-05-13 19:40:19 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						ed83908931 
					 
					
						
						
							
							fix gguf loading quants ( #1014 )  
						
						
						
						
						* fix gguf loading quants
* fix nanobind install
* actual fix 
						
						
					 
					
						2024-04-19 12:24:07 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						1e16331d9c 
					 
					
						
						
							
							post nanobind docs fixes and some updates ( #889 )  
						
						
						
						
						* post nanobind docs fixes and some updates
* one more doc nit
* fix for stubs and latex 
						
						
					 
					
						2024-03-24 15:03:27 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9a8ee00246 
					 
					
						
						
							
							Switch to nanobind ( #839 )  
						
						
						
						
						* mostly builds
* most tests pass
* fix circle build
* add back buffer protocol
* includes
* fix for py38
* limit to cpu device
* include
* fix stubs
* move signatures for docs
* stubgen + docs fix
* doc for compiled function, comments 
						
						
					 
					
						2024-03-18 20:12:25 -07:00 
						 
				 
			
				
					
						
							
							
								Justin Deschenaux 
							
						 
					 
					
						
						
							
						
						8e5600022a 
					 
					
						
						
							
							Implement RNN, GRU, LSTM ( #268 )  
						
						
						
						
						* RNN base implementation
* Address comments+format
* nits in docs
* add tests for prb
* fix test
* add a couple tests
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2024-03-11 21:14:44 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f512b905c7 
					 
					
						
						
							
							Minimum xcode / sdk ( #800 )  
						
						
						
						
						* minimum xcode /sdk
* try multiple xcode versions in CI
* update python
* metal validation for python tests 
						
						
					 
					
						2024-03-07 08:19:43 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ad4a45e615 
					 
					
						
						
							
							Fix the release builds in CI ( #729 )  
						
						
						
						
					 
					
						2024-02-22 14:09:13 -08:00 
						 
				 
			
				
					
						
							
							
								Mike Drob 
							
						 
					 
					
						
						
							
						
						165abf0e4c 
					 
					
						
						
							
							Auto-run PRs from contributors ( #692 )  
						
						
						
						
					 
					
						2024-02-15 17:30:35 -08:00 
						 
				 
			
				
					
						
							
							
								Jagrit Digani 
							
						 
					 
					
						
						
							
						
						1a48713d32 
					 
					
						
						
							
							Update gather and scatter to not use Argument Encoder ( #683 )  
						
						
						
						
						* Replace argument encoder usage for gather and scatter
* Use constant address space for shapes and strides
* Split gather and scatter to improve compile times
* Enable the GPU tests
* Update the CI config
* Fix scatter dispatch for scalar indices
* Remove arg encoder utils
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
						
						
					 
					
						2024-02-14 13:42:13 -08:00