Cheng 
							
						 
					 
					
						
						
							
						
						c1e3340b23 
					 
					
						
						
							
							Set ccache size before building ( #2570 )  
						
						
						
						
					 
					
						2025-09-07 09:00:31 +09:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						827003d568 
					 
					
						
						
							
							fix METAL quantization in JIT ( #2553 )  
						
						
						
						
					 
					
						2025-08-28 18:26:25 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d363a76aa4 
					 
					
						
						
							
							Bump xcode in circle ( #2551 )  
						
						
						
						
						* bump xcode in circle
* bump xcode in circle
* bump xcode in circle 
						
						
					 
					
						2025-08-28 13:13:34 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						a9bac3d9e5 
					 
					
						
						
							
							Run CPP tests for CUDA build in CI ( #2544 )  
						
						
						
						
					 
					
						2025-08-27 08:06:46 +09:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						2ca75bb529 
					 
					
						
						
							
							Remove nccl install in release ( #2542 )  
						
						
						
						
					 
					
						2025-08-25 15:20:18 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d2f540f4e0 
					 
					
						
						
							
							Use nccl header only when nccl is not present ( #2539 )  
						
						
						
						
						* use nccl header only when nccl is not present
* larger machine for cuda build 
						
						
					 
					
						2025-08-25 14:17:25 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						f55b6f1f2f 
					 
					
						
						
							
							Enable COMPILE_WARNING_AS_ERROR for linux builds in CI ( #2534 )  
						
						
						
						
					 
					
						2025-08-24 15:33:08 +09:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						068a4612e9 
					 
					
						
						
							
							nccl default for backend=any ( #2528 )  
						
						
						
						
						* nccl default for backend=any
* check num gpus + ensure row contiguous for all reduce
* comment 
						
						
					 
					
						2025-08-22 12:24:27 -07:00 
						 
				 
			
				
					
						
							
							
								Anastasiia Filippova 
							
						 
					 
					
						
						
							
						
						9392fc3f88 
					 
					
						
						
							
							NCCL backend ( #2476 )  
						
						
						
						
					 
					
						2025-08-21 11:56:15 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						da5912e4f2 
					 
					
						
						
							
							fix custom metal extension ( #2446 )  
						
						
						
						
					 
					
						2025-07-31 06:25:36 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						2204182bba 
					 
					
						
						
							
							Make CI faster ( #2440 )  
						
						
						
						
					 
					
						2025-07-30 02:26:36 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						970dbe8e25 
					 
					
						
						
							
							Use ccache in CI ( #2414 )  
						
						
						
						
						* Detect ccache
* Use ccache in CI
* Separate cache for different images
* Test both 12.2 and 12.9 for PRs 
						
						
					 
					
						2025-07-29 08:43:22 +09:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						6f5874a2f2 
					 
					
						
						
							
							[CUDA] Initial implementation of Convolution with cuDNN ( #2385 )  
						
						
						
						
						* Link with cuDNN
* Initial implementation
* Remove backend apis
* Fix recording cudnn conv
* More unused backend apis
* Fix C++ conv tests
* include cudnn as python dep
* Install libcudnn9-dev-cuda-12 in CI
* cudnn only accepts contiguous inputs
* Switch to backend apis
* Plan needs to be kept alive
* Turn off tf32
* Add cache
* Test the native cuda graph api
* Set cudnn stream before execution
* Make LRUCache more like a normal container
* Do error check for cublas handle
* Zero-initializing array
* Use tf32 for conv
* Skip TestConv.test_torch_conv_2D test
---------
Co-authored-by: Awni Hannun <awni@apple.com>
						
						
					 
					
						2025-07-25 08:12:10 +09:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						70dc336785 
					 
					
						
						
							
							Test on cuda 12.2 and 12.9 ( #2413 )  
						
						
						
						
					 
					
						2025-07-24 06:06:15 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d1f4d291e8 
					 
					
						
						
							
							Fix uv install and add dev release ( #2411 )  
						
						
						
						
						* fix uv install and add dev release
* fix docstring
* pin cuda deps
* cuda release on cpu-only machine 
						
						
					 
					
						2025-07-23 16:54:19 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						63f663d9c6 
					 
					
						
						
							
							fix cuda manylinux version to match others ( #2388 )  
						
						
						
						
					 
					
						2025-07-18 21:02:16 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						84b4d96efa 
					 
					
						
						
							
							fix release build + patch bump ( #2387 )  
						
						
						
						
					 
					
						2025-07-18 14:47:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b2273733ea 
					 
					
						
						
							
							Test with CUDA 12.2 ( #2375 )  
						
						
						
						
						* Test with CUDA 12.0
* try older image
* fix cpu sort 
						
						
					 
					
						2025-07-16 13:00:37 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f409b229a4 
					 
					
						
						
							
							fix ring distributed test ( #2380 )  
						
						
						
						
					 
					
						2025-07-16 11:25:24 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f0a0b077a0 
					 
					
						
						
							
							Install linux with mlx[cuda] and mlx[cpu] ( #2356 )  
						
						
						
						
						* install linux with mlx[cuda] and mlx[cpu]
* temp for testing
* cleanup circle, fix cuda repair
* update circle
* update circle
* decouple python bindings from core libraries 
						
						
					 
					
						2025-07-14 17:17:33 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						e569803d7c 
					 
					
						
						
							
							update linux build ( #2370 )  
						
						
						
						
					 
					
						2025-07-14 15:13:56 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						a4fcc893cd 
					 
					
						
						
							
							auto build linux release ( #2341 )  
						
						
						
						
					 
					
						2025-07-07 09:29:23 -07:00 
						 
				 
			
				
					
						
							
							
								Cheng 
							
						 
					 
					
						
						
							
						
						19facd4b20 
					 
					
						
						
							
							Build with all cpu cores by default ( #2336 )  
						
						
						
						
					 
					
						2025-07-07 06:06:45 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						76831ed83d 
					 
					
						
						
							
							Build CUDA release in Circle ( #2306 )  
						
						
						
						
						* cuda release
* add license 
						
						
					 
					
						2025-06-19 15:26:36 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						4fda5fbdf9 
					 
					
						
						
							
							add python testing for cuda with ability to skip list of tests ( #2295 )  
						
						
						
						
					 
					
						2025-06-15 10:56:48 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						c35f4d089a 
					 
					
						
						
							
							start cuda circle config ( #2256 )  
						
						
						
						
						* rebase
* fix metal kernel linking issue on cuda
* start cuda circle config 
						
						
					 
					
						2025-06-10 21:19:47 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						7275ac7523 
					 
					
						
						
							
							Fix release build ( #2072 )  
						
						
						
						
					 
					
						2025-04-12 20:41:58 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9c6953bda7 
					 
					
						
						
							
							Fix stubgen ( #2065 )  
						
						
						
						
						* Fix stubgen
* add multi optim to docs 
						
						
					 
					
						2025-04-11 12:02:54 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						7b3b8fa000 
					 
					
						
						
							
							Fix ci release ( #2045 )  
						
						
						
						
					 
					
						2025-04-04 20:25:01 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						13b26775f1 
					 
					
						
						
							
							use minimum deployment target ( #2016 )  
						
						
						
						
					 
					
						2025-03-28 14:31:53 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						d343782c8b 
					 
					
						
						
							
							Cross platform libmpi loading ( #1975 )  
						
						
						
						
					 
					
						2025-03-21 11:23:10 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						d2a94f9e6a 
					 
					
						
						
							
							Only compile warnings as errors for circle ( #1957 )  
						
						
						
						
					 
					
						2025-03-12 13:08:19 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ded914f442 
					 
					
						
						
							
							Small distributed launch helper ( #1810 )  
						
						
						
						
					 
					
						2025-01-29 17:55:04 -08:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						ccb61d7aae 
					 
					
						
						
							
							Ring distributed backend ( #1784 )  
						
						
						
						
					 
					
						2025-01-27 22:15:01 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f110357aaa 
					 
					
						
						
							
							Bump nanobind to 2.4 + fix ( #1710 )  
						
						
						
						
						* bump nanobind to 2.4 + fix
* fix 
						
						
					 
					
						2024-12-17 10:57:54 -08:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						9bd3a7102f 
					 
					
						
						
							
							add python 3.13 to circle ( #1553 )  
						
						
						
						
					 
					
						2024-11-01 20:55:35 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						343aa46b78 
					 
					
						
						
							
							No more 3.8 ( #1493 )  
						
						
						
						
					 
					
						2024-10-16 17:51:38 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b8ab89b413 
					 
					
						
						
							
							Docs in ci ( #1491 )  
						
						
						
						
						* docs in circle 
						
						
					 
					
						2024-10-15 17:40:00 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						0eef4febfd 
					 
					
						
						
							
							bump mac tests to use py39 ( #1485 )  
						
						
						
						
					 
					
						2024-10-14 10:40:32 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						b54a70ec2d 
					 
					
						
						
							
							Make push button linux distribution ( #1476 )  
						
						
						
						
						* try again
* try again
* try again
* try again
* try again
* try again
* try again
* try again
* .circleci/config.yml
* one more fix
* nit 
						
						
					 
					
						2024-10-14 06:21:44 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f374b6ca4d 
					 
					
						
						
							
							Bump nanobind to 2.2 ( #1461 )  
						
						
						
						
						* bump nanobind
* extension version for tests 
						
						
					 
					
						2024-10-07 16:52:40 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						fef3c4ec1d 
					 
					
						
						
							
							Fix mpi test in CI ( #1456 )  
						
						
						
						
						* Fix mpi test in CI
* Set bind to none 
						
						
					 
					
						2024-10-06 06:09:17 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						02efb310ca 
					 
					
						
						
							
							Xcode 160 ( #1384 )  
						
						
						
						
						* xcode 16.0 with debug tests
* limit nproc for builds
* vmap bug
* assert bug
* run python tests in debug mode
* fix view, bool copies preserve bits
* actual view fix 
						
						
					 
					
						2024-09-10 15:15:17 -07:00 
						 
				 
			
				
					
						
							
							
								Alex Barron 
							
						 
					 
					
						
						
							
						
						28be4de7c2 
					 
					
						
						
							
							Fix JIT reductions ( #1373 )  
						
						
						
						
					 
					
						2024-08-28 16:39:11 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						f9e00efe31 
					 
					
						
						
							
							fix nanobind and stub gen in circle ( #1346 )  
						
						
						
						
					 
					
						2024-08-22 14:07:27 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						709ccc6800 
					 
					
						
						
							
							install mpi for release build ( #1199 )  
						
						
						
						
					 
					
						2024-06-10 10:09:32 -07:00 
						 
				 
			
				
					
						
							
							
								Angelos Katharopoulos 
							
						 
					 
					
						
						
							
						
						50dfb664db 
					 
					
						
						
							
							Comms ( #1097 )  
						
						
						
						
						* Start the communications branch using MPI
* Add ops and primitives
* Add python bindings for distributed 
						
						
					 
					
						2024-05-23 17:04:02 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						226748b3e7 
					 
					
						
						
							
							JIT compile option for binary minimization ( #1091 )  
						
						
						
						
						* try cpp 20 for compile
* unary, binary, ternary in jit
* nits
* fix gather/scatter
* fix rebase
* reorg compile
* add ternary to compile
* jit copy
* jit compile flag
* fix build
* use linked function for ternary
* some nits
* docs + circle min size build
* docs + circle min size build
* fix extension
* fix no cpu build
* improve includes 
						
						
					 
					
						2024-05-22 12:57:13 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						23406c9e9e 
					 
					
						
						
							
							Choose the right MLX bf16 for extensions ( #1135 )  
						
						
						
						
						* default to custom bf
* choose right bf
* fix extensions
* fix circle conf 
						
						
					 
					
						2024-05-17 15:09:28 -07:00 
						 
				 
			
				
					
						
							
							
								Awni Hannun 
							
						 
					 
					
						
						
							
						
						8b76571896 
					 
					
						
						
							
							Fix extensions ( #1126 )  
						
						
						
						
						* fix extensions
* title
* enable circle
* fix nanobind tag
* fix bug in doc
* try to fix config
* typo 
						
						
					 
					
						2024-05-16 15:36:25 -07:00