Alex Barron | 3507c104a5 | add test | 2024-12-06 00:45:01 -08:00
Alex Barron | 12a4d89a7c | working qsdpa | 2024-12-06 00:21:05 -08:00
Awni Hannun | e047fd977d | compile changes if stream changes (#1644) | 2024-12-03 14:37:44 -08:00
Jagrit Digani | 9d40e521d7 | Stop matrix copies with new attention kernel (#1639) | 2024-12-02 14:12:38 -08:00
Alex Barron | 1445dcaa60 | let class predicate specify quantization parameters (#1638) | 2024-12-02 14:09:28 -08:00
Jesper Stemann Andersen | e4eeb4e910 | Added missing unordered_map includes (#1635) | 2024-12-02 07:03:03 -08:00
  * Added missing includes in mlx/io.h and mlx/backend/metal/metal.h
  * Added additional missing unordered_map includes that fixes build on FreeBSD
Awni Hannun | aa86876813 | fix transformer decoder post norm LN (#1637) | 2024-12-02 07:02:17 -08:00
Jesper Stemann Andersen | 974bb54ab2 | CMake: Enabled using Accelerate on x86_64 / x64 (#1625) | 2024-11-28 10:55:45 -08:00
  * CMake: Enabled using Accelerate on x86_64 / x64
  Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761
  * CMake: Removed superfluous MLX_BUILD_ARM
Ikko Eltociear Ashimine | 9bc2183a31 | docs: update device.cpp (#1632) | 2024-11-27 20:58:26 -08:00
  unecessary -> unnecessary
Awni Hannun | d4b222b6d3 | Fix some leaks and races (#1629) | 2024-11-27 20:01:20 -08:00
  * fix leak and fix potential race
  * more leak fixes
  * fix one more
Jesper Stemann Andersen | af2af818a6 | Enables build for *-linux-musl (#1627) | 2024-11-27 13:14:24 -08:00
  Also contributes to being able to build for *-w64-mingw32.
  Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761
Jesper Stemann Andersen | 698e63a608 | CMake: Build with dlfcn-win32 to have dlopen etc. on win32 (#1628) | 2024-11-27 13:14:13 -08:00
  Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761
Awni Hannun | 211411faf2 | fix large ops (#1620) | 2024-11-24 09:17:10 -08:00
Awni Hannun | bb303c45a5 | version (#1617) | 2024-11-22 12:00:03 -08:00
  v0.21.0
Alex Barron | 6f7986d592 | Cleaner qmv/qvm (#1616) | 2024-11-22 11:14:08 -08:00
Awni Hannun | 7cbb4aef17 | Doc fix (#1615) | 2024-11-22 11:12:25 -08:00
Jagrit Digani | 02bec0bb6d | Matrix Attention kernel (#1610) | 2024-11-22 10:34:05 -08:00
  * Rough INIT
  * [WIP]: Loading and Matmuls added
  * [WIP]: Reductions and min working aligned kernel at headdim = 64
  * [WIP] Added headdim 80 for testing
  * [WIP] Update dispatch params for testing
  * [WIP] Add support for unaligned seq lengths - still looks messy
  * Update sdpa_benchmarks
  * Update sdpa_benchmarks
  * Update sdpa_benchmarks
  * Enable gqa support
  * Update benchmark and switch off 128 headdim
  * Update headdim 128 tuning
  * Remove older fast attention code. Write out O strided
  * Disable hd=128 until further optimizations
  * Enable bf16
  * Fix data size bug
  * Enable attn build outside of jit
Alex Barron | c79f6a4a8c | 3 and 6 bit quantization (#1613) | 2024-11-22 10:22:13 -08:00
  * Support 3 and 6 bit quantization
Awni Hannun | 0c5eea226b | Reduce specializations (#1607) | 2024-11-21 19:53:00 -08:00
  * start of reduce specializations
  * fix all reduce
  * fix many dims
  * fix
  * non-jit tests clear
  * cleanup instantiations
  * cpu merges
  * change dim specializations
  * optimize
  * fix jit
  * fix jit
  * use higher precision for integer sum+prod
  * fixes
Awni Hannun | dcca0d7477 | contiguous op / prim (#1612) | 2024-11-21 19:51:49 -08:00
Cocoa | 0d5e7716ad | fix typo: accross -> across (#1609) | 2024-11-20 15:30:51 -08:00
  Signed-off-by: Cocoa <i@uwucocoa.moe>
Angelos Katharopoulos | d8c824c594 | Formatting fixes (#1606) | 2024-11-20 15:30:36 -08:00
Saanidhya | cb431dfc9f | Adds 3D pooling (#1526) | 2024-11-19 16:45:24 -08:00
Awni Hannun | 61d787726a | Fix view scalar bug segfault (#1603) | 2024-11-19 10:54:05 -08:00
  * fix view scalar bug
  * fix view scalar bug
  * one more fix
Angelos Katharopoulos | 5e89aace9b | Fix concatenate vmap (#1600) | 2024-11-19 10:44:04 -08:00
Awni Hannun | 2af7e8a9a6 | fix cmake version (#1601) | 2024-11-19 08:45:05 -08:00
Awni Hannun | 2419edd5b2 | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00
  * wip: faster compiled kernels
  * faster general unary with uint specialization
  * index type in compiled, unary, binary, ternary, copy
  * fix jit
  * jit fix
  * specialize gather + scatter
  * nit in docs
Awni Hannun | bf481e8e5d | Fix sibling leak (#1590) | 2024-11-18 19:17:01 -08:00
  * add test
  * fix + test
  * fix fix
Awni Hannun | 9d7fa6b8e6 | Use osx deployment target to pick Metal version (#1595) | 2024-11-18 19:16:49 -08:00
  * choose metal based on deployment target rather than system version
  * nit
  * unused compile def
Angelos Katharopoulos | 073076ac7d | 2-Pass Sdpa Inference Kernel (#1597) | 2024-11-18 17:31:53 -08:00
Awni Hannun | 9bd03dd9b4 | More buffer donation with no-ops (#1591) | 2024-11-18 08:35:41 -08:00
  * more donation
  * fix test
  * fix build
Awni Hannun | 6931f84412 | fix dispatch threads for a few kernels (#1594) | 2024-11-18 08:35:25 -08:00
xnorai | 16ec0556a0 | Allocate raw JSON metadata buffer on the heap, and limit its size (#1596) | 2024-11-18 07:22:51 -08:00
  * Allocate raw JSON metadata buffer on the heap, and limit its size to 1GiB
  * Set the upper size limit for the header to 100K as in Rust safetensors
Awni Hannun | 610af352d4 | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00
  * Dispatch bf16 at run time when using the JIT
  * fix extension
  * fix extension build
  * fix extension build
  * Update utils.h
Awni Hannun | b35f1e3c9c | fix donation in sdpa (#1587) | 2024-11-13 17:21:13 -08:00
Awni Hannun | dfa0b9aab4 | Cpu fast quantize (#1578) | 2024-11-08 20:10:39 -08:00
  * cpu quantize
  * fix
Alex Barron | a4c47b0276 | OOB QMV fix (#1579) | 2024-11-08 17:59:45 -08:00
  * fix oob access in qmv
  * skip more
  * fix small case
Alex Barron | 111fefd5e9 | Fix OOB access in qmv (#1577) | 2024-11-08 15:41:30 -08:00
  * fix oob access in qmv
  * skip more
Awni Hannun | c1fe1ef081 | Bfs width limit (#1568) | 2024-11-08 15:00:46 -08:00
  * width limit
  * fix
  * large limit
  * put env vars in env namespace
Awni Hannun | 8c34c9dac4 | throw for invalid case and remove test (#1575) | 2024-11-08 12:04:03 -08:00
Awni Hannun | 91c0277356 | fix per-example mask + docs in sdpa (#1574) | 2024-11-08 11:51:15 -08:00
Awni Hannun | 9f0d5c12fc | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00
  * fully wrap the command encoder
  * use consistent style + fix extensions
Awni Hannun | 59247c2b62 | add groups in conv2d (#1569) | 2024-11-07 13:57:53 -08:00
Awni Hannun | 9a3842a2d9 | fix (#1566) | 2024-11-06 17:10:33 -08:00
Alex Barron | 726dbd9267 | v0.20.0 (#1565) | 2024-11-05 12:37:57 -08:00
  v0.20.0
Awni Hannun | 54f05e7195 | Fix gather vmap (#1563) | 2024-11-05 11:29:20 -08:00
  * fix gather
  * fix
Alex Barron | 26be608470 | Add split_k qvm for long context (#1564) | 2024-11-05 11:25:19 -08:00
  * Add splitk qvm
  * configurable splitk
  * tuning
  * remove extra instantiation
  * remove refactor
  * separate test
  * cpu tolerance
Angelos Katharopoulos | 248431eb3c | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00
Awni Hannun | 76f275b4df | error in rms for wrong size (#1562) | 2024-11-04 13:24:02 -08:00
Awni Hannun | f1951d6cce | Use fewer barriers (#1561) | 2024-11-04 10:26:49 -08:00
  * use fewer barriers
  * comment