Alex Barron
d15fa13daf
Batched Quantized Matmul + Fast Small QMV (#1503)
* add fast qmv for small dims
* fix test
* batched cpu
* add batched template param
* refactor metal quantized.cpp
2024-10-21 16:23:17 -07:00
								 
Alex Barron
c52d1600f0
Fused Affine Quantize/Dequantize ops (#1282)
* Add fast affine dequantize
* add full quantize kernel
* fused kernel with scale/bias computation
* fix docstring
* fix no jit error
* fix test
* test fix
* reduce fast api to only affine_quantize
2024-07-29 15:11:38 -07:00
								 
Awni Hannun
d568c7ee36
Rename block sparse (#1149)
* block_sparse_mm to gather_mm
* rename
* nit
* nit
2024-05-22 07:48:34 -07:00
								 
Angelos Katharopoulos
e78a6518fa
Block sparse qmm (#1124)
2024-05-16 15:24:14 -07:00
								 
Angelos Katharopoulos
17f57df797
Improvements in the quantizer and dequantization kernel (#1061)
2024-05-01 18:19:11 -07:00
								 
Angelos Katharopoulos
8db7161c94
Bug fix in quantize (#1054)
2024-04-29 20:55:04 -07:00
								 
Angelos Katharopoulos
ec8578d41a
Fix quantization of all 0s (#1028)
2024-04-24 00:40:42 -07:00
								 
Angelos Katharopoulos
84d61d27aa
Make sure 0 is represented in the quantization (#1016)
2024-04-19 19:47:26 -07:00
								 
Awni Hannun
039da779d1
No quant reshape (#957)
* precise option on cpu
* remove print
* remove reshape in quant matmul
* no quant reshape
2024-04-04 11:52:12 -07:00
								 
Angelos Katharopoulos
5f9ba3019f
Fix qmm_t for unaligned cases (#923)
2024-03-28 15:34:57 -07:00
								 
Angelos Katharopoulos
40c108766b
Quantized matmul fix (#677)
* Fix qmv for small or unaligned matrices
* Fix qmm
2024-02-12 18:54:21 -08:00
								 
Awni Hannun
7a34e46677
Quantize with groups of 32 (#511)
* allow quantize with group sizes of 32
* missing cpu dispatch
* remove print
* Fix qvm for group_size 32
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-01-21 06:19:05 -08:00
								 
Angelos Katharopoulos
c15fe3e61b
Allow arbitrary first dimension in quantization kernels. (#458)
* Allow arbitrary first dim on qmm_t and qmv
* Allow arbitrary first dim on qmm and qvm
* Specialized aligned vs unaligned case
* Add more checks for valid quantizations
2024-01-16 00:46:21 -08:00
								 
Angelos Katharopoulos
e7f5059fe4
Support for quantized matmul with w and w^T (#349)
* Add the metal qvm implementation
* Add qmm_n
* Add gradient wrt to input for quantized_matmul
2024-01-03 14:22:36 -08:00
								 
Angelos Katharopoulos
447bc089b9
Fix tolerance in de-/quantization test (#295)
2023-12-26 19:21:05 -08:00
								 
Angelos Katharopoulos
b3916cbf2b
Improve names of quantization arguments (#235)
* Change the default quantization group_size to 64
* Rename groups to group_size and width to bits
2023-12-20 16:53:53 -08:00
								 
Angelos Katharopoulos
57fe918cf8
Adds C++ and nn quantization utilities (#230)
* Add C++ de-/quantize ops
* Add quantize functions to the docs and tests
* Add a QuantizedLinear module
2023-12-20 14:17:38 -08:00
								 
Angelos Katharopoulos
dfa9f4bc58
An initial quantized matmul implementation (#205)
* Add quantized matvec
* Add quantized matrix matrix with 2nd matrix transposed
* Add quantized matmul tests
* Add a slow cpu quantized matmul
* Add a slightly faster vectorized cpu version
2023-12-18 23:18:57 -08:00