Angelos Katharopoulos
8db7161c94
Bug fix in quantize (#1054)
2024-04-29 20:55:04 -07:00
Angelos Katharopoulos
ec8578d41a
Fix quantization of all 0s (#1028)
2024-04-24 00:40:42 -07:00
Angelos Katharopoulos
84d61d27aa
Make sure 0 is represented in the quantization (#1016)
2024-04-19 19:47:26 -07:00
Awni Hannun
039da779d1
No quant reshape (#957)
* precise option on cpu
* remove print
* remove reshape in quant matmul
* no quant reshape
2024-04-04 11:52:12 -07:00
Angelos Katharopoulos
5f9ba3019f
Fix qmm_t for unaligned cases (#923)
2024-03-28 15:34:57 -07:00
Angelos Katharopoulos
40c108766b
Quantized matmul fix (#677)
* Fix qmv for small or unaligned matrices
* Fix qmm
2024-02-12 18:54:21 -08:00
Awni Hannun
7a34e46677
Quantize with groups of 32 (#511)
* allow quantize with group sizes of 32
* missing cpu dispatch
* remove print
* Fix qvm for group_size 32
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-01-21 06:19:05 -08:00
Angelos Katharopoulos
c15fe3e61b
Allow arbitrary first dimension in quantization kernels. (#458)
* Allow arbitrary first dim on qmm_t and qmv
* Allow arbitrary first dim on qmm and qvm
* Specialized aligned vs unaligned case
* Add more checks for valid quantizations
2024-01-16 00:46:21 -08:00
Angelos Katharopoulos
e7f5059fe4
Support for quantized matmul with w and w^T (#349)
* Add the metal qvm implementation
* Add qmm_n
* Add gradient wrt to input for quantized_matmul
2024-01-03 14:22:36 -08:00
Angelos Katharopoulos
447bc089b9
Fix tolerance in de-/quantization test (#295)
2023-12-26 19:21:05 -08:00
Angelos Katharopoulos
b3916cbf2b
Improve names of quantization arguments (#235)
* Change the default quantization group_size to 64
* Rename groups to group_size and width to bits
2023-12-20 16:53:53 -08:00
Angelos Katharopoulos
57fe918cf8
Adds C++ and nn quantization utilities (#230)
* Add C++ de-/quantize ops
* Add quantize functions to the docs and tests
* Add a QuantizedLinear module
2023-12-20 14:17:38 -08:00
Angelos Katharopoulos
dfa9f4bc58
An initial quantized matmul implementation (#205)
* Add quantized matvec
* Add quantized matrix matrix with 2nd matrix transposed
* Add quantized matmul tests
* Add a slow cpu quantized matmul
* Add a slightly faster vectorized cpu version
2023-12-18 23:18:57 -08:00