Angelos Katharopoulos
|
447bc089b9
|
Fix tolerance in de-/quantization test (#295)
|
2023-12-26 19:21:05 -08:00 |
|
Angelos Katharopoulos
|
b3916cbf2b
|
Improve names of quantization arguments (#235)
* Change the default quantization group_size to 64
* Rename groups to group_size and width to bits
|
2023-12-20 16:53:53 -08:00 |
|
Angelos Katharopoulos
|
57fe918cf8
|
Adds C++ and nn quantization utilities (#230)
* Add C++ de-/quantize ops
* Add quantize functions to the docs and tests
* Add a QuantizedLinear module
|
2023-12-20 14:17:38 -08:00 |
|
Angelos Katharopoulos
|
dfa9f4bc58
|
An initial quantized matmul implementation (#205)
* Add quantized matvec
* Add quantized matrix-matrix multiplication with the 2nd matrix transposed
* Add quantized matmul tests
* Add a slow CPU quantized matmul
* Add a slightly faster vectorized CPU version
|
2023-12-18 23:18:57 -08:00 |
|