mlx-examples/llms/mlx_lm/models
Alex Barron 85ffd2c96a Quantized KV Cache (#1075)
* add QuantizedKVCache

* simplify

* add tests

* single sdpa function

* fix sed

* in place

* fix tests

* support different k and v head dims
2024-10-31 16:59:52 -07:00