Alex Barron
85ffd2c96a
Quantized KV Cache (#1075)
* add QuantizedKVCache
* simplify
* add tests
* single sdpa function
* fix sed
* in place
* fix tests
* support different k and v head dims
2024-10-31 16:59:52 -07:00