mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-08-09 18:36:38 +08:00

Awni Hannun fca087be49 More cache improvements (#1015)	2024-10-07 20:45:51 -07:00
* fix rotating kv cache for chat use case
* reorg + fixes to caching, unify prompt caching across types and use cases for e.g. caching during a chat
* nit in chat
* fix tests
* fix tests
* fix tests
* docs
* chat command
* comments + docs
* Define meta_state on all Cache implementations
* fixes + trim_prompt_cache api
* fix default model
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
chat.py	More cache improvements (#1015)	2024-10-07 20:45:51 -07:00
generate_response.py	More cache improvements (#1015)	2024-10-07 20:45:51 -07:00
lora_config.yaml	Adding full finetuning (#903)	2024-09-29 17:12:47 -07:00
merge_config.yaml	Support for slerp merging models (#455)	2024-02-19 20:37:15 -08:00