mirror of https://github.com/ml-explore/mlx.git, synced 2025-11-09 22:08:12 +08:00
* Support disabling the Metal buffer cache, because a large amount of unused memory stays buffered when an LLM generates tokens over a long context
* Run format and add "cache_enabled" feature tests
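
To illustrate the motivation, here is a minimal toy sketch, not MLX's actual allocator. Only the `cache_enabled` name comes from the commit. The `BufferCache` class, its methods, and the `simulate` helper are hypothetical, and the sketch assumes the cache reuses buffers only when sizes match exactly. Under that assumption, a workload whose buffer sizes keep growing (as with a lengthening context) never hits the cache, so freed buffers pile up unused.

```python
from collections import defaultdict


class BufferCache:
    """Toy size-keyed buffer cache with an on/off switch (hypothetical)."""

    def __init__(self, cache_enabled=True):
        self.cache_enabled = cache_enabled
        self._free = defaultdict(list)  # size -> reusable buffers of that size
        self.cached_bytes = 0

    def malloc(self, size):
        # Reuse a cached buffer of the same size when one is available.
        if self.cache_enabled and self._free[size]:
            self.cached_bytes -= size
            return self._free[size].pop()
        return bytearray(size)

    def free(self, buf):
        # With the cache on, keep the buffer for reuse; otherwise drop it.
        if self.cache_enabled:
            self._free[len(buf)].append(buf)
            self.cached_bytes += len(buf)

    def set_cache_enabled(self, enabled):
        # Turning the cache off also releases everything it currently holds.
        self.cache_enabled = enabled
        if not enabled:
            self._free.clear()
            self.cached_bytes = 0


def simulate(cache, steps=100):
    # Each step needs a slightly larger buffer, so no cached buffer is reused.
    for step in range(1, steps + 1):
        buf = cache.malloc(step * 1024)
        cache.free(buf)
    return cache.cached_bytes


if __name__ == "__main__":
    print("cache on :", simulate(BufferCache(cache_enabled=True)), "bytes held")
    print("cache off:", simulate(BufferCache(cache_enabled=False)), "bytes held")
```

With the cache on, every freed buffer in this toy model is retained even though none is reused. With it off, nothing is held, at the cost of a fresh allocation on every request.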