mirror of
https://github.com/ml-explore/mlx.git
synced 2025-09-23 05:58:09 +08:00

* Support disabling the Metal buffer cache, because a large amount of unused memory stays buffered when an LLM generates long-context tokens
* Run formatting and add tests for the "cache_enabled" feature