Mirror of https://github.com/ml-explore/mlx.git
synced 2025-09-24 06:28:08 +08:00

* Support disabling the Metal buffer cache, because a large amount of unused memory stays buffered when an LLM generates long-context tokens
* Run the formatter and add tests for the "cache_enabled" feature
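
To illustrate why a buffer cache can hold on to unused memory during long generations, here is a minimal, self-contained sketch. It is a toy model, not MLX's allocator: the `BufferCache` class, its `cache_enabled` flag, and the `simulate` helper are hypothetical names chosen for this example, and the assumption is that each generation step requests a slightly larger buffer than the previous one, so freed buffers are never reused.

```python
class BufferCache:
    """Toy allocator that optionally recycles freed buffers by exact size."""

    def __init__(self, cache_enabled=True):
        self.cache_enabled = cache_enabled
        self._free = {}  # maps buffer size -> list of idle buffers

    def malloc(self, size):
        # Reuse an idle buffer of the same size if one is cached.
        pool = self._free.get(size)
        if pool:
            return pool.pop()
        return bytearray(size)

    def free(self, buf):
        # With the cache enabled, keep the buffer for later reuse.
        # With it disabled, drop the buffer so its memory can be reclaimed.
        if self.cache_enabled:
            self._free.setdefault(len(buf), []).append(buf)

    def cached_bytes(self):
        # Total bytes held by idle buffers in the cache.
        return sum(size * len(pool) for size, pool in self._free.items())


def simulate(cache_enabled, steps=100):
    """Allocate and free one growing buffer per step; return idle cached bytes."""
    cache = BufferCache(cache_enabled)
    for step in range(1, steps + 1):
        buf = cache.malloc(step * 1024)  # assumed: the buffer grows with context length
        cache.free(buf)
    return cache.cached_bytes()


if __name__ == "__main__":
    print("cache enabled: ", simulate(True), "bytes idle")
    print("cache disabled:", simulate(False), "bytes idle")
```

In this model every freed buffer has a size that is never requested again, so with the cache enabled the idle memory grows with each step, while with the cache disabled nothing is retained. The trade-off is that disabling the cache gives up reuse for workloads that do request the same sizes repeatedly.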