Awni Hannun
ee60e2a9d5
KV cache ( #643 )
...
* in place kv_cache
* fix
* fix kv cache size
* partially fix kv cache dtype
* step kv cache
* multiple of step size
* more tests + kv cache
* more kv cache
* update all models to use kv cache
2024-05-08 08:18:13 -07:00
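The in-place, step-allocated KV cache described by the bullets above can be sketched roughly as follows. This is a NumPy illustration of the idea (preallocate storage in multiples of a step size and write new keys/values in place), not the MLX implementation; the class and method names are assumptions:

```python
import numpy as np

class KVCache:
    """Illustrative step-allocated KV cache: storage grows in multiples
    of `step`, and new keys/values are written in place."""

    def __init__(self, head_dim, n_heads, step=256):
        self.step = step
        self.head_dim = head_dim
        self.n_heads = n_heads
        self.keys = None
        self.values = None
        self.offset = 0  # number of valid cached positions

    def update_and_fetch(self, keys, values):
        # keys/values: (batch, n_heads, seq_len, head_dim)
        prev = self.offset
        needed = prev + keys.shape[2]
        if self.keys is None or needed > self.keys.shape[2]:
            # Round capacity up to a multiple of the step size.
            n_steps = (needed + self.step - 1) // self.step
            shape = (keys.shape[0], self.n_heads, n_steps * self.step, self.head_dim)
            new_k = np.zeros(shape, dtype=keys.dtype)
            new_v = np.zeros(shape, dtype=values.dtype)
            if self.keys is not None:
                new_k[:, :, :prev] = self.keys[:, :, :prev]
                new_v[:, :, :prev] = self.values[:, :, :prev]
            self.keys, self.values = new_k, new_v
        # In-place write of the new positions.
        self.keys[:, :, prev:needed] = keys
        self.values[:, :, prev:needed] = values
        self.offset = needed
        return self.keys[:, :, :needed], self.values[:, :, :needed]
```

Growing in step-sized chunks amortizes reallocation cost during generation, which is the "multiple of step size" fix the bullets mention.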
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm ( #603 )
...
* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
2024-03-23 07:13:51 -07:00
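For reference, the RMS norm that the fast kernels above replace computes the following (a plain NumPy sketch of the standard formula, not the fused MLX kernel):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Normalize by the root mean square over the last axis
    # (no mean subtraction, unlike layer norm), then scale.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight
```

The fast `mx.fast.rms_norm` / `mx.fast.layer_norm` ops fuse these steps into a single kernel rather than materializing the intermediates.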
Awni Hannun
e4b19bb9e1
Make attention faster for some models ( #574 )
...
* make attention faster for a couple models
* remove unused generation flags
* add comment on lora
* include text files as well
2024-03-14 21:35:54 -07:00
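The attention speedup above comes from routing through a fused scaled dot-product attention op. The computation it fuses is the standard one, sketched here in NumPy for reference (MLX's `mx.fast.scaled_dot_product_attention` performs this in one kernel):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, head_dim)
    scale = 1.0 / np.sqrt(q.shape[-1])
    scores = (q @ np.swapaxes(k, -1, -2)) * scale
    if mask is not None:
        scores = scores + mask  # additive mask, e.g. -inf on future positions
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```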
Awni Hannun
7cdd1b69ac
Enable unit testing in Circle and start some MLX LM tests ( #545 )
...
* add a few tests for mlx lm
* more tests / cleanup
2024-03-07 09:31:57 -08:00
Awni Hannun
f24edfa9dc
[mlx-lm] Add precompiled normalizations ( #451 )
...
* add precompiled normalizations
* nits
2024-02-22 12:40:55 -08:00
Awni Hannun
8fd953ee2b
Support for slerp merging models ( #455 )
...
* support for slerp merging models
* docs
* update docs
* format
2024-02-19 20:37:15 -08:00
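The slerp merge above interpolates corresponding weights along the arc between them rather than along the chord. A minimal sketch of spherical linear interpolation on flat weight vectors (the function name and fallback threshold are illustrative, not the mlx-lm code):

```python
import numpy as np

def slerp(t, w0, w1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors at t in [0, 1]."""
    v0 = w0 / (np.linalg.norm(w0) + eps)
    v1 = w1 / (np.linalg.norm(w1) + eps)
    dot = np.clip(np.dot(v0, v1), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two directions
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * w0 + t * w1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * w0 + (np.sin(t * theta) / s) * w1
```

Unlike a plain average, slerp preserves the geometry between the two parameter sets, which is why it is a popular merging strategy.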
Angelos Katharopoulos
f71e965d57
Change gqa to use repeat instead of concatenate ( #443 )
2024-02-14 17:40:11 -08:00
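The GQA change above broadcasts the shared KV heads across query heads with a single repeat rather than concatenating copies. The shape transformation, sketched with NumPy (`repeat_kv` is an illustrative name):

```python
import numpy as np

def repeat_kv(x, n_rep):
    # x: (batch, n_kv_heads, seq_len, head_dim)
    # -> (batch, n_kv_heads * n_rep, seq_len, head_dim)
    # Each KV head is repeated n_rep times so every group of
    # query heads sees its shared key/value head.
    return np.repeat(x, n_rep, axis=1)
```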
Awni Hannun
d4666615bb
Lazy import + refactor Lora layer addition ( #426 )
...
* lazy model import in mlx_lm
* change lora loading
* fix olmo lora
* remove a bunch of unused stuff from plamo
* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00
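The lazy model import above defers loading a model's module until it is actually requested. The general pattern can be sketched as follows (the remapping dict and function name are illustrative, not the mlx-lm API):

```python
import importlib

# Illustrative alias table mapping a model type to its module name.
_MODEL_REMAPPING = {}

def get_model_module(model_type):
    """Import a model's module only when it is first requested,
    so unused architectures never pay an import cost."""
    module_name = _MODEL_REMAPPING.get(model_type, model_type)
    return importlib.import_module(module_name)
```

This keeps startup fast when the package ships many architectures but a run only ever touches one.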