mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-06-24 17:31:18 +08:00

Author	SHA1	Message	Date
Awni Hannun	ee60e2a9d5	Kv cache (#643) * in place kv_cache * fix * fix kv cache size * partially fix kv cache dtype * step kv cache * multiple of step size * more teests + kv cache * more kv cache * udpate all models to use kv cache	2024-05-08 08:18:13 -07:00
Awni Hannun	2146bcd7ee	Quantize embedding / Update quantize API (#680) * more async eval * quantize embedding / update quantize api * more updates for quantize * update for quantize embeddings * update sd quant API * update sdxl quants * error for datasets < batch_size * async * fix config loading * fix quant * fix tests * fix req * remove lm head if tie weights is true * fix test	2024-04-18 18:16:10 -07:00
Awni Hannun	b8a348c1b8	Switch to fast RMS/LN Norm (#603) * use nn.RMSNorm, use sdpa, cleanup * bump mlx versions * minor update * use fast layer norm * version bump * update requirement for whisper * update requirement for gguf	2024-03-23 07:13:51 -07:00
Anchen	3535408c99	chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566) * chore: fix tie_word_embeddings for qwen2 * chore: default tie_word_embeddings to True	2024-03-12 21:34:32 -07:00
Awni Hannun	8b05bb6d18	[mlx-lm] Use sdpa in llama / mistral model (#515) * use sdpa * update a few more models * version * fix stablelm type	2024-03-07 17:41:23 -08:00
Anchen	8a178f8716	chore: enable tie_word_embeddings config for qwen2 (#544)	2024-03-07 06:11:35 -08:00
Awni Hannun	f24edfa9dc	[mlx-lm] Add precompiled normalizations (#451) * add precompiled normalizations * nits	2024-02-22 12:40:55 -08:00
Awni Hannun	8fd953ee2b	Support for slerp merging models (#455) * support for slerp merging models * docs * update docs * format'	2024-02-19 20:37:15 -08:00
Angelos Katharopoulos	f71e965d57	Change gqa to use repeat instead of concatenate (#443)	2024-02-14 17:40:11 -08:00
Awni Hannun	06ddb8414d	Fix Qwen2 and SD (#441) * fix qwen2 * version bump * fix list shape	2024-02-14 13:43:12 -08:00
Awni Hannun	d4666615bb	Lazy import + refactor Lora layer addition (#426) * lazy model import in mlx_lm * change lora loading * fix olmo lora * remove a bunch of unused stuff from plamo * move phixtral to mlx-lm and out of llms/	2024-02-12 10:51:02 -08:00
Junyang Lin	9d0dd34403	add qwen2 (#411)	2024-02-04 08:31:38 -08:00

12 Commits