Commit Graph

7 Commits

Author SHA1 Message Date
Awni Hannun
ee60e2a9d5
KV cache (#643)
* in place kv_cache

* fix

* fix kv cache size

* partially fix kv cache dtype

* step kv cache

* multiple of step size

* more tests + kv cache

* more kv cache

* update all models to use kv cache
2024-05-08 08:18:13 -07:00
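
The bullets above describe moving generation onto a preallocated KV cache that is updated in place and grows in multiples of a step size. Below is a minimal sketch of that pattern, assuming batch size 1 and a step size of 256; names like `KVCache` and `update_and_fetch` are illustrative of the approach, not necessarily the exact code from the commit.

```python
# Sketch of an in-place KV cache that grows in step-sized chunks, so the
# backing buffers are reallocated rarely rather than on every decoded token.
# Assumes batch size 1; names and shapes are illustrative.
import mlx.core as mx


class KVCache:
    def __init__(self, head_dim: int, n_kv_heads: int, step: int = 256):
        self.head_dim = head_dim
        self.n_kv_heads = n_kv_heads
        self.step = step
        self.keys = None
        self.values = None
        self.offset = 0  # number of valid positions currently in the cache

    def update_and_fetch(self, keys: mx.array, values: mx.array):
        # keys / values: (1, n_kv_heads, num_new_tokens, head_dim)
        prev = self.offset
        num_new = keys.shape[2]
        if self.keys is None or prev + num_new > self.keys.shape[2]:
            # Grow the buffers by a multiple of the step size.
            n_steps = (num_new + self.step - 1) // self.step
            shape = (1, self.n_kv_heads, n_steps * self.step, self.head_dim)
            grown_k = mx.zeros(shape, keys.dtype)
            grown_v = mx.zeros(shape, values.dtype)
            if self.keys is None:
                self.keys, self.values = grown_k, grown_v
            else:
                self.keys = mx.concatenate([self.keys, grown_k], axis=2)
                self.values = mx.concatenate([self.values, grown_v], axis=2)
        # Write the new entries in place, then return the valid prefix.
        self.offset += num_new
        self.keys[..., prev : self.offset, :] = keys
        self.values[..., prev : self.offset, :] = values
        return self.keys[..., : self.offset, :], self.values[..., : self.offset, :]
```

The design choice the bullets hint at: growing the buffers in step-sized chunks keeps the hot decode loop free of per-token reallocations, while the slice assignment fills new tokens into the existing buffers in place.
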
Awni Hannun
c68aa3c7c3
StableLM 2 (#666)
* stable lm 2

* test and lora

* version bump

* merge stable models
2024-04-08 14:18:55 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm (#603)
* use nn.RMSNorm, use sdpa, cleanup

* bump mlx versions

* minor update

* use fast layer norm

* version bump

* update requirement for whisper

* update requirement for gguf
2024-03-23 07:13:51 -07:00
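
For context on what "use fast layer norm" buys: MLX exposes fused normalization kernels under `mx.fast`, and `nn.RMSNorm` routes to them. A small sketch comparing a hand-rolled RMS norm to the fused call; shapes and tolerances are illustrative.

```python
import mlx.core as mx
import mlx.nn as nn

x = mx.random.uniform(shape=(1, 8, 512))
weight = mx.ones((512,))


# Hand-rolled RMS norm: several separate ops with intermediate arrays.
def rms_norm_naive(x, weight, eps=1e-5):
    return weight * x * mx.rsqrt(mx.mean(x * x, axis=-1, keepdims=True) + eps)


# Fused variants, along the lines of what the commit switches to.
y_fast = mx.fast.rms_norm(x, weight, eps=1e-5)
y_module = nn.RMSNorm(512)(x)  # the nn module dispatches to the fast kernel

print(mx.allclose(y_fast, rms_norm_naive(x, weight), atol=1e-4))
```
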
Awni Hannun
8b05bb6d18
[mlx-lm] Use sdpa in llama / mistral model (#515)
* use sdpa

* update a few more models

* version

* fix stablelm type
2024-03-07 17:41:23 -08:00
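
What "use sdpa" means concretely: the manual scores / softmax / matmul sequence is replaced with MLX's fused `mx.fast.scaled_dot_product_attention`. A sketch with illustrative shapes, assuming an additive causal mask:

```python
import mlx.core as mx
import mlx.nn as nn

B, n_heads, L, head_dim = 1, 8, 16, 64
scale = head_dim**-0.5
q = mx.random.uniform(shape=(B, n_heads, L, head_dim))
k = mx.random.uniform(shape=(B, n_heads, L, head_dim))
v = mx.random.uniform(shape=(B, n_heads, L, head_dim))
mask = nn.MultiHeadAttention.create_additive_causal_mask(L)

# Before: unfused attention, materializing the (L, L) score matrix.
scores = (q * scale) @ k.transpose(0, 1, 3, 2) + mask
out_naive = mx.softmax(scores, axis=-1) @ v

# After: one fused call.
out_fast = mx.fast.scaled_dot_product_attention(q, k, v, scale=scale, mask=mask)

print(mx.allclose(out_naive, out_fast, atol=1e-4))
```
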
Awni Hannun
7cdd1b69ac
Enable unit testing in Circle and start some MLX LM tests (#545)
* add a few tests for mlx lm

* add a few tests for mlx lm

* add a few tests for mlx lm

* more tests / cleanup
2024-03-07 09:31:57 -08:00
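
A minimal, self-contained unittest in the spirit of the tests this commit adds. The real mlx-lm tests exercise full models; this toy shape check only illustrates the pattern.

```python
import unittest

import mlx.core as mx
import mlx.nn as nn


class TestLayers(unittest.TestCase):
    def test_rms_norm_shape_and_dtype(self):
        norm = nn.RMSNorm(32)
        x = mx.random.uniform(shape=(2, 8, 32))
        y = norm(x)
        # Normalization must preserve both shape and dtype.
        self.assertEqual(y.shape, x.shape)
        self.assertEqual(y.dtype, x.dtype)


if __name__ == "__main__":
    unittest.main()
```
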
Muhtasham Oblokulov
81e2a80026
Add Starcoder 2 (#502)
* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* fix lm head bug, revert transformer req

* Update RoPE initialization in Attention class

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-02 19:39:23 -08:00
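
The `tie_word_embeddings` bullets refer to a standard pattern: when embeddings are tied, the output projection reuses the input embedding matrix (transposed) instead of a separate `lm_head`, as in Gemma. A hypothetical sketch of that conditional, using `nn.Embedding.as_linear`; `TinyLM` is not the actual starcoder2.py code.

```python
import mlx.core as mx
import mlx.nn as nn


class TinyLM(nn.Module):
    """Illustrative model head only; not the actual starcoder2.py code."""

    def __init__(self, vocab_size: int, dims: int, tie_word_embeddings: bool):
        super().__init__()
        self.tie_word_embeddings = tie_word_embeddings
        self.embed_tokens = nn.Embedding(vocab_size, dims)
        if not tie_word_embeddings:
            self.lm_head = nn.Linear(dims, vocab_size, bias=False)

    def __call__(self, tokens: mx.array) -> mx.array:
        h = self.embed_tokens(tokens)  # ...transformer blocks would go here...
        if self.tie_word_embeddings:
            # Reuse the embedding matrix, transposed, as the output projection.
            return self.embed_tokens.as_linear(h)
        return self.lm_head(h)


logits = TinyLM(100, 32, tie_word_embeddings=True)(mx.array([[1, 2, 3]]))
print(logits.shape)  # (1, 3, 100)
```
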
Ashish
261f1280f6
Update to StableLM code (#514)
* StableLM is now part of Transformers as stablelm rather than stablelm_epoch; updated the config to match

* removing old file

* reference new stablelm
2024-03-01 09:53:38 -08:00