mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-07-15 06:41:13 +08:00

Muhtasham Oblokulov 81e2a80026 Add Starcoder 2 (#502)	2024-03-02 19:39:23 -08:00

* Add Starcoder2 model and update utils.py
* Refactor model arguments and modules in starcoder2.py
* Refactor FeedForward class to MLP in starcoder2.py
* Fix typo
* pre-commit
* Refactor starcoder2.py: Update model arguments and modules
* Fix LM head and MLP layers
* Rename input layer norm
* Update bias in linear layers
* Refactor token embeddings in Starcoder2Model
* Rename to standard HF attention layer name
* Add LayerNorm
* Add transposed token embeddings (like in Gemma)
* Refactor MLP and TransformerBlock classes
* Add tie_word_embeddings option to ModelArgs and update Model implementation
* Add conditional check for tying word embeddings in Starcoder2Model
* Fix bias in lm_head linear layer
* Remove unused LayerNorm in stablelm
* Update transformers dependency to use GitHub repository
* fix lm head bug, revert transformer req
* Update RoPE initialization in Attention class

---------

Co-authored-by: Awni Hannun <awni@apple.com>
..
__init__.py	Mlx llm package (#301)	2024-01-12 10:25:56 -08:00
base.py	Mlx llm package (#301)	2024-01-12 10:25:56 -08:00
gemma.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
layers.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
llama.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
mixtral.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
olmo.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
phi.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
phixtral.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
plamo.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
qwen2.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
qwen.py	[mlx-lm] Add precompiled normalizations (#451)	2024-02-22 12:40:55 -08:00
stablelm.py	Add Starcoder 2 (#502)	2024-03-02 19:39:23 -08:00
starcoder2.py	Add Starcoder 2 (#502)	2024-03-02 19:39:23 -08:00