Add Starcoder 2 (#502)

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-09 13:28:53 +08:00

* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* fix lm head bug, revert transformer req

* Update RoPE initialization in Attention class
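The tie_word_embeddings steps above can be illustrated with a minimal sketch. It is written in plain Python rather than MLX, and ToyLMHead, matvec, and the tiny matrices are hypothetical names invented for illustration; this is not the code from starcoder2.py. It only shows the general idea, assuming that a tied model reuses the token embedding table as the output projection and an untied model keeps a separate lm_head.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[Vector]  # row-major: one row per vocabulary token


def matvec(weight: Matrix, x: Vector) -> Vector:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


class ToyLMHead:
    """Hypothetical toy output head; not the code from starcoder2.py."""

    def __init__(
        self,
        embed_tokens: Matrix,
        tie_word_embeddings: bool = True,
        lm_head: Optional[Matrix] = None,
    ) -> None:
        if not tie_word_embeddings and lm_head is None:
            raise ValueError("an untied model needs its own lm_head weights")
        self.embed_tokens = embed_tokens  # shape (vocab, hidden)
        self.tie_word_embeddings = tie_word_embeddings
        self.lm_head = lm_head  # shape (vocab, hidden), used only when untied

    def logits(self, hidden: Vector) -> Vector:
        # Tied: reuse the embedding table, i.e. hidden @ embed_tokens.T.
        # Untied: project with the separate lm_head weights instead.
        weight = self.embed_tokens if self.tie_word_embeddings else self.lm_head
        return matvec(weight, hidden)
```

With tying enabled, the output head adds no parameters beyond the embedding table, which is why the option has to be checked before an lm_head layer is created or loaded.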

---------

Co-authored-by: Awni Hannun <awni@apple.com>

Author: Muhtasham Oblokulov
Date: 2024-03-03 04:39:23 +01:00
Committed by: GitHub
Parent: 5b1043a458
Commit: 81e2a80026

4 changed files with 191 additions and 2 deletions

									
llms/mlx_lm/tuner/utils.py
									
@@ -32,6 +32,7 @@ def linear_to_lora_layers(model: nn.Module, num_lora_layers: int):
         "stablelm",
         "qwen2",
         "gemma",
+        "starcoder2",
     ]:
         check_lora_layers(len(model.model.layers))
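The effect of this hunk can be sketched as a standalone gate on the model type. This is a hedged illustration, not the code in utils.py: VISIBLE_MODEL_TYPES holds only the four entries shown in the hunk (the earlier entries of the real list are not visible here), and layers_to_adapt is a hypothetical helper. It also assumes that LoRA is applied to the last num_lora_layers transformer blocks, which the hunk itself does not show.

```python
from typing import List

# Only the entries visible in the diff hunk; the real list starts earlier.
VISIBLE_MODEL_TYPES = ["stablelm", "qwen2", "gemma", "starcoder2"]


def layers_to_adapt(
    model_type: str, num_model_layers: int, num_lora_layers: int
) -> List[int]:
    """Hypothetical helper: indices of the blocks that would receive LoRA."""
    if model_type not in VISIBLE_MODEL_TYPES:
        raise ValueError(f"LoRA does not support model type {model_type!r}")
    if num_lora_layers > num_model_layers:
        raise ValueError(
            f"requested {num_lora_layers} LoRA layers, "
            f"but the model has only {num_model_layers}"
        )
    # Assumption: adapt the last num_lora_layers blocks of the model.
    start = num_model_layers - num_lora_layers
    return list(range(start, num_model_layers))
```

Before this commit, a "starcoder2" model type would have fallen through the membership check; adding the string to the list is what routes it through the same layer-count validation as the other architectures.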
