Add Starcoder 2 (#502)

* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* Fix LM head bug, revert transformers requirement

* Update RoPE initialization in Attention class (this and the embedding tying are sketched after this list)
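
Taken together, the bullets above trace the final shape of the port. Below is a minimal sketch of the two details called out (RoPE built inside the attention layer, and the optional tied LM head that reuses the transposed token embeddings), assuming standard mlx / mlx-lm conventions. All names, default values, and the simplified single-layer, mask-free structure are illustrative; this is not the PR's exact code.

```python
from dataclasses import dataclass

import mlx.core as mx
import mlx.nn as nn


@dataclass
class ModelArgs:
    # Illustrative defaults, not Starcoder2's real config values
    hidden_size: int = 3072
    num_attention_heads: int = 24
    vocab_size: int = 49152
    rope_theta: float = 100000.0
    tie_word_embeddings: bool = True  # the option this PR adds


class Attention(nn.Module):
    def __init__(self, args: ModelArgs):
        super().__init__()
        dims = args.hidden_size
        self.num_heads = args.num_attention_heads
        self.head_dim = dims // self.num_heads
        # Biases kept on the projections ("Update bias in linear layers")
        self.q_proj = nn.Linear(dims, dims, bias=True)
        self.k_proj = nn.Linear(dims, dims, bias=True)
        self.v_proj = nn.Linear(dims, dims, bias=True)
        self.o_proj = nn.Linear(dims, dims, bias=True)
        # RoPE created once in the attention layer, driven by rope_theta
        self.rope = nn.RoPE(self.head_dim, traditional=False, base=args.rope_theta)

    def __call__(self, x: mx.array) -> mx.array:
        B, L, D = x.shape
        q = self.q_proj(x).reshape(B, L, self.num_heads, -1).transpose(0, 2, 1, 3)
        k = self.k_proj(x).reshape(B, L, self.num_heads, -1).transpose(0, 2, 1, 3)
        v = self.v_proj(x).reshape(B, L, self.num_heads, -1).transpose(0, 2, 1, 3)
        q, k = self.rope(q), self.rope(k)
        scores = (q @ k.transpose(0, 1, 3, 2)) * self.head_dim**-0.5
        out = mx.softmax(scores, axis=-1) @ v  # causal mask omitted for brevity
        return self.o_proj(out.transpose(0, 2, 1, 3).reshape(B, L, D))


class Starcoder2(nn.Module):
    def __init__(self, args: ModelArgs):
        super().__init__()
        self.args = args
        self.embed_tokens = nn.Embedding(args.vocab_size, args.hidden_size)
        self.layers = [Attention(args)]  # MLP / LayerNorm blocks elided
        self.norm = nn.LayerNorm(args.hidden_size)
        if not args.tie_word_embeddings:
            # Separate head with no bias ("Fix bias in lm_head linear layer")
            self.lm_head = nn.Linear(args.hidden_size, args.vocab_size, bias=False)

    def __call__(self, inputs: mx.array) -> mx.array:
        h = self.embed_tokens(inputs)
        for layer in self.layers:
            h = h + layer(h)  # residual connection
        h = self.norm(h)
        if self.args.tie_word_embeddings:
            # Reuse the transposed token embeddings as the output
            # projection, as in Gemma
            return h @ self.embed_tokens.weight.T
        return self.lm_head(h)
```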

---------

Co-authored-by: Awni Hannun <awni@apple.com>
Author: Muhtasham Oblokulov
Date: 2024-03-03 04:39:23 +01:00
Committed by: GitHub
Parent: 5b1043a458
Commit: 81e2a80026
4 changed files with 191 additions and 2 deletions


@@ -46,7 +46,7 @@ You can convert models in the Python API with:
 ```python
 from mlx_lm import convert
 
-upload_repo = "mistralai/Mistral-7B-Instruct-v0.1"
+upload_repo = "mlx-community/My-Mistral-7B-v0.1-4bit"
 
 convert("mistralai/Mistral-7B-v0.1", quantize=True, upload_repo=upload_repo)
 ```
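
For reference, a model converted and uploaded this way can be loaded back through mlx-lm's standard helpers. A minimal sketch, reusing the illustrative repo name from the example above:

```python
from mlx_lm import load, generate

# Load the quantized model uploaded in the example above
model, tokenizer = load("mlx-community/My-Mistral-7B-v0.1-4bit")

# Generate a short completion to confirm the conversion worked
text = generate(model, tokenizer, prompt="def fibonacci(n):", verbose=True)
```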