mlx-examples/llms/mlx_lm
Muhtasham Oblokulov 81e2a80026
Add Starcoder 2 (#502)
* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* fix lm head bug, revert transformer req

* Update RoPE initialization in Attention class

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-02 19:39:23 -08:00
examples Support for slerp merging models (#455) 2024-02-19 20:37:15 -08:00
models Add Starcoder 2 (#502) 2024-03-02 19:39:23 -08:00
tuner Add Starcoder 2 (#502) 2024-03-02 19:39:23 -08:00
__init__.py Fix import warning (#479) 2024-02-27 08:47:56 -08:00
convert.py Fix import warning (#479) 2024-02-27 08:47:56 -08:00
fuse.py feat(mlx-lm): add de-quant for fuse.py (#365) 2024-01-25 18:59:32 -08:00
generate.py chore(mlx-lm): add adapter support in generate.py (#494) 2024-02-28 07:49:25 -08:00
LORA.md chore(mlx-lm): add adapter support in generate.py (#494) 2024-02-28 07:49:25 -08:00
lora.py chore(mlx-lm): add adapter support in generate.py (#494) 2024-02-28 07:49:25 -08:00
MERGE.md Support for slerp merging models (#455) 2024-02-19 20:37:15 -08:00
merge.py Prevent llms/mlx_lm from serving the local directory as a webserver (#498) 2024-02-27 19:40:42 -08:00
py.typed Add py.typed to support PEP-561 (type-hinting) (#389) 2024-01-30 21:17:38 -08:00
README.md feat: move lora into mlx-lm (#337) 2024-01-23 08:44:37 -08:00
requirements.txt [mlx-lm] Add precompiled normalizations (#451) 2024-02-22 12:40:55 -08:00
SERVER.md Prevent llms/mlx_lm from serving the local directory as a webserver (#498) 2024-02-27 19:40:42 -08:00
server.py Prevent llms/mlx_lm from serving the local directory as a webserver (#498) 2024-02-27 19:40:42 -08:00
UPLOAD.md Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
utils.py llms: convert() add 'revision' argument (#506) 2024-03-02 06:28:26 -08:00
version.py Fix import warning (#479) 2024-02-27 08:47:56 -08:00

Generate Text with MLX and 🤗 Hugging Face

This is an example of large language model text generation that can pull models from the Hugging Face Hub.
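
As a minimal sketch, generation through the package's Python API looks roughly like the following (the model name is only an illustration; any supported Hub repository should work):

```python
# Pull a model from the Hugging Face Hub and generate text with mlx_lm.
from mlx_lm import load, generate

# Downloads the weights and tokenizer from the Hub (or reuses a local cache).
model, tokenizer = load("mistralai/Mistral-7B-Instruct-v0.2")

# Produce a completion for the prompt; verbose=True also prints the output as it is generated.
response = generate(
    model,
    tokenizer,
    prompt="Write a haiku about the ocean.",
    verbose=True,
)
```

The same functionality is also available from the command line through the generate.py entry point.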

For more information on this example, see the README in the parent directory.

This package also supports fine-tuning with LoRA or QLoRA. For more information, see the LoRA documentation.
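
As a rough sketch of the command-line workflow (the model name and data path are illustrative, and the exact flags follow the LoRA documentation in this repository and may differ between versions):

```
# Train LoRA adapters on a local dataset (paths and model are placeholders).
python -m mlx_lm.lora \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --train \
  --data ./data \
  --batch-size 4 \
  --lora-layers 16
```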