Add tips on porting LLMs from HuggingFace (#523)

* Add tips on porting LLMs from HuggingFace

* Add CONTRIBUTING.md to mlx-examples-llms

* Refactor imports and update comment in starcoder2.py

* Update llms/mlx_lm/models/starcoder2.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* nits

* nits

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
Muhtasham Oblokulov 2024-03-06 02:43:15 +01:00 committed by GitHub
parent 3fdf85e79d
commit 5de7c2ac33
3 changed files with 42 additions and 5 deletions


@@ -14,11 +14,11 @@ possible.
You can also run the formatters manually as follows:
```
```bash
clang-format -i file.cpp
```
```
```bash
black file.py
```

llms/CONTRIBUTING.md Normal file

@@ -0,0 +1,38 @@
# Contributing to MLX LM
Below are some tips to port LLMs available on Hugging Face to MLX.
Before starting, check out the [general contribution
guidelines](https://github.com/ml-explore/mlx-examples/blob/main/CONTRIBUTING.md).
Next, from this directory, do an editable install:
```shell
pip install -e .
```
Then check if the model has weights in the
[safetensors](https://huggingface.co/docs/safetensors/index) format. If not,
[follow the instructions](https://huggingface.co/spaces/safetensors/convert) to
convert it.
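One way to run this check is to scan the repository's file listing for the `.safetensors` extension. The sketch below works on a plain list of file names; the helper name `has_safetensors` and both example listings are hypothetical, and fetching the listing itself is left out.

```python
def has_safetensors(filenames):
    """Return True if any file in the listing is a safetensors weight file."""
    return any(name.endswith(".safetensors") for name in filenames)


# Hypothetical file listings, for illustration only.
converted_repo = ["config.json", "model-00001-of-00002.safetensors", "tokenizer.json"]
legacy_repo = ["config.json", "pytorch_model.bin", "tokenizer.json"]

print(has_safetensors(converted_repo))
print(has_safetensors(legacy_repo))
```

A repo like the second listing would need to be converted before porting.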
After that, add the model file to the
[`mlx_lm/models`](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm/models)
directory. You can see other examples there. We recommend starting from a model
that is similar to the model you are porting.
Make sure the name of the new model file is the same as the `model_type` in the
`config.json`, for example
[starcoder2](https://huggingface.co/bigcode/starcoder2-7b/blob/main/config.json#L17).
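As a rough illustration of the naming rule, the expected file name can be derived from the `model_type` field of `config.json`. The helper `model_file_for` is hypothetical, and the config text is trimmed to the single field that matters here.

```python
import json


def model_file_for(config_text):
    """Return the model file name expected for a given config.json text."""
    config = json.loads(config_text)
    return f"{config['model_type']}.py"


# Trimmed example config; a real config.json has many more fields.
example_config = '{"model_type": "starcoder2"}'

print(model_file_for(example_config))
```

For this config, the new model file would go in `mlx_lm/models/starcoder2.py`.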
To determine the model layer names, we suggest one of the following:
- Refer to the Transformers implementation if you are familiar with the
  codebase.
- Load the model weights and check the weight names, which will tell you about
  the model structure.
- Look at the names of the weights by inspecting `model.safetensors.index.json`
  in the Hugging Face repo.
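The last option can be sketched in a few lines of Python. The index file maps each weight name to the shard that stores it under the `weight_map` key, so grouping the names by prefix gives a quick outline of the model structure. The helper `summarize_weight_names` and the sample weight names below are hypothetical.

```python
import json
from collections import Counter


def summarize_weight_names(index_text, depth=2):
    """Count weight names by their first `depth` dot-separated components."""
    index = json.loads(index_text)
    names = index["weight_map"].keys()
    return Counter(".".join(name.split(".")[:depth]) for name in names)


# Hypothetical index contents, for illustration only.
example_index = json.dumps(
    {
        "weight_map": {
            "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
            "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
            "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
            "model.norm.weight": "model-00002-of-00002.safetensors",
        }
    }
)

for prefix, count in sorted(summarize_weight_names(example_index).items()):
    print(prefix, count)
```

Raising `depth` shows the names inside each layer, such as the attention projections.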
To add LoRA support, edit
[`mlx_lm/tuner/utils.py`](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/tuner/utils.py#L27-L60).


@@ -1,6 +1,5 @@
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union
from typing import Optional, Tuple
import mlx.core as mx
import mlx.nn as nn
@@ -158,7 +157,7 @@ class Model(nn.Module):
        super().__init__()
        self.model_type = args.model_type
        self.model = Starcoder2Model(args)
        # This is for 15B starcoder2 since it doesn't tie word embeddings
        # For 15B starcoder2 and fine-tuned models which don't tie word embeddings
        if not args.tie_word_embeddings:
            self.lm_head = nn.Linear(args.hidden_size, args.vocab_size, bias=False)