Add tips on porting LLMs from HuggingFace (#523)

* Add tips on porting LLMs from HuggingFace

* Add CONTRIBUTING.md  to mlx-examples-llms

* Refactor imports and update comment in starcoder2.py

* Update llms/mlx_lm/models/starcoder2.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* nits

* nits

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
Muhtasham Oblokulov 2024-03-06 02:43:15 +01:00 committed by GitHub
parent 3fdf85e79d
commit 5de7c2ac33
GPG Key ID: B5690EEEBB952194
3 changed files with 42 additions and 5 deletions


@@ -14,11 +14,11 @@ possible.
 You can also run the formatters manually as follows:
-```
+```bash
 clang-format -i file.cpp
 ```
-```
+```bash
 black file.py
 ```

llms/CONTRIBUTING.md Normal file

@@ -0,0 +1,38 @@
# Contributing to MLX LM
Below are some tips to port LLMs available on Hugging Face to MLX.
Before starting, check out the [general contribution
guidelines](https://github.com/ml-explore/mlx-examples/blob/main/CONTRIBUTING.md).
Next, from this directory, do an editable install:
```shell
pip install -e .
```
Then check if the model has weights in the
[safetensors](https://huggingface.co/docs/safetensors/index) format. If not,
[follow the instructions](https://huggingface.co/spaces/safetensors/convert) to
convert it.
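As a rough illustration of the check above, the following sketch looks for `.safetensors` files in a local copy of the model repository. The helper name `has_safetensors` and the assumption that the repo has already been downloaded to a local directory are illustrative only and not part of MLX LM.

```python
from pathlib import Path


def has_safetensors(model_dir: str) -> bool:
    """Return True if model_dir contains at least one .safetensors file.

    Hypothetical helper: assumes the Hugging Face repo was already
    downloaded (or cloned) into model_dir.
    """
    return any(Path(model_dir).glob("*.safetensors"))
```

If this returns `False` for your local copy, the weights likely need to be converted first.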
After that, add the model file to the
[`mlx_lm/models`](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm/models)
directory. You can see other examples there. We recommend starting from a model
that is similar to the model you are porting.
Make sure the name of the new model file is the same as the `model_type` in the
`config.json`, for example
[starcoder2](https://huggingface.co/bigcode/starcoder2-7b/blob/main/config.json#L17).
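A minimal sketch of how the file name can be derived, assuming a local copy of the model's `config.json`. The helper name `expected_model_file` is hypothetical and only illustrates the naming rule described above.

```python
import json
from pathlib import Path


def expected_model_file(config_path: str) -> str:
    """Return the model file name implied by `model_type` in config.json.

    Hypothetical helper: it only reads the JSON file and appends ".py".
    """
    config = json.loads(Path(config_path).read_text())
    return f"{config['model_type']}.py"
```

For the StarCoder2 config linked above, where `model_type` is `starcoder2`, this gives `starcoder2.py`.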
To determine the model layer names, we suggest one of the following:
- Refer to the Transformers implementation if you are familiar with the
codebase.
- Load the model weights and check the weight names, which tell you about
the model structure.
- Look at the names of the weights by inspecting `model.safetensors.index.json`
in the Hugging Face repo.
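The last option can be sketched as follows, assuming `model.safetensors.index.json` has been downloaded locally. The index file maps each weight name to the shard that stores it under the `weight_map` key; the helper name `list_weight_names` is hypothetical.

```python
import json
from pathlib import Path


def list_weight_names(index_path: str) -> list[str]:
    """Return the sorted weight names listed in a safetensors index file."""
    index = json.loads(Path(index_path).read_text())
    # "weight_map" maps each weight name to the shard file containing it.
    return sorted(index["weight_map"])
```

The dotted prefixes of the returned names (for example `model.layers.0.…`) generally mirror the module hierarchy that the ported model needs to reproduce.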
To add LoRA support, edit
[`mlx_lm/tuner/utils.py`](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/tuner/utils.py#L27-L60).


@@ -1,6 +1,5 @@
-import math
 from dataclasses import dataclass
-from typing import Dict, Optional, Tuple, Union
+from typing import Optional, Tuple
 import mlx.core as mx
 import mlx.nn as nn
@@ -158,7 +157,7 @@ class Model(nn.Module):
         super().__init__()
         self.model_type = args.model_type
         self.model = Starcoder2Model(args)
-        # This is for 15B starcoder2 since it doesn't tie word embeddings
+        # For 15B starcoder2 and fine-tuned models which don't tie word embeddings
         if not args.tie_word_embeddings:
             self.lm_head = nn.Linear(args.hidden_size, args.vocab_size, bias=False)