Add tips on porting LLMs from HuggingFace (#523)

* Add tips on porting LLMs from HuggingFace
* Add CONTRIBUTING.md to mlx-examples-llms
* Refactor imports and update comment in starcoder2.py
* Update llms/mlx_lm/models/starcoder2.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* nits
* nits

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2025-08-09 10:26:38 +08:00 · 2024-03-06 02:43:15 +01:00 · 5de7c2ac33
commit 5de7c2ac33
parent 3fdf85e79d
3 changed files with 42 additions and 5 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -14,11 +14,11 @@ possible.
 
   You can also run the formatters manually as follows:
 
-     ```
+     ```bash
     clang-format -i file.cpp
     ```
 
-     ```
+     ```bash
     black file.py
     ```
 
--- a/llms/CONTRIBUTING.md
+++ b/llms/CONTRIBUTING.md
@@ -0,0 +1,38 @@
+# Contributing to MLX LM 
+
+Below are some tips to port LLMs available on Hugging Face to MLX.
+
+Before starting, check out the [general contribution
+guidelines](https://github.com/ml-explore/mlx-examples/blob/main/CONTRIBUTING.md).
+
+Next, from this directory, do an editable install:
+
+```shell
+pip install -e .
+```
+
+Then check if the model has weights in the
+[safetensors](https://huggingface.co/docs/safetensors/index) format. If not,
+[follow instructions](https://huggingface.co/spaces/safetensors/convert) to
+convert it.
+
+After that, add the model file to the
+[`mlx_lm/models`](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm/models)
+directory. You can see other examples there. We recommend starting from a model
+that is similar to the model you are porting.
+
+Make sure the name of the new model file is the same as the `model_type` in the
+`config.json`, for example
+[starcoder2](https://huggingface.co/bigcode/starcoder2-7b/blob/main/config.json#L17).
+
+To determine the model layer names, we suggest one of the following:
+
+- Refer to the Transformers implementation if you are familiar with the
+  codebase.
+- Load the model weights and check the weight names, which reveal the model
+  structure.
+- Look at the names of the weights by inspecting `model.safetensors.index.json`
+  in the Hugging Face repo.
+
+To add LoRA support, edit
+[`mlx_lm/tuner/utils.py`](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/tuner/utils.py#L27-L60).
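
The third tip above (inspecting `model.safetensors.index.json`) can be sketched with the standard library alone. This is a minimal illustration, not part of the commit: the embedded JSON is a hypothetical excerpt, the tensor names in it are assumed for the example, and `layer_name_patterns` is a helper invented here. It collapses numeric layer indices so that each distinct parameter name shows up once.

```python
import json
from collections import defaultdict

# Hypothetical excerpt of a `model.safetensors.index.json` file.
# Real index files list every tensor in the checkpoint under "weight_map".
INDEX_JSON = """
{
  "metadata": {"total_size": 0},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
    "model.layers.1.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
    "model.norm.weight": "model-00002-of-00002.safetensors"
  }
}
"""


def layer_name_patterns(index):
    """Map each weight-name pattern to how many tensors match it.

    Purely numeric path components (layer indices) are replaced by "*",
    so repeated layers collapse into a single pattern.
    """
    patterns = defaultdict(int)
    for name in index["weight_map"]:
        parts = ["*" if part.isdigit() else part for part in name.split(".")]
        patterns[".".join(parts)] += 1
    return dict(patterns)


patterns = layer_name_patterns(json.loads(INDEX_JSON))
for pattern, count in sorted(patterns.items()):
    print(f"{count:3d}  {pattern}")
```

The resulting patterns give a rough outline of the attribute names the MLX module tree would need, assuming the port loads weights by matching names.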
--- a/llms/mlx_lm/models/starcoder2.py
+++ b/llms/mlx_lm/models/starcoder2.py
@@ -1,6 +1,5 @@
-import math
 from dataclasses import dataclass
-from typing import Dict, Optional, Tuple, Union
+from typing import Optional, Tuple

 import mlx.core as mx
 import mlx.nn as nn
@@ -158,7 +157,7 @@ class Model(nn.Module):
        super().__init__()
        self.model_type = args.model_type
        self.model = Starcoder2Model(args)
-        # This is for 15B starcoder2 since it doesn't tie word embeddings
+        # For 15B starcoder2 and fine-tuned models which don't tie word embeddings
        if not args.tie_word_embeddings:
            self.lm_head = nn.Linear(args.hidden_size, args.vocab_size, bias=False)
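
The `tie_word_embeddings` branch in the hunk above only creates a separate `lm_head` when the embeddings are not tied. What tying means in general can be sketched as follows. This is a toy in plain Python, not the file's actual forward pass (which this diff does not show): nested lists stand in for `mx.array`, and `ToyArgs`, `ToyModel`, and `project` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass
class ToyArgs:
    # Hypothetical stand-in for the model's config arguments.
    hidden_size: int
    vocab_size: int
    tie_word_embeddings: bool = True


def project(hidden: List[float], weight: Matrix) -> List[float]:
    """Compute one logit per vocabulary row: the dot product of hidden with that row."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in weight]


class ToyModel:
    def __init__(self, args: ToyArgs):
        self.args = args
        # Embedding table of shape (vocab_size, hidden_size), filled with
        # arbitrary deterministic values for the example.
        self.embed_tokens: Matrix = [
            [float(i + j) for j in range(args.hidden_size)]
            for i in range(args.vocab_size)
        ]
        # Mirror the branch in the diff: a separate output projection exists
        # only when the embeddings are not tied.
        self.lm_head = None
        if not args.tie_word_embeddings:
            self.lm_head = [
                [0.0] * args.hidden_size for _ in range(args.vocab_size)
            ]

    def logits(self, hidden: List[float]) -> List[float]:
        # Tied: reuse the embedding table as the output projection.
        # Untied: use the separate lm_head weights.
        weight = self.embed_tokens if self.lm_head is None else self.lm_head
        return project(hidden, weight)


tied = ToyModel(ToyArgs(hidden_size=2, vocab_size=3, tie_word_embeddings=True))
untied = ToyModel(ToyArgs(hidden_size=2, vocab_size=3, tie_word_embeddings=False))
print(tied.logits([1.0, 1.0]))
print(untied.logits([1.0, 1.0]))
```

In the tied case no extra parameters are allocated, which is why a checkpoint that does not tie its embeddings needs a separate `lm_head` to receive its output-projection weights.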