Added lora support for Phi-2 (#302)

* Added lora support for Phi-2

* Added Phi-2 support in fuse and convert

* format + readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
Author: Yousif
Date: 2024-01-12 13:45:30 -08:00
Committed by: GitHub
Parent: 3ac731dd4f
Commit: 7575125d5d
12 changed files with 564 additions and 25 deletions
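
The core idea behind the change is standard LoRA: freeze a pretrained linear weight `W` and learn a small low-rank update, so the effective weight becomes `W + scale * (A @ B)` with only `A` and `B` trained. Below is a minimal sketch of that idea in MLX; the class name, the fixed `2.0` scale, and the commented-out wiring are illustrative assumptions in the spirit of the repository's `lora.py`, not its exact code. Phi-2's model-specific module layout (fused attention projections) is precisely the detail this commit teaches the scripts about.

```
import math

import mlx.core as mx
import mlx.nn as nn


class LoRALinear(nn.Module):
    """Wrap an nn.Linear with a trainable low-rank update (illustrative sketch)."""

    @staticmethod
    def from_linear(linear: nn.Linear, rank: int = 8):
        out_dims, in_dims = linear.weight.shape
        lora = LoRALinear(in_dims, out_dims, rank)
        lora.linear = linear  # keep the frozen base layer
        return lora

    def __init__(self, in_dims: int, out_dims: int, rank: int = 8):
        super().__init__()
        self.linear = nn.Linear(in_dims, out_dims)
        scale = 1.0 / math.sqrt(in_dims)
        # Only these two small matrices are trained.
        self.lora_a = mx.random.uniform(low=-scale, high=scale, shape=(in_dims, rank))
        self.lora_b = mx.zeros((rank, out_dims))

    def __call__(self, x):
        y = self.linear(x)                    # frozen base projection
        z = (x @ self.lora_a) @ self.lora_b   # low-rank update
        return y + 2.0 * z


# Hypothetical wiring: freeze the model, then swap the attention projections of
# the last few blocks. The attribute names below are placeholders; Phi-2's actual
# module layout differs, which is why lora.py needs model-specific handling.
# model.freeze()
# for block in model.layers[-4:]:
#     block.attn.q_proj = LoRALinear.from_linear(block.attn.q_proj)
#     block.attn.v_proj = LoRALinear.from_linear(block.attn.v_proj)
```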

lora/README.md

@@ -2,7 +2,7 @@
This is an example of using MLX to fine-tune an LLM with low rank adaptation
(LoRA) for a target task.[^lora] The example also supports quantized LoRA
- (QLoRA).[^qlora] The example works with Llama and Mistral style
+ (QLoRA).[^qlora] The example works with Llama, Mistral, and Phi-2 style
models available on Hugging Face.
In this example we'll use the WikiSQL[^wikisql] dataset to train the LLM to
@@ -81,7 +81,7 @@ To fine-tune a model use:
```
python lora.py --model <path_to_model> \
--train \
- --iters 600
+ --iters 600 \
```
If `--model` points to a quantized model, then the training will use QLoRA,
@@ -100,7 +100,7 @@ To compute test set perplexity use:
```
python lora.py --model <path_to_model> \
--adapter-file <path_to_adapters.npz> \
- --test
+ --test \
```
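
The perplexity that `--test` reports is the exponential of the average per-token negative log-likelihood over the test set. A minimal sketch of that definition in MLX, leaving out the batching and loss-masking details the script itself handles:

```
import mlx.core as mx
import mlx.nn as nn


def perplexity(logits, targets):
    # logits: (batch, seq, vocab); targets: (batch, seq) token ids
    nll = nn.losses.cross_entropy(logits, targets)  # per-token negative log-likelihood
    return mx.exp(nll.mean())
```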
### Generate
@@ -114,7 +114,7 @@ python lora.py --model <path_to_model> \
--prompt "table: 1-10015132-16
columns: Player, No., Nationality, Position, Years in Toronto, School/Club Team
Q: What is terrence ross' nationality
- A: "
+ A: " \
```
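
Per the commit message, `fuse.py` also learned about Phi-2. Conceptually, fusing folds the learned low-rank update back into the frozen weight so generation needs no adapter code at all. A sketch of that fold, assuming the hypothetical `LoRALinear` layout and `2.0` scale from the earlier sketch; the real `fuse.py` deals with details this ignores:

```
def fuse_lora(lora):
    # linear.weight has shape (out_dims, in_dims); the update (lora_a @ lora_b)
    # has shape (in_dims, out_dims), so it is transposed before being folded in.
    linear = lora.linear
    linear.weight = linear.weight + 2.0 * (lora.lora_a @ lora.lora_b).T
    return linear
```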
## Results
@@ -211,7 +211,7 @@ python lora.py \
--model mistralai/Mistral-7B-v0.1 \
--train \
--batch-size 1 \
- --lora-layers 4
+ --lora-layers 4 \
```
The above command on an M1 Max with 32 GB runs at about 250 tokens-per-second.
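
The same arithmetic explains why trimming `--lora-layers` keeps memory low: each wrapped projection adds only `rank * (in_dims + out_dims)` trainable weights. A back-of-the-envelope count for Phi-2 (hidden size 2560), using an assumed rank of 8 and query/value projections in the last 4 blocks; the numbers are illustrative, not measurements from the repository:

```
# Back-of-the-envelope count of trainable LoRA weights (illustrative, not measured).
hidden = 2560            # Phi-2 model dimension
rank = 8                 # assumed LoRA rank
wrapped = 2 * 4          # e.g. query and value projections in the last 4 blocks
trainable = wrapped * rank * (hidden + hidden)
print(f"{trainable:,} trainable parameters")  # 327,680, a tiny fraction of Phi-2's 2.7B
```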