Port of phi3small (#794)

* start port of phi3small

* fix phi3

* use block sparsity

* compile activation

* nits in readme / mlx lm version
Author: Awni Hannun
Date: 2024-05-31 12:54:14 -07:00
Committed by: GitHub
Parent: 09aaeac72c
Commit: 81318ad4a8
5 changed files with 326 additions and 8 deletions

@@ -129,7 +129,7 @@ For example, to fuse and upload a model derived from Mistral-7B-v0.1, run:
 ```shell
 mlx_lm.fuse \
     --model mistralai/Mistral-7B-v0.1 \
-    --upload-repo mlx-community/my-4bit-lora-mistral \
+    --upload-repo mlx-community/my-lora-mistral-7b \
     --hf-path mistralai/Mistral-7B-v0.1
 ```
@@ -249,7 +249,7 @@ of memory. Here are some tips to reduce memory use should you need to do so:
 For example, for a machine with 32 GB the following should run reasonably fast:
 ```
-python lora.py \
+mlx_lm.lora \
     --model mistralai/Mistral-7B-v0.1 \
     --train \
     --batch-size 1 \