chore(mlx-lm): add adapter support in generate.py (#494)

* chore(mlx-lm): add adapter support in generate.py
* chore: remove generate from lora.py and raise error to let user use mlx_lm.generate instead
2025-12-16 02:08:55 +08:00 · 2024-02-29 02:49:25 +11:00
parent ab0f1dd1b6
commit 13794a05da
3 changed files with 22 additions and 11 deletions
--- a/llms/mlx_lm/LORA.md
+++ b/llms/mlx_lm/LORA.md
@@ -72,6 +72,17 @@ python -m mlx_lm.lora \
    --test
 ```

+### Generate
+
+For generation use mlx_lm.generate:
+
+```shell
+python -m mlx_lm.generate \
+    --model <path_to_model> \
+    --adapter-file <path_to_adapters.npz> \
+    --prompt "<your_model_prompt>"
+```
+
 ## Fuse and Upload

 You can generate a model fused with the low-rank adapters using the