support dora finetune in mlx-examples/llms/mlx_lm (#779)

* support DoRA fine-tuning

* solve problems in lora.py and tuner/utils.py

* add a use_dora (bool) argument to the adapter-loading functions

* delete all unsupported quantization code and fix all the calculation problems in mlx_lm/tuner/dora.py

* use stop_gradient to prevent gradients from flowing through 'norm' during backpropagation (see the first sketch after this list)

* set DEFAULT_USE_DORA in mlx_lm/generate.py

* add annotations for all the use_dora flags

* support fusing DoRA layers in mlx_lm/fuse.py and fix a bug in to_linear() in mlx_lm/tuner/dora.py (see the fusing sketch after this list)

* simplify the code that judges the type of a fused layer in mlx_lm/fuse.py

* pass use_dora in mlx_lm/fuse.py when calling apply_lora_layers()

* style + nits

* style + nits

* more updates
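
For the stop_gradient bullet above: a minimal sketch of a DoRA-style weight computation, assuming a base weight of shape (out_dims, in_dims) and a learned magnitude of shape (out_dims, 1). The function and parameter names here are illustrative, not the actual mlx_lm/tuner/dora.py API.

```python
import mlx.core as mx

def dora_weight(weight, lora_a, lora_b, magnitude, scale):
    # Hypothetical names: `weight` is the frozen base weight, `lora_a` and
    # `lora_b` are the low-rank factors, `magnitude` is the learned
    # per-output-channel scale with shape (out_dims, 1).
    adapted = weight + scale * (lora_b @ lora_a)
    # Treat the row norms as constants in the backward pass: stop_gradient
    # prevents gradients from flowing through the norm during backpropagation.
    norm = mx.stop_gradient(mx.linalg.norm(adapted, axis=1, keepdims=True))
    # Rescale each row of the adapted weight to the learned magnitude.
    return (magnitude / norm) * adapted
```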
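And for the fusing bullet: a hedged sketch of collapsing such a layer back into a plain nn.Linear, in the spirit of to_linear() in mlx_lm/tuner/dora.py but not its exact code; the function name and signature are assumptions.

```python
import mlx.core as mx
import mlx.nn as nn

def fuse_to_linear(linear, lora_a, lora_b, magnitude, scale):
    # Bake the LoRA delta and the magnitude rescaling into one weight matrix,
    # so inference no longer needs the adapter parameters.
    adapted = linear.weight + scale * (lora_b @ lora_a)
    norm = mx.linalg.norm(adapted, axis=1, keepdims=True)
    fused_weight = (magnitude / norm) * adapted
    out_dims, in_dims = fused_weight.shape
    fused = nn.Linear(in_dims, out_dims, bias="bias" in linear)
    fused.weight = fused_weight
    if "bias" in linear:
        fused.bias = linear.bias
    return fused
```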

---------

Co-authored-by: chenyifei08 <chenyifei08@baidu.com>
Co-authored-by: Awni Hannun <awni@apple.com>
Author: alexC-nonsense4k
Date: 2024-05-16 23:21:26 +08:00
Committed by: GitHub
Parent: 69181e0058
Commit: 42458914c8
7 changed files with 147 additions and 19 deletions

mlx_lm/generate.py

@@ -123,7 +123,9 @@ def main():
         tokenizer_config["eos_token"] = args.eos_token
     model, tokenizer = load(
-        args.model, adapter_path=args.adapter_path, tokenizer_config=tokenizer_config
+        args.model,
+        adapter_path=args.adapter_path,
+        tokenizer_config=tokenizer_config,
     )
     if args.use_default_chat_template: