* Adding full model weight fine-tuning
* Updating the LORA.md and ACKNOWLEDGMENTS.md files.
* removing --use-dora and --full-training and adding --fine-tune-type (see the sketch below)
* some clean up
* reformatting and fixing DoRA training
* updated CONFIG_DEFAULTS
* update config example
* updates in the config example file
* Update LORA.md
* merge and commit
* adding argument for DoRA linear layer
* clean up
* clean up in the example yaml file
* fix
* final fix before sending
* small addition to the md file
* fix for loading the fully trained model by saving all the files and configs correctly
* clean up
* removing the unnecessary files
* changing lora layers back to 16
* removed max file size
* nits
* resolve merge
* some consistency changes
---------
Co-authored-by: Awni Hannun <awni@apple.com>
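A minimal sketch of how the new option might look in the defaults; the key names and values below are illustrative assumptions, not the merged code:

```python
# Hypothetical excerpt of CONFIG_DEFAULTS after the change above; keys and
# values are illustrative only, not copied from mlx_lm.
CONFIG_DEFAULTS = {
    "fine_tune_type": "lora",  # replaces --use-dora / --full-training; one of "lora", "dora", "full"
    "lora_layers": 16,         # "changing lora layers back to 16"
}
```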
* feature: LoRA adapter for Embeddings
* feature: wire in LoRAEmbedding into the tuner. Allow the embedding and Linear layers outside of model.layers to be targeted for fine-tuning
* feature: DoRA adapter for Embeddings
* feature: wire in DoRAEmbedding
* bugfix: ensure self.m is recalculated when the linear layer is changed in DoRALinear.from_linear (see the sketch below)
* refactor: prefer from_base over from_linear or from_embedding. prefer fuse over to_linear or to_embedding
* cleanup: remove unused imports in test_dora.py
* refactor: remove unnecessary non_layer_modules
* cleanup: remove wrong comments for LoRA embedding dropout. remove unnecessary parens in DoRA embedding dropout
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
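A rough sketch of the self.m bugfix above: in a DoRA layer the magnitude vector `m` caches the per-row norm of the base weight, so it must be recomputed whenever the underlying linear layer is swapped. The class and method names here are simplified stand-ins, not the actual mlx_lm implementation.

```python
import mlx.core as mx
import mlx.nn as nn

class SimplifiedDoRALinear(nn.Module):
    """Illustrative only: caches m = ||W_i|| and refreshes it with the base layer."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.set_linear(linear)

    def set_linear(self, linear: nn.Linear):
        self.linear = linear
        # Recompute the magnitude vector from the new weight; reusing a stale
        # self.m here is the kind of bug the commit above fixes.
        self.m = mx.linalg.norm(linear.weight, axis=1)
```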
* support DoRA fine-tuning
* fix issues in lora.py and tuner/utils.py
* add a use_dora (bool) argument to the adapter-loading functions
* delete all unsupported quantization code and fix the calculation issues in mlx_lm/tuner/dora.py
* Using stop_gradient to prevent gradients from flowing through 'norm' during backpropagation (see the sketch below)
* set DEFAULT_USE_DORA in mlx_lm/generate.py
* add annotations for all the use_dora arguments
* support fusing DoRA layers in mlx_lm/fuse.py and fix a bug in to_linear() in mlx_lm/tuner/dora.py
* simplify the code for checking the type of a fused layer in mlx_lm/fuse.py
* pass use_dora in mlx_lm/fuse.py when calling apply_lora_layers()
* style + nits
* style + nits
* more updates
---------
Co-authored-by: chenyifei08 <chenyifei08@baidu.com>
Co-authored-by: Awni Hannun <awni@apple.com>
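A hedged sketch of the stop_gradient change noted above: the norm of the adapted weight is treated as a constant during backpropagation, so gradients reach the magnitude vector and the low-rank update but not the norm computation itself. The function below is illustrative, not the mlx_lm/tuner/dora.py code.

```python
import mlx.core as mx

def dora_scaled_weight(adapted_weight: mx.array, m: mx.array) -> mx.array:
    # Treat the per-row norm as a constant: gradients do not flow through it.
    norm = mx.stop_gradient(mx.linalg.norm(adapted_weight, axis=1))
    # Rescale each row so its magnitude is controlled by the learned vector m.
    return (m / norm)[:, None] * adapted_weight
```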
* wip
* wip
* feat: convert mlx model to gguf f16
* chore: convert norm layers to float32 to avoid overflow issues (see the sketch below)
* chore: add support for mixtral
* chore: clean up
* chore: remove unused import statement
* chore: clean up weight name mapping
* version and readme
* actual version bump
---------
Co-authored-by: Awni Hannun <awni@apple.com>
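A small sketch of the dtype handling described above for the GGUF f16 export: norm weights are kept in float32 to avoid overflow while everything else is written as f16. The helper name and the name-matching rule are assumptions, not the actual conversion code.

```python
import mlx.core as mx

def cast_for_gguf(name: str, weight: mx.array) -> mx.array:
    # Keep norm layers in float32 (the overflow-avoidance step above);
    # cast all other weights to f16 for the GGUF export.
    if "norm" in name:
        return weight.astype(mx.float32)
    return weight.astype(mx.float16)
```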
* feat(mlx-lm): add de-quantization for fuse (see the sketch below)
* chore: disable quantization in to_linear when de-quantization is enabled
* chore: add better error handling for a missing adapter file
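A rough sketch of the two changes above: de-quantizing a layer's weight before fusing, and failing with a clear message when the adapter file is missing. The helper names are hypothetical; only mx.dequantize and mx.load are real mlx calls.

```python
from pathlib import Path
import mlx.core as mx

def dequantize_weight(w, scales, biases, group_size=64, bits=4):
    # Recover a full-precision weight from its quantized representation
    # so adapters can be fused into an unquantized layer.
    return mx.dequantize(w, scales, biases, group_size, bits)

def load_adapter_weights(adapter_file: str):
    # Better error handling: report a missing adapter file explicitly
    # instead of surfacing a low-level load error.
    path = Path(adapter_file)
    if not path.is_file():
        raise FileNotFoundError(f"Adapter file not found at {path}")
    return mx.load(str(path))
```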