* Generalize prompt_feature and completion_feature for use in local datasets to support a wider range of training dataset formats (see the example after this block).
* Persist configured prompt/completion key
* rebase + nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
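As referenced above, the prompt/completion feature names become configurable for local datasets. The sketch below is a hypothetical illustration of mapping custom field names onto prompt/completion pairs; the helper name and its defaults are assumptions, not the actual mlx_lm API.

```python
# Hypothetical sketch of configurable prompt/completion feature names for a
# local JSONL dataset; the helper and its defaults are illustrative only.
import json


def load_completion_pairs(path, prompt_feature="prompt", completion_feature="completion"):
    """Read a JSONL file and normalize each record to a (prompt, completion) pair."""
    pairs = []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            pairs.append((record[prompt_feature], record[completion_feature]))
    return pairs


# A dataset whose records use "question"/"answer" instead of the defaults:
# pairs = load_completion_pairs("train.jsonl",
#                               prompt_feature="question",
#                               completion_feature="answer")
```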
* Adding full model weights finetuning
* Updating the LORA.md and ACKNOWLEDGMENTS.md files.
* removing --use-dora and --full-training and adding --fine-tune-type (see the sketch after this block)
* some clean up
* reformatting and fixing dora training
* updated CONFIG_DEFAULTS
* update config example
* update in the config example file
* Update LORA.md
* merge and commit
* adding argument for dora linear layer
* clean up
* clean up in the example yaml file
* fix
* final fix before sending
* small addition to the md file
* fix for loading the fully trained model by saving all the files and configs correctly
* clean up
* removing the unnecessary files
* changing lora layers back to 16
* removed max file size
* nits
* resolve merge
* some consistency changes
---------
Co-authored-by: Awni Hannun <awni@apple.com>
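As referenced above, the separate --use-dora and --full-training flags were folded into a single --fine-tune-type option. A minimal sketch of that dispatch, assuming made-up function and parameter names rather than the real mlx_lm interface:

```python
# Illustrative only: a single fine_tune_type value replaces separate boolean
# flags. Function and parameter names are assumptions, not the mlx_lm API.
def configure_fine_tuning(model, fine_tune_type="lora"):
    if fine_tune_type == "full":
        # Full fine-tuning: every weight in the model is trainable.
        model.unfreeze()
    elif fine_tune_type in ("lora", "dora"):
        # Adapter tuning: freeze the base weights; only the injected
        # LoRA/DoRA layers (added elsewhere) are trained.
        model.freeze()
    else:
        raise ValueError(f"Unknown fine-tune type: {fine_tune_type}")
    return model
```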
* LoRA: support fine-tuning tools datasets
* LoRA: Split small function
* LoRA: add tools format to lora docs
* LoRA: pre-commit fix
* Revert "LoRA: pre-commit fix"
This reverts commit b94b7e0fe7.
* Revert "LoRA: Split small function"
This reverts commit 3f6a5f19fd.
* LoRA: remove ToolsDataset
In a JSONL file, not every record is required to include the tools value (see the example records after this block).
* nit in readme
* nit in readme
* nit in readme
---------
Co-authored-by: Awni Hannun <awni@apple.com>
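As the commits above note, tools-style records can be mixed with plain chat records in the same JSONL file. The records below (shown as Python dicts) are a hedged illustration following common chat/tool-calling conventions, not the exact schema from the mlx_lm docs.

```python
# Illustrative JSONL records (shown as Python dicts). Field names follow common
# chat/tool-calling conventions and are assumptions, not the exact mlx_lm schema.
record_with_tools = {
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"},
        {"role": "assistant", "content": '{"name": "get_weather", "arguments": {"city": "Paris"}}'},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# As noted above, a record does not have to carry a "tools" value.
record_without_tools = {
    "messages": [
        {"role": "user", "content": "Say hello."},
        {"role": "assistant", "content": "Hello!"},
    ]
}
```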
* Add hf_dataset configuration for using HF hub-hosted datasets for (Q)LoRA training
* Pre-commit formatting
* Fix YAML config example
* Print DS info
* Include name
* Add hf_dataset parameter default
* Remove TextHFDataset and CompletionsHFDataset in favor of Dataset and CompletionsDataset: add a text_key constructor argument to the former (and change it to accept an in-memory data structure rather than only a JSON file), and add prompt_key and completion_key arguments to the latter, with defaults for backwards compatibility (a simplified sketch follows this block).
* nits
* update docs
---------
Co-authored-by: Awni Hannun <awni@apple.com>
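A simplified stand-in for the constructor-argument change described above (text_key on Dataset, prompt_key/completion_key on CompletionsDataset, with backwards-compatible defaults); the real mlx_lm classes differ in detail.

```python
# Simplified stand-in for the described classes; the actual mlx_lm code differs.
class Dataset:
    """Text records from an in-memory data structure (not just a JSON file)."""

    def __init__(self, data, text_key="text"):
        self._data, self._text_key = data, text_key

    def __getitem__(self, idx):
        return self._data[idx][self._text_key]


class CompletionsDataset:
    """Prompt/completion records with configurable key names."""

    def __init__(self, data, prompt_key="prompt", completion_key="completion"):
        self._data = data
        self._prompt_key, self._completion_key = prompt_key, completion_key

    def __getitem__(self, idx):
        record = self._data[idx]
        return record[self._prompt_key], record[self._completion_key]
```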
* Added support for the MiniCPM architecture
* Added support for the MiniCPM architecture
* Updated utils.py and LORA.md
* Updated utils.py and LORA.md
* Update implementation details for MiniCPM architecture
* Cleaning up
* fixed the missing lm.head layer problem
* Refactor Model class to dynamically handle tied and untied word embeddings
* Quick update
* added a dynamic rope scaling base calculation
* quick fix and clean up
* clean up again
* removed the MiniCPMNorm class as it's not used
* forgot something, sorry
* format
* version bump
---------
Co-authored-by: Awni Hannun <awni@apple.com>
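The tied/untied embedding refactor above lends itself to a short sketch. The class below is a generic pattern written against MLX's nn module, assuming a tie_word_embeddings switch; it is not the actual MiniCPM implementation in mlx_lm.

```python
# Generic sketch (not the mlx_lm MiniCPM code): reuse the embedding matrix as
# the output projection when embeddings are tied, otherwise use a separate head.
import mlx.nn as nn


class Model(nn.Module):
    def __init__(self, vocab_size, dims, tie_word_embeddings=True):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, dims)
        self.tie_word_embeddings = tie_word_embeddings
        if not tie_word_embeddings:
            # Untied: a dedicated lm_head is created (the missing lm_head
            # case the commits above refer to).
            self.lm_head = nn.Linear(dims, vocab_size, bias=False)

    def __call__(self, hidden_states):
        if self.tie_word_embeddings:
            # Tied: project with the transposed embedding weights.
            return self.embed_tokens.as_linear(hidden_states)
        return self.lm_head(hidden_states)
```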
* wip
* wip
* feat: convert mlx model to gguf f16
* chore: convert norm layers to float32 to avoid overflow issues (see the sketch after this block)
* chore: add support for mixtral
* chore: clean up
* chore: remove unused import statement
* chore: clean up weight name mapping
* version and readme
* actual version bump
---------
Co-authored-by: Awni Hannun <awni@apple.com>
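A minimal sketch of the precision handling mentioned above, assuming a simple name-based check for norm weights; the real GGUF export in mlx_lm uses its own weight-name mapping.

```python
# Illustrative only: cast weights for an f16 GGUF-style export while keeping
# norm parameters in float32 to avoid overflow. The name check is an assumption.
import mlx.core as mx


def cast_weights_for_export(weights):
    out = {}
    for name, w in weights.items():
        target = mx.float32 if "norm" in name else mx.float16
        out[name] = w.astype(target)
    return out
```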
* Convert mlx_lm.lora to use YAML configuration
* pre-commit run fixes
* Fix loading of config file
* Remove invalid YAML from doc
* Update command-line options and YAML parameter overriding, per feedback in #503
* Minor wording change
* Positional argument
* Moved config to a (-c/--config) flag
* Removed CLI option defaults (since CLI options take precedence and their defaults are in CONFIG_DEFAULTS)
* pre-commit format updates
* Fix handling of CLI option defaults
* Prevent None values of unspecified CLI options from overwriting values from CONFIG_DEFAULTS
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
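A hedged sketch of the option-precedence behavior described above: CONFIG_DEFAULTS at the bottom, YAML values layered on top, and only explicitly supplied CLI options (argparse leaves the rest as None) taking final precedence. The names and default values are illustrative, not the exact mlx_lm code.

```python
# Illustrative precedence merge: defaults < YAML config < CLI options that the
# user actually provided. Values shown are assumptions, not mlx_lm's defaults.
CONFIG_DEFAULTS = {"lora_layers": 16, "batch_size": 4, "iters": 1000}


def merge_config(cli_args, yaml_config):
    config = dict(CONFIG_DEFAULTS)
    config.update(yaml_config)
    for key, value in vars(cli_args).items():
        # Options the user did not pass default to None and must not
        # overwrite values from the YAML file or CONFIG_DEFAULTS.
        if value is not None:
            config[key] = value
    return config
```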