Commit Graph

12 Commits

Author SHA1 Message Date
Chime Ogbuji
e56d9015ef
LoRA on all linear transformer block layers (#546)
* Add --lora-all-linear option to apply LoRa to all linear transfer block layers

* Moved to YAML config and added specification of rank & alpha

* nits in conifg, more tests

* nit

* run tests for prs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-12 07:37:40 -07:00
Chime Ogbuji
8c2cf665ed
YAML configuration for mlx_lm.lora (#503)
* Convert mlx_lm.lora to use YAML configuration

* pre-commit run fixes

* Fix loading of config file

* Remove invalid YAML from doc

* Update command-line options and YAML parameter overriding, per feedback in #503

* Minor wording change

* Positional argument

* Moved config to a (-c/--config) flag

* Removed CLI option defaults (since CLI options take precedence and their defaults are in CONFIG_DEFAULTS)

* pre-commit format updates

* Fix handling of CLI option defaults

* Prevent None values of unspecified CLI options from overwriting values from CONFIG_DEFAULTS

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-08 07:57:52 -08:00
Anchen
13794a05da
chore(mlx-lm): add adapter support in generate.py (#494)
* chore(mlx-lm): add adapter support in generate.py

* chore: remove generate from lora.py and raise error to let user use mlx_lm.generate instead
2024-02-28 07:49:25 -08:00
Madroid Ma
e5dfef5d9a
LoRA: Extract the run function for easy use in scripts file (#482)
* LoRA: Extract the run_lora function for easy use in scripts

* LoRA: run_lora function adds a TrainingCallback pass.

* LoRA: change run_lora to run
2024-02-26 19:35:04 -08:00
Ivan Fioravanti
b05907c87e
Change argument name in lora.py (#453)
The argument name "--max_seq_length" was updated to "--max-seq-length" in the code to maintain a consistent naming convention across the program.
2024-02-18 06:04:49 -08:00
Madroid Ma
0ba466369f
LoRA: add training callbacks (#414)
* LoRA: add training callbacks

* LoRA: add trained tokens print & callback
2024-02-16 06:04:57 -08:00
Awni Hannun
d4666615bb
Lazy import + refactor Lora layer addition (#426)
* lazy model import in mlx_lm

* change lora loading

* fix olmo lora

* remove a bunch of unused stuff from plamo

* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00
Awni Hannun
aa7447efa2
Olmo in MLX LM (#415)
* run olmo

* format
2024-02-05 21:13:49 -08:00
Ivan Fioravanti
7fbca214b1
Add max sequence length argument in lora.py (#408)
A new argument "--max_seq_length" has been added to the command-line parser and passed as a parameter to the main function of the lora.py script. This allows users to specify and control the maximum sequence length during training.
2024-02-04 12:28:21 -08:00
Madroid Ma
ba3a9355d1
LoRA: Remove unnecessary model type judgments (#388)
* LoRA: Remove unnecessary model type judgments

1. Supported models are already checked in the load_model function in utils, no need to repeat the check in lora
2. The checks in lora are not synchronized with those in utils

* LoRA: add LoRA supported models in mlx_lm utils
2024-01-31 11:55:27 -08:00
Anchen
b1dec281b3
feat(mlx-lm): add lora hypeparameters in lora layer (#366)
* feat(mlx-lm): add lora hypeparameters in lora layer

* chore: address comments
2024-01-24 08:11:25 -08:00
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00