Commit Graph

20 Commits

Author SHA1 Message Date
Awni Hannun
2bd64b78cf
Save lora config (#636)
* lora config

* comments

* version bump
2024-04-02 13:52:53 -07:00
Chime Ogbuji
f6283ef7ce
Configurable LR schedulers (#604)
* Initial config handler and test

* Added means to run from CLI

* Update lora config loading and tests

* Constrain scheduler config (warmup and minimum LR) for each kind

* Update reference to moved schedule_config module

* Minor fix

* Fix typos

* Moved build_schedule and tests

* nits in schedule config

* flake

* fix path

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-29 13:41:10 -07:00
Ivan Fioravanti
d2a99172a6
Add dropout parameter to lora configuration (#599)
* Add dropout parameter to lora configuration

A dropout parameter has been added to the lora configuration settings in lora_config.yaml. The LoRALinear class in utils.py has been updated to take this new parameter. Additionally, a AttributeError: 'types.SimpleNamespace' object has no attribute 'prompt' related to `args.prompt` has been removed from lora.py.

* Update lora_config.yaml

Set dropout to 0.0 in the sample config file

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-20 08:44:40 -07:00
Anchen
949f63f309
chore(mlx-lm): fix print_trainable_parameters for quant models (#581)
* chore(mlx-lm): fix print_trainable_parameters for quant models

* chore: clean up

* refactor: use layer type to check quant bits

* chore: address comment
2024-03-20 08:41:03 -07:00
madroid
b0bcd86a40
Support for OpenAI’s fine-tuning dataset format (#548)
* LoRA: move load_dataset to tuner/datasets.py file

* LoRA: support OpenAI chat format datasets

see https://platform.openai.com/docs/guides/fine-tuning/example-format

* LoRA: support OpenAI completion format datasets

* LoRA: formatting dataset timing to reduce memory footprint

* Refactor dataset item access in PromptCompletionDataset

* Update mlx_lm/LORA.md

* Update mlx_lm/LORA.md

* check Unsupported data format

* add tests, fine-tune doc

* add tests, fine-tune doc

* add jinja2 for chat template

* nits in readme

* nits in readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-19 16:45:46 -07:00
Awni Hannun
e4b19bb9e1
Make attention faster for a some models (#574)
* make attention faster for a couple models

* remove unused generation flags

* add comment on lora

* include text files as well
2024-03-14 21:35:54 -07:00
madroid
485180ae91
LoRA: some minor optimizations (#573)
* init training_args in training scope

* Add trainable parameters percentage
2024-03-13 20:26:30 -07:00
Awni Hannun
39084e81c2
Some improvements to LoRA (#528)
* set cache_limit

* remove set cache_limit

* cleanup

* add gradient checkpointing

* fix sort

* mokey patch call for checkpoint

* fix example config
2024-03-12 20:02:03 -07:00
Chime Ogbuji
e56d9015ef
LoRA on all linear transformer block layers (#546)
* Add --lora-all-linear option to apply LoRa to all linear transfer block layers

* Moved to YAML config and added specification of rank & alpha

* nits in conifg, more tests

* nit

* run tests for prs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-12 07:37:40 -07:00
Chime Ogbuji
8c2cf665ed
YAML configuration for mlx_lm.lora (#503)
* Convert mlx_lm.lora to use YAML configuration

* pre-commit run fixes

* Fix loading of config file

* Remove invalid YAML from doc

* Update command-line options and YAML parameter overriding, per feedback in #503

* Minor wording change

* Positional argument

* Moved config to a (-c/--config) flag

* Removed CLI option defaults (since CLI options take precedence and their defaults are in CONFIG_DEFAULTS)

* pre-commit format updates

* Fix handling of CLI option defaults

* Prevent None values of unspecified CLI options from overwriting values from CONFIG_DEFAULTS

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-08 07:57:52 -08:00
Anchen
13794a05da
chore(mlx-lm): add adapter support in generate.py (#494)
* chore(mlx-lm): add adapter support in generate.py

* chore: remove generate from lora.py and raise error to let user use mlx_lm.generate instead
2024-02-28 07:49:25 -08:00
Madroid Ma
e5dfef5d9a
LoRA: Extract the run function for easy use in scripts file (#482)
* LoRA: Extract the run_lora function for easy use in scripts

* LoRA: run_lora function adds a TrainingCallback pass.

* LoRA: change run_lora to run
2024-02-26 19:35:04 -08:00
Ivan Fioravanti
b05907c87e
Change argument name in lora.py (#453)
The argument name "--max_seq_length" was updated to "--max-seq-length" in the code to maintain a consistent naming convention across the program.
2024-02-18 06:04:49 -08:00
Madroid Ma
0ba466369f
LoRA: add training callbacks (#414)
* LoRA: add training callbacks

* LoRA: add trained tokens print & callback
2024-02-16 06:04:57 -08:00
Awni Hannun
d4666615bb
Lazy import + refactor Lora layer addition (#426)
* lazy model import in mlx_lm

* change lora loading

* fix olmo lora

* remove a bunch of unused stuff from plamo

* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00
Awni Hannun
aa7447efa2
Olmo in MLX LM (#415)
* run olmo

* format
2024-02-05 21:13:49 -08:00
Ivan Fioravanti
7fbca214b1
Add max sequence length argument in lora.py (#408)
A new argument "--max_seq_length" has been added to the command-line parser and passed as a parameter to the main function of the lora.py script. This allows users to specify and control the maximum sequence length during training.
2024-02-04 12:28:21 -08:00
Madroid Ma
ba3a9355d1
LoRA: Remove unnecessary model type judgments (#388)
* LoRA: Remove unnecessary model type judgments

1. Supported models are already checked in the load_model function in utils, no need to repeat the check in lora
2. The checks in lora are not synchronized with those in utils

* LoRA: add LoRA supported models in mlx_lm utils
2024-01-31 11:55:27 -08:00
Anchen
b1dec281b3
feat(mlx-lm): add lora hypeparameters in lora layer (#366)
* feat(mlx-lm): add lora hypeparameters in lora layer

* chore: address comments
2024-01-24 08:11:25 -08:00
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00