Commit Graph

68 Commits

Author SHA1 Message Date
Madroid Ma
f03c8a7b44 LoRA: adapter file Support path information (#505)
* LoRA: adapter file Support path information

* fix pre-commit lint

* from os.path to pathlib.Path

* Update llms/mlx_lm/tuner/trainer.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* rename check_checkpoints_path to checkpoints_path

* fix pre-commit lint

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-02-29 22:20:49 -08:00
Awni Hannun
ab9172baac Gemma support (#474)
* gemma support

* format

* lora support for gemma
2024-02-21 08:47:13 -08:00
Madroid Ma
8eee4399f4 LoRA: Add printing and callbacks for learning rate during training (#457)
* LoRA:Refactor TrainingCallback to enhance flexibility and extensibility

This commit refactors the TrainingCallback class to accept a dictionary parameter for both on_train_loss_report and on_val_loss_report methods. By switching from multiple parameters to a single dict parameter, this change significantly improves the class's flexibility and makes it easier to extend with new training or validation metrics in the future without altering the method signatures. This approach simplifies the addition of new information to be logged or processed and aligns with best practices for scalable and maintainable code design.

* LoRA: Add printing and callbacks for learning rate during training
2024-02-20 13:07:21 -08:00
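A minimal sketch of the callback shape described in 8eee4399f4, for illustration only. The class name TrainingCallback and the method names on_train_loss_report and on_val_loss_report come from the commit message; the parameter name `info`, the dictionary keys, and the RecordingCallback subclass are hypothetical and not taken from the repository.

```python
class TrainingCallback:
    """Base callback: each hook receives one dict of metrics."""

    def on_train_loss_report(self, info: dict) -> None:
        pass

    def on_val_loss_report(self, info: dict) -> None:
        pass


class RecordingCallback(TrainingCallback):
    """Hypothetical subclass that stores every report it receives."""

    def __init__(self):
        self.train_reports = []
        self.val_reports = []

    def on_train_loss_report(self, info: dict) -> None:
        # Copy the dict so later mutation by the caller does not alter the record.
        self.train_reports.append(dict(info))

    def on_val_loss_report(self, info: dict) -> None:
        self.val_reports.append(dict(info))


callback = RecordingCallback()
# The keys below are assumptions; a new metric such as the learning rate
# is added as a new key, with no change to the method signature.
callback.on_train_loss_report(
    {"iteration": 10, "train_loss": 2.31, "learning_rate": 1e-5}
)
callback.on_val_loss_report({"iteration": 10, "val_loss": 2.40})
```

Passing a single dict means a subclass that ignores unknown keys keeps working when further metrics are reported.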
Awni Hannun
e4d5630698 Basic CircleCI (#449)
* basic style checks for circleci

* format

* fix config
2024-02-16 22:13:55 -08:00
Madroid Ma
0ba466369f LoRA: add training callbacks (#414)
* LoRA: add training callbacks

* LoRA: add trained tokens print & callback
2024-02-16 06:04:57 -08:00
Madroid Ma
726b1ddec0 fix: check LoRA layers number error (#446) 2024-02-16 06:03:33 -08:00
Chime Ogbuji
e446598f62 Passing parameterized loss and batching to trainer (#391) 2024-02-13 07:03:25 -08:00
Madroid Ma
954aa50c54 LoRA: Improve validation error for LoRA layer count exceeding model layer (#427)
* LoRA: Improve validation error for LoRA layer count exceeding model layer

This commit enhances the error handling when the specified LoRA layer count exceeds the total number of layers in the model. It clarifies the error message to provide actionable feedback for users, guiding them to adjust their input parameters accordingly.

* format + nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-02-13 06:56:27 -08:00
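The validation described in 954aa50c54 might look roughly like the sketch below. The function name check_lora_layers, its parameters, and the wording of the message are hypothetical; the commit message only states that the error should tell users how to adjust their input.

```python
def check_lora_layers(num_lora_layers: int, num_model_layers: int) -> None:
    """Raise an actionable error if more LoRA layers are requested than exist."""
    if num_lora_layers > num_model_layers:
        raise ValueError(
            f"Requested {num_lora_layers} LoRA layers, but the model has only "
            f"{num_model_layers} layers. Use a LoRA layer count of at most "
            f"{num_model_layers}."
        )


# A valid request passes silently.
check_lora_layers(num_lora_layers=4, num_model_layers=16)
```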
Awni Hannun
d4666615bb Lazy import + refactor Lora layer addition (#426)
* lazy model import in mlx_lm

* change lora loading

* fix olmo lora

* remove a bunch of unused stuff from plamo

* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00
Ivan Fioravanti
4576946151 Add checkpoints directory for adapter weights (#431)
* Add checkpoints directory for adapter weights

The code was modified to create a checkpoints directory if it doesn't exist yet. Adapter weights are now saved to this checkpoints directory during the training iterations.
Corrected indentation of Save adapter weights code because it was part of "if eval"

* Fixing a blank added by mistake
2024-02-12 10:50:05 -08:00
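The behaviour described in 4576946151 (create the checkpoints directory if it is missing, then save adapter weights into it during training) can be sketched as follows. The helper name checkpoint_path, its parameters, and the file naming scheme are assumptions for illustration; the actual saving call in the repository is not shown here.

```python
from pathlib import Path


def checkpoint_path(
    adapter_file: str, iteration: int, checkpoints_dir: str = "checkpoints"
) -> Path:
    """Return where to save adapter weights for this iteration.

    Creates the checkpoints directory on first use. The
    "<iteration>_<file name>" scheme is a hypothetical naming choice.
    """
    directory = Path(checkpoints_dir)
    # exist_ok=True makes repeated calls safe once the directory exists.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{iteration}_{Path(adapter_file).name}"
```

A training loop would call this helper at each save interval and pass the returned path to whatever function writes the weights.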
Nripesh Niketan
f1ef378a58 Feat: update pre-commit rev (#432) 2024-02-11 07:23:27 -08:00
Anchen
0a49ba0697 fix(mlx-lm): apply lora layer doesn't update the lora weights (#396) 2024-01-31 11:51:26 -08:00
Anchen
614de6652f chore(mlx-lm): add reset lora layers helper (#377)
* chore(mlx-lm): add reset lora layers helper

* chore: rename the func

* chore: update docstring

* Update llms/mlx_lm/tuner/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-01-29 20:54:49 -08:00
Anchen
854ad8747a feat(mlx-lm): add de-quant for fuse.py (#365)
* feat(mlx-lm): add de-quant for fuse

* chore: disable quant in to linear when de-quant enabled

* chore: add better error handling for adapter file not found
2024-01-25 18:59:32 -08:00
Anchen
f51e98fcf1 chore(mlx-lm): truncate the input sentence to max seq len in lora iterate_batches (#373)
* chore(mlx-lm): pass max seq len to evaluate in training loop

* chore: make sure the batch seq not exceed max len

* chore: update comment

* chore: add warning before truncate input
2024-01-25 12:38:04 -08:00
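The truncation step described in f51e98fcf1 can be illustrated with the sketch below, which warns before cutting sequences down to the maximum length. The function name truncate_batch and the use of plain Python lists of token ids are assumptions; this is not the iterate_batches code from the repository.

```python
import warnings


def truncate_batch(batch, max_seq_length):
    """Truncate each token sequence in the batch to max_seq_length."""
    longest = max((len(seq) for seq in batch), default=0)
    if longest > max_seq_length:
        # Warn first, so silently shortened inputs are not a surprise.
        warnings.warn(
            f"Longest sequence has {longest} tokens and will be truncated "
            f"to {max_seq_length} tokens."
        )
    return [seq[:max_seq_length] for seq in batch]
```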
Anchen
b1dec281b3 feat(mlx-lm): add lora hyperparameters in lora layer (#366)
* feat(mlx-lm): add lora hyperparameters in lora layer

* chore: address comments
2024-01-24 08:11:25 -08:00
Anchen
ab91ac1075 chore(mlx-lm): add load model with adapter and fix bug in sample (#360)
* chore: add load model with adapter support and fix bug in sample

* chore: ignore temp during calculating prob in sample
2024-01-23 19:47:39 -08:00
Anchen
362e88a744 feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00