Anchen
f51e98fcf1
chore(mlx-lm): truncate the input sentence to max seq len in lora iterate_batches (#373)
* chore(mlx-lm): pass max seq len to evaluate in training loop
* chore: make sure the batch seq does not exceed max len
* chore: update comment
* chore: add warning before truncating input
2024-01-25 12:38:04 -08:00
Anchen
b1dec281b3
feat(mlx-lm): add lora hyperparameters in lora layer (#366)
* feat(mlx-lm): add lora hyperparameters in lora layer
* chore: address comments
2024-01-24 08:11:25 -08:00
Anchen
ab91ac1075
chore(mlx-lm): add load model with adapter and fix bug in sample (#360)
* chore: add load model with adapter support and fix bug in sample
* chore: ignore temp when calculating prob in sample
2024-01-23 19:47:39 -08:00
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00