Commit Graph

21 Commits

Author SHA1 Message Date
Awni Hannun
6f0a69e682
fix lora for openelm (#773) 2024-05-10 09:51:41 -07:00
Anchen
f30413b63c
chore(mlx-lm): fix the number of validation batches configuration. (#752)
* chore: fix number of validation batches

* clean up

* address comment
2024-05-04 06:52:42 -07:00
Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API (#680)
* more async eval

* quantize embedding / update quantize api

* more updates for quantize

* update for quantize embeddings

* update sd quant API

* update sdxl quants

* error for datasets < batch_size

* async

* fix config loading

* fix quant

* fix tests

* fix req

* remove lm head if tie weights is true

* fix test
2024-04-18 18:16:10 -07:00
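
Two items in the commit above are small guard rails rather than the quantization itself: erroring out when a dataset is smaller than the batch size, and dropping the separate lm_head weights when the embedding is tied. A framework-agnostic sketch of what such checks could look like (all names here are illustrative, not the actual mlx-lm code):

    # Hypothetical guard rails, not the actual mlx-lm implementation.
    def validate_dataset(dataset, batch_size: int) -> None:
        # A dataset smaller than one batch can never fill an iteration,
        # so fail early with an actionable message.
        if len(dataset) < batch_size:
            raise ValueError(
                f"Dataset has {len(dataset)} examples but batch_size is "
                f"{batch_size}; use a larger dataset or a smaller batch size."
            )

    def strip_tied_lm_head(weights: dict, tie_word_embeddings: bool) -> dict:
        # When input and output embeddings are tied there is no separate
        # lm_head to save, so drop it from the exported weights.
        if tie_word_embeddings:
            weights = {k: v for k, v in weights.items() if not k.startswith("lm_head")}
        return weights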
dmdaksh
7d7e236061
- Removed unused Python imports (#683)
  - bert/model.py:10: tree_unflatten
  - bert/model.py:2: dataclass
  - bert/model.py:8: numpy
  - cifar/resnet.py:6: Any
  - clip/model.py:15: tree_flatten
  - clip/model.py:9: Union
  - gcn/main.py:8: download_cora
  - gcn/main.py:9: cross_entropy
  - llms/gguf_llm/models.py:12: tree_flatten, tree_unflatten
  - llms/gguf_llm/models.py:9: numpy
  - llms/mixtral/mixtral.py:12: tree_map
  - llms/mlx_lm/models/dbrx.py:2: Dict, Union
  - llms/mlx_lm/tuner/trainer.py:5: partial
  - llms/speculative_decoding/decoder.py:1: dataclass, field
  - llms/speculative_decoding/decoder.py:2: Optional
  - llms/speculative_decoding/decoder.py:5: mlx.nn
  - llms/speculative_decoding/decoder.py:6: numpy
  - llms/speculative_decoding/main.py:2: glob
  - llms/speculative_decoding/main.py:3: json
  - llms/speculative_decoding/main.py:5: Path
  - llms/speculative_decoding/main.py:8: mlx.nn
  - llms/speculative_decoding/model.py:6: tree_unflatten
  - llms/speculative_decoding/model.py:7: AutoTokenizer
  - llms/tests/test_lora.py:13: yaml_loader
  - lora/lora.py:14: tree_unflatten
  - lora/models.py:11: numpy
  - lora/models.py:3: glob
  - speechcommands/kwt.py:1: Any
  - speechcommands/main.py:7: mlx.data
  - stable_diffusion/stable_diffusion/model_io.py:4: partial
  - whisper/benchmark.py:5: sys
  - whisper/test.py:5: subprocess
  - whisper/whisper/audio.py:6: Optional
  - whisper/whisper/decoding.py:8: mlx.nn
2024-04-16 07:50:32 -07:00
Awni Hannun
2bd64b78cf
Save lora config (#636)
* lora config

* comments

* version bump
2024-04-02 13:52:53 -07:00
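
The "lora config" change above amounts to persisting the tuning configuration next to the adapter weights so a run can be reproduced later. A minimal sketch of that idea, assuming a JSON file named adapter_config.json (the file name and layout are assumptions, not the exact mlx-lm format):

    import json
    from pathlib import Path

    # Hypothetical helper: write the LoRA/training configuration alongside
    # the adapter weights so the run can be reproduced later.
    def save_lora_config(config: dict, adapter_dir: str) -> None:
        path = Path(adapter_dir) / "adapter_config.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "w") as f:
            json.dump(config, f, indent=2)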
Anchen
494cdf8e96
chore: fix lora for moe model (#608) 2024-03-23 07:22:11 -07:00
madroid
39d5ca6427
LoRA: report last train info (#595) 2024-03-19 17:29:50 -07:00
madroid
d4e1de1d5b
add peak_memory info to training callback (#572) 2024-03-13 20:17:10 -07:00
Awni Hannun
39084e81c2
Some improvements to LoRA (#528)
* set cache_limit

* remove set cache_limit

* cleanup

* add gradient checkpointing

* fix sort

* monkey patch call for checkpoint

* fix example config
2024-03-12 20:02:03 -07:00
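
The "add gradient checkpointing" and "monkey patch call for checkpoint" items refer to wrapping a layer's call so activations can be recomputed in the backward pass instead of stored. A rough illustration of the monkey-patch pattern, with a stand-in checkpoint wrapper (not the actual mlx implementation):

    # Stand-in for a real checkpoint transform: a genuine implementation would
    # drop intermediate activations and recompute them during backprop; this
    # placeholder simply calls through.
    def checkpoint(fn):
        def wrapped(*args, **kwargs):
            return fn(*args, **kwargs)
        return wrapped

    # Monkey-patch a layer class so every forward call goes through the
    # checkpoint wrapper (illustrative, not the mlx-lm code).
    def enable_gradient_checkpointing(layer) -> None:
        layer_class = type(layer)
        if not getattr(layer_class, "_checkpointed", False):
            original_call = layer_class.__call__
            layer_class.__call__ = checkpoint(original_call)
            layer_class._checkpointed = True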
Chime Ogbuji
e56d9015ef
LoRA on all linear transformer block layers (#546)
* Add --lora-all-linear option to apply LoRA to all linear transformer block layers

* Moved to YAML config and added specification of rank & alpha

* nits in config, more tests

* nit

* run tests for prs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-12 07:37:40 -07:00
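
The commit above adds an option to adapt every linear projection in a transformer block rather than only the default attention projections, with rank and alpha supplied through a YAML config. A sketch of how such a configuration and layer-selection predicate might look (keys and defaults are assumptions, not the exact mlx-lm schema):

    # Hypothetical configuration mirroring the options described above.
    lora_config = {
        "lora_layers": 16,            # number of transformer blocks to adapt
        "lora_parameters": {
            "rank": 8,                # low-rank dimension of the adapters
            "alpha": 16,              # scaling applied to the low-rank update
        },
        "lora_all_linear": True,      # adapt every linear layer, not just q/v
    }

    def should_adapt(module_name: str, all_linear: bool) -> bool:
        # By default only the attention query/value projections get adapters;
        # with the all-linear switch enabled, any linear projection qualifies.
        default_targets = ("q_proj", "v_proj")
        return all_linear or module_name.endswith(default_targets)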
Madroid Ma
f03c8a7b44
LoRA: adapter file supports path information (#505)
* LoRA: adapter file supports path information

* fix pre-commit lint

* from os.path to pathlib.Path

* Update llms/mlx_lm/tuner/trainer.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* rename check_checkpoints_path to checkpoints_path

* fix pre-commit lint

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-02-29 22:20:49 -08:00
Awni Hannun
ab9172baac
Gemma support (#474)
* gemma support

* format

* lora support for gemma
2024-02-21 08:47:13 -08:00
Madroid Ma
8eee4399f4
LoRA: Add printing and callbacks for learning rate during training (#457)
* LoRA: Refactor TrainingCallback to enhance flexibility and extensibility

This commit refactors the TrainingCallback class to accept a dictionary parameter for both on_train_loss_report and on_val_loss_report methods. By switching from multiple parameters to a single dict parameter, this change significantly improves the class's flexibility and makes it easier to extend with new training or validation metrics in the future without altering the method signatures. This approach simplifies the addition of new information to be logged or processed and aligns with best practices for scalable and maintainable code design.

* LoRA: Add printing and callbacks for learning rate during training
2024-02-20 13:07:21 -08:00
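
The refactor described above turns the callback hooks into single-dict interfaces so new metrics (such as the learning rate) can be reported without changing method signatures. A minimal sketch of that shape, with assumed field names:

    # Sketch of a dict-based callback interface; the keys shown are assumptions.
    class TrainingCallback:
        def on_train_loss_report(self, info: dict) -> None:
            # e.g. {"iteration": 100, "train_loss": 1.23, "learning_rate": 1e-5}
            pass

        def on_val_loss_report(self, info: dict) -> None:
            # e.g. {"iteration": 100, "val_loss": 1.31, "val_time": 12.0}
            pass

    class PrintCallback(TrainingCallback):
        # Adding a new metric only requires putting another key in the dict.
        def on_train_loss_report(self, info: dict) -> None:
            print(", ".join(f"{k}={v}" for k, v in info.items()))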
Awni Hannun
e4d5630698
Basic CircleCI (#449)
* basic style checks for circleci

* format

* fix config
2024-02-16 22:13:55 -08:00
Madroid Ma
0ba466369f
LoRA: add training callbacks (#414)
* LoRA: add training callbacks

* LoRA: add trained tokens print & callback
2024-02-16 06:04:57 -08:00
Chime Ogbuji
e446598f62
Passing parameterized loss and batching to trainer (#391) 2024-02-13 07:03:25 -08:00
Madroid Ma
954aa50c54
LoRA: Improve validation error for LoRA layer count exceeding model layers (#427)
* LoRA: Improve validation error for LoRA layer count exceeding model layers

This commit enhances the error handling when the specified LoRA layer count exceeds the total number of layers in the model. It clarifies the error message to provide actionable feedback for users, guiding them to adjust their input parameters accordingly.

* format + nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-02-13 06:56:27 -08:00
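
The improved validation amounts to checking the requested LoRA layer count against the model's depth and failing with an actionable message. An illustrative version of such a check (not the exact mlx-lm code):

    def check_lora_layers(requested: int, model_layers: int) -> None:
        # Fail early with guidance instead of a cryptic indexing error.
        if requested > model_layers:
            raise ValueError(
                f"Requested {requested} LoRA layers, but the model only has "
                f"{model_layers}; reduce the LoRA layer count to at most "
                f"{model_layers}."
            )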
Ivan Fioravanti
4576946151
Add checkpoints directory for adapter weights (#431)
* Add checkpoints directory for adapter weights

The code was modified to create a checkpoints directory if it doesn't exist yet. Adapter weights are now saved to this checkpoints directory during the training iterations.
Corrected the indentation of the save-adapter-weights code because it was nested under the "if eval" block

* Fixing a blank added by mistake
2024-02-12 10:50:05 -08:00
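
The change above creates a checkpoints directory on demand and writes intermediate adapter weights into it during training. A small sketch of the path handling, assuming a "<iteration>_adapters" naming scheme (the scheme and extension are assumptions):

    from pathlib import Path

    def checkpoint_path(adapter_file: str, iteration: int) -> Path:
        # Place periodic snapshots in a checkpoints/ directory next to the
        # final adapter file, creating the directory on first use.
        checkpoints_dir = Path(adapter_file).parent / "checkpoints"
        checkpoints_dir.mkdir(parents=True, exist_ok=True)
        return checkpoints_dir / f"{iteration}_adapters.npz"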
Nripesh Niketan
f1ef378a58
Feat: update pre-commit rev (#432) 2024-02-11 07:23:27 -08:00
Anchen
f51e98fcf1
chore(mlx-lm): truncate the input sentence to max seq len in lora iterate_batches (#373)
* chore(mlx-lm): pass max seq len to evaluate in training loop

* chore: make sure the batch seq does not exceed max len

* chore: update comment

* chore: add warning before truncating input
2024-01-25 12:38:04 -08:00
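
The truncation change caps each training sequence at the configured maximum length and warns when tokens are dropped. A minimal sketch, assuming token sequences are plain lists (signature and message are illustrative):

    import warnings

    def truncate_to_max_seq_len(tokens: list, max_seq_length: int) -> list:
        # Warn before silently dropping tokens beyond the configured limit.
        if len(tokens) > max_seq_length:
            warnings.warn(
                f"Sequence of length {len(tokens)} exceeds max_seq_length="
                f"{max_seq_length} and will be truncated."
            )
            tokens = tokens[:max_seq_length]
        return tokens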
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00