mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-06-25 01:41:19 +08:00

Author	SHA1	Message	Date
Awni Hannun	7be292c0c9	Handle longer prompt/generation (#931) * rebase * nits * nit * fix rotating cache with step prefill * update version	2024-08-16 15:28:39 -07:00
Chime Ogbuji	df6bc09d74	Configuration-based use of HF hub-hosted datasets for training (#701) * Add hf_dataset configuration for using HF hub-hosted datasets for (Q)LoRA training * Pre-commit formatting * Fix YAML config example * Print DS info * Include name * Add hf_dataset parameter default * Remove TextHFDataset and CompletionsHFDataset and use Dataset and CompletionsDataset instead, adding a text_key constructor argument to the former (and changing it to work with a provided data structure instead of just from a JSON file), and prompt_key and completion_key arguments to the latter with defaults for backwards compatibility. * nits * update docs --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-06-26 10:20:50 -07:00
Awni Hannun	d8b073e3a7	Add eos token to lora fine-tunes (#818) * add eos token to lora fine-tunes * Comment	2024-06-12 07:44:21 -07:00
madroid	c457a3f88b	LoRA: Extract small function (#614) * LoRA: Extract pre_processing_model function * LoRA: Extract small functions(train_model,evaluate_model) * move test case to test_tuner_utils.py * nits * nits * remove extra param, validate at it 0 * version * fix test --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-06-02 06:38:42 -07:00
Awni Hannun	81318ad4a8	Port of phi3small (#794) * start port of phi3small * fix phi3 * use block sparsity * compile activation * nits in readme / mlx lm version	2024-05-31 12:54:14 -07:00
Awni Hannun	9fc6efbd90	version bump + some fixes (#792)	2024-05-21 20:09:35 -07:00
alexC-nonsense4k	42458914c8	support dora finetune in mlx-examples/llms/mlx_lm (#779) * support dora finetune * solve problems in lora.py and tuner.utils.py * add use_dora (bool) in functions of load adapters * delete all unsupported quantization code and fix all the calculation problems in mlx_lm/tuner/dora.py * Using stop_gradient to prevent gradients from flowing through ‘norm’ during backpropagation * set DEFAULT_USE_DORA in mlx_lm/generate.py * add annotation for all the use_dora * mlx_lm/fuse.py support fuse dora layers and fix a bug of to_linear() in mlx_lm/tuner/dora.py * simplify code of judging type of a fused layer in mlx_lm/fuse.py * add use_dora in mlx_lm/fuse.py when apply_lora_layers() * style + nits * style + nits * more updates --------- Co-authored-by: chenyifei08 <chenyifei08@baidu.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-05-16 08:21:26 -07:00
Awni Hannun	ee60e2a9d5	Kv cache (#643) * in place kv_cache * fix * fix kv cache size * partially fix kv cache dtype * step kv cache * multiple of step size * more tests + kv cache * more kv cache * update all models to use kv cache	2024-05-08 08:18:13 -07:00
Gökdeniz Gülmez	2c1c9e9024	MiniCPM implementation (#685) * Added support for the MiniCPM architecture * Added support for the MiniCPM architecture * Updated utils.py and LORA.md * Updated utils.py and LORA.md * Update implementation details for MiniCPM architecture * Cleaning up * fixed the missing lm.head layer problem * Refactor Model class to dynamically handle tied and untied word embeddings * Quick update * added a dynamic rope scaling base calculation * Added support for the MiniCPM architecture * Added support for the MiniCPM architecture * Updated utils.py and LORA.md * Updated utils.py and LORA.md * Update implementation details for MiniCPM architecture * Cleaning up * fixed the missing lm.head layer problem * Refactor Model class to dynamically handle tied and untied word embeddings * added a dynamic rope scaling base calculation * quick fix and clean up * clean up again * removed the MiniCPMNorm class as it's not used * forgot something, sorry * format * version bump --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-04-25 15:29:28 -07:00
Awni Hannun	2146bcd7ee	Quantize embedding / Update quantize API (#680) * more async eval * quantize embedding / update quantize api * more updates for quantize * update for quantize embeddings * update sd quant API * update sdxl quants * error for datasets < batch_size * async * fix config loading * fix quant * fix tests * fix req * remove lm head if tie weights is true * fix test	2024-04-18 18:16:10 -07:00
Awni Hannun	9c5554d8ee	Use async eval (#670) * Use async eval * bump * bump * remove workaround for bfloat cumsum	2024-04-11 13:18:23 -07:00
Awni Hannun	c68aa3c7c3	Stable lm 2 (#666) * stable lm 2 * test and lora * version bump * merge stable models	2024-04-08 14:18:55 -07:00
Awni Hannun	c386dd5f5a	Fix for cohere plus (#650) * fix for cohere plus * version bump	2024-04-05 14:11:24 -07:00
Awni Hannun	2bd64b78cf	Save lora config (#636) * lora config * comments * version bump	2024-04-02 13:52:53 -07:00
Anchen	fe96ef342f	feat(mlx-lm): export the GGUF (fp16) format model weights from fuse.py (#555) * wip * wip * feat: convert mlx model to gguf f16 * chore: convert norm layer to float32 to avoid overflow issue * chore: add support for mixtral * chore: clean up * chore: remove unused import statement * chore: clean up weight name mapping * version and readme * actual version bump --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-21 10:34:11 -07:00
Awni Hannun	14fe868825	version (#570)	2024-03-13 10:09:36 -07:00
Awni Hannun	8b05bb6d18	[mlx-lm] Use sdpa in llama / mistral model (#515) * use sdpa * update a few more models * version * fix stablelm type	2024-03-07 17:41:23 -08:00
Awni Hannun	95f82e67a2	Fix import warning (#479) * fix import warning * fix version import * remove api, move convert to utils * also update circle to run external PRs	2024-02-27 08:47:56 -08:00

18 Commits