mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-16 02:08:55 +08:00

Author	SHA1	Message	Date
Awni Hannun	69181e0058	Support non incremental kv cache growth (#766 )	2024-05-15 12:56:24 -07:00
Awni Hannun	fad9598372	Fix llama cache check (#763 ) * fix llama cache check * add test	2024-05-08 08:35:54 -07:00
Awni Hannun	ee60e2a9d5	Kv cache (#643 ) * in place kv_cache * fix * fix kv cache size * partially fix kv cache dtype * step kv cache * multiple of step size * more teests + kv cache * more kv cache * udpate all models to use kv cache	2024-05-08 08:18:13 -07:00
Anchen	f30413b63c	chore(mlx-lm): fix the number of validation batches configuration. (#752 ) * chore: fix number of validation batches * clean up * address comment	2024-05-04 06:52:42 -07:00
Prince Canuma	abcd891851	Add support for phi-3 (#712 ) * Add phi-3 modelling * fix rope scaling warning * add tests and update tuner utils * update name and remove sanitize * fix lora	2024-04-23 09:20:00 -07:00
Awni Hannun	2146bcd7ee	Quantize embedding / Update quantize API (#680 ) * more async eval * quantize embedding / update quantize api * more updates for quantize * update for quantize embeddings * update sd quant API * update sdxl quants * error for datasets < batch_size * async * fix config loading * fix quant * fix tests * fix req * remove lm head if tie weights is true * fix test	2024-04-18 18:16:10 -07:00
Anchen	f5f189e48a	fix(mlx-lm): broken server.py (#690 ) * fix server.py * fix var referenced before assignment * add test * clean up	2024-04-18 14:26:18 -07:00
dmdaksh	7d7e236061	- Removed unused Python imports (#683 ) - bert/model.py:10: tree_unflatten - bert/model.py:2: dataclass - bert/model.py:8: numpy - cifar/resnet.py:6: Any - clip/model.py:15: tree_flatten - clip/model.py:9: Union - gcn/main.py:8: download_cora - gcn/main.py:9: cross_entropy - llms/gguf_llm/models.py:12: tree_flatten, tree_unflatten - llms/gguf_llm/models.py:9: numpy - llms/mixtral/mixtral.py:12: tree_map - llms/mlx_lm/models/dbrx.py:2: Dict, Union - llms/mlx_lm/tuner/trainer.py:5: partial - llms/speculative_decoding/decoder.py:1: dataclass, field - llms/speculative_decoding/decoder.py:2: Optional - llms/speculative_decoding/decoder.py:5: mlx.nn - llms/speculative_decoding/decoder.py:6: numpy - llms/speculative_decoding/main.py:2: glob - llms/speculative_decoding/main.py:3: json - llms/speculative_decoding/main.py:5: Path - llms/speculative_decoding/main.py:8: mlx.nn - llms/speculative_decoding/model.py:6: tree_unflatten - llms/speculative_decoding/model.py:7: AutoTokenizer - llms/tests/test_lora.py:13: yaml_loader - lora/lora.py:14: tree_unflatten - lora/models.py:11: numpy - lora/models.py:3: glob - speechcommands/kwt.py:1: Any - speechcommands/main.py:7: mlx.data - stable_diffusion/stable_diffusion/model_io.py:4: partial - whisper/benchmark.py:5: sys - whisper/test.py:5: subprocess - whisper/whisper/audio.py:6: Optional - whisper/whisper/decoding.py:8: mlx.nn	2024-04-16 07:50:32 -07:00
Awni Hannun	c68aa3c7c3	Stable lm 2 (#666 ) * stable lm 2 * test and lora * version bump * merge stable models	2024-04-08 14:18:55 -07:00
Prince Canuma	d661440dbb	Add support for qwen2moe (#640 ) * add sparsemoe block and update decoder logic * update file name to match HF * update name * Code formatting * update gates calculation * add support for Qwen2MoE. * fix pytest * code formatting and fix missing comma in utils * Remove decoder sparse step. Co-authored-by: bozheng-hit <dsoul0621@gmail.com> * remove gate layer anti-quantisation * remove unused argument --------- Co-authored-by: bozheng-hit <dsoul0621@gmail.com>	2024-04-02 11:33:29 -07:00
Chime Ogbuji	f6283ef7ce	Configurable LR schedulers (#604 ) * Initial config handler and test * Added means to run from CLI * Update lora config loading and tests * Constrain scheduler config (warmup and minimum LR) for each kind * Update reference to moved schedule_config module * Minor fix * Fix typos * Moved build_schedule and tests * nits in schedule config * flake * fix path --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-29 13:41:10 -07:00
Anchen	297a908e3d	fix(mlx-lm): type hints in gguf.py (#621 )	2024-03-26 07:56:01 -07:00
Anchen	0ab01b4626	fix(mlx-lm): sorted probs in top_p implementation. (#610 ) * fix(mlx-lm): the top p imp * chore: address comment	2024-03-25 15:07:55 -07:00
Awni Hannun	b8a348c1b8	Switch to fast RMS/LN Norm (#603 ) * use nn.RMSNorm, use sdpa, cleanup * bump mlx versions * minor update * use fast layer norm * version bump * update requirement for whisper * update requirement for gguf	2024-03-23 07:13:51 -07:00
Anchen	fbed720d6f	chore(mlx-lm): fix the top_p implementation. (#602 ) * chore(mlx-lm): clean up the top p imp * chore: clean up * chore: add test * chore: address comments * chore: clean up docs string * chore: clean up test	2024-03-21 12:18:23 -07:00
Anchen	949f63f309	chore(mlx-lm): fix print_trainable_parameters for quant models (#581 ) * chore(mlx-lm): fix print_trainable_parameters for quant models * chore: clean up * refactor: use layer type to check quant bits * chore: address comment	2024-03-20 08:41:03 -07:00
madroid	b0bcd86a40	Support for OpenAI’s fine-tuning dataset format (#548 ) * LoRA: move load_dataset to tuner/datasets.py file * LoRA: support OpenAI chat format datasets see https://platform.openai.com/docs/guides/fine-tuning/example-format * LoRA: support OpenAI completion format datasets * LoRA: formatting dataset timing to reduce memory footprint * Refactor dataset item access in PromptCompletionDataset * Update mlx_lm/LORA.md * Update mlx_lm/LORA.md * check Unsupported data format * add tests, fine-tune doc * add tests, fine-tune doc * add jinja2 for chat template * nits in readme * nits in readme --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-19 16:45:46 -07:00
Prince Canuma	76c3244cc5	Add support for Cohere's Command-R (#565 ) * initial commit for command-R * update mlp, layernorm, lm_head and model args * add custom layernorm * add default to tie_word_embeddings * add layernorm weight type and refactor * update layernorm (bias conditional) in model/layers * fix layer norm use traditional rope * add test --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-13 07:03:36 -07:00
Anchen	3535408c99	chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566 ) * chore: fix tie_word_embeddings for qwen2 * chore: default tie_word_embeddings to True	2024-03-12 21:34:32 -07:00
Chime Ogbuji	e56d9015ef	LoRA on all linear transformer block layers (#546 ) * Add --lora-all-linear option to apply LoRa to all linear transfer block layers * Moved to YAML config and added specification of rank & alpha * nits in conifg, more tests * nit * run tests for prs --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-12 07:37:40 -07:00
Awni Hannun	7cdd1b69ac	Enable unit testing in Circle and start some MLX LM tests (#545 ) * add a few tests for mlx lm * add a few tests for mlx lm * add a few tests for mlx lm * more tests / cleanup	2024-03-07 09:31:57 -08:00

21 Commits