mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-08-09 18:36:38 +08:00

Author	SHA1	Message	Date
Awni Hannun	d4666615bb	Lazy import + refactor Lora layer addition (#426 ) * lazy model import in mlx_lm * change lora loading * fix olmo lora * remove a bunch of unused stuff from plamo * move phixtral to mlx-lm and out of llms/	2024-02-12 10:51:02 -08:00
Ivan Fioravanti	4576946151	Add checkpoints directory for adapter weights (#431 ) * Add checkpoints directory for adapter weights The code was modified to create a checkpoints directory if it doesn't exist yet. Adapter weights are now saved to this checkpoints directory during the training iterations. Corrected indentation of Save adapter weights code because it was part of "if eval" * Fixing a blank added by mistake	2024-02-12 10:50:05 -08:00
Nripesh Niketan	f1ef378a58	Feat: update pre-commit rev (#432 )	2024-02-11 07:23:27 -08:00
Awni Hannun	f45a1ab83c	Update a few examples to use compile (#420 ) * update a few examples to use compile * update mnist * add compile to vae and rename some stuff for simplicity * update reqs * use state in eval * GCN example with RNG + dropout * add a bit of prefetching	2024-02-08 13:00:41 -08:00
Anchen	da7adae5ec	fix(mlx-m): lazy load hf_olmo (#424 )	2024-02-08 09:02:43 -08:00
Markus Enzweiler	9b387007ab	Example of a Convolutional Variational Autoencoder (CVAE) on MNIST (#264 ) * initial commit * style fixes * update of ACKNOWLEDGMENTS * fixed comment * minor refactoring; removed unused imports * added cifar and cvae to top-level README.md * removed mention of cuda/mps in argparse * fixed training status output * load_weights() with strict=True * pretrained model update * fixed imports and style * requires mlx>=0.0.9 * updated with results using mlx 0.0.9 * removed mention of private repo * simplify and combine to one file, more consistency with other examples * few more nits * nits * spell * format --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-02-06 20:02:27 -08:00
Long Sha	8071aacd98	fix-mistral-download-link (#418 )	2024-02-06 19:56:56 -08:00
Chris McMaster	2303238e44	Update olmo.py (#419 ) exit should be imported outside of interactive mode	2024-02-06 16:16:46 -08:00
Anchen	8b77677c05	chore(mlx-lm): add model weight index in save_weights (#413 ) * chore(mlx-lm): add model weight index in save_weights * Update llms/mlx_lm/utils.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update llms/mlx_lm/utils.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * chore: save total size as param size instead of file size * chore: clean up format --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-02-06 05:32:15 -08:00
Anchen	a7d139f484	fix(mlx-lm): olmo 1b model (#417 )	2024-02-06 05:27:05 -08:00
Awni Hannun	aa7447efa2	Olmo in MLX LM (#415 ) * run olmo * format	2024-02-05 21:13:49 -08:00
Ivan Fioravanti	7fbca214b1	Add max sequence length argument in lora.py (#408 ) A new argument "--max_seq_length" has been added to the command-line parser and passed as a parameter to the main function of the lora.py script. This allows users to specify and control the maximum sequence length during training.	2024-02-04 12:28:21 -08:00
Junyang Lin	9d0dd34403	add qwen2 (#411 )	2024-02-04 08:31:38 -08:00
Madroid Ma	ba3a9355d1	LoRA: Remove unnecessary model type judgments (#388 ) * LoRA: Remove unnecessary model type judgments 1. Supported models are already checked in the load_model function in utils, no need to repeat the check in lora 2. The checks in lora are not synchronized with those in utils * LoRA: add LoRA supported models in mlx_lm utils	2024-01-31 11:55:27 -08:00
Anchen	0a49ba0697	fix(mlx-lm): apply lora layer doesn't update the lora weights (#396 )	2024-01-31 11:51:26 -08:00
Sugato Ray	ab8bde1590	Add `py.typed` to support PEP-561 (type-hinting) (#389 ) This adds support for type-hinting information as laid in [PEP-561](https://peps.python.org/pep-0561/).	2024-01-30 21:17:38 -08:00
David Koski	f8fadf7a17	Fix token count computation to fix tps measurements (#392 )	2024-01-30 11:24:16 -08:00
Anchen	614de6652f	chore(mlx-lm): add reset lora layers helper (#377 ) * chore(mlx-lm): add reset lora layers helper * chore: rename the func * chore: update docstring * Update llms/mlx_lm/tuner/utils.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-01-29 20:54:49 -08:00
Ashish	20b969b412	Replace time.time() with time.perf_counter() as it is more suited for benchmarking (#380 )	2024-01-26 14:11:38 -08:00
Awni Hannun	5aa652d3c2	remove simplify (#379 )	2024-01-26 13:54:49 -08:00
Ashish	0b57f0eae6	Add StableLM-2 1.6B (#378 ) * init * stablelm * add to readme * bump version --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-26 10:28:00 -08:00
Anchen	854ad8747a	feat(mlx-lm): add de-quant for fuse.py (#365 ) * feat(mlx-lm): add de-quant for fuse * chore: disable quant in to linear when de-quant enabled * chore: add better error handling for adapter file not found	2024-01-25 18:59:32 -08:00
Anchen	f51e98fcf1	chore(mlx-lm): truncate the input sentence to max seq len in lora iterate_batches (#373 ) * chore(mlx-lm): pass max seq len to evaluate in training loop * chore: make sure the batch seq not exceed max len * chore: update comment * chore: add warning before truncate input	2024-01-25 12:38:04 -08:00
Anchen	b1dec281b3	feat(mlx-lm): add lora hyperparameters in lora layer (#366 ) * feat(mlx-lm): add lora hyperparameters in lora layer * chore: address comments	2024-01-24 08:11:25 -08:00
Anchen	5fc8668a53	fix(mlx-lm): handle legacy quant models (#369 )	2024-01-24 07:44:05 -08:00
Anchen	ab91ac1075	chore(mlx-lm): add load model with adapter and fix bug in sample (#360 ) * chore: add load model with adapter support and fix bug in sample * chore: ignore temp during calculating prob in sample	2024-01-23 19:47:39 -08:00
Juarez Bochi	f5b80c95fb	Example reading directly from gguf file (#222 ) * Draft of tiny llama from gguf * Transpose all * No transposition with new layout * Read config from gguf * Create tokenizer from gguf * move gguf and update to be similar to hf_llm * change model to HF style + updates to README * nits in README * nit readme * only use mlx for metadata * fix eos/bos tokenizer * fix tokenization * quantization runs * 8-bit works * tokenizer fix * bump mlx version --------- Co-authored-by: Juarez Bochi <juarez.bochi@grammarly.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-23 15:41:54 -08:00
iLoveBug	40b61c1719	fix the chinese character generation as same as PR #321 (#342 ) * fix the chinese character generation as same as PR #321 * reuse the generate logic to utils.py * format * verbose default * fix conflicts with colorize and character check --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-23 12:44:23 -08:00
Awni Hannun	21aa8038fb	MLX LM version bump (#358 ) * version bump * include new package	2024-01-23 09:05:57 -08:00
Anchen	362e88a744	feat: move lora into mlx-lm (#337 ) * feat: Add lora and qlora training to mlx-lm --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-23 08:44:37 -08:00
Shunta Saito	85c1ff8fd6	Add PLaMo-13B model as an LLM example (#303 ) * Convert HF weights of PLaMo and load it to a plamo model in mlx * Fix model inference part * Add bos at the beginning of the prompt * Fix convert.py to copy tokenizer.model into the converted dir * Use the required instruction format in generate.py when "--instruct" option is specified * Change filenames and update existing scripts * Add README * Add requirements.txt * Fix plamo.py to stop generation when EOS appears * Add quantization to convert.py * Use mlx>=0.0.9 for mx.core.outer() in PLaMo model * Update acknowledgements.md * Fix card text in upload_to_hub() * Not use prompt template when --instruct is not specified * Ask if you trust_remote_code for loading tokenizer of PLaMo * Check the user trusts the remote code when converting * Remove plamo directory * Update README * Add PLaMo model file * Fix the handling of cache in PLaMo and update README * Ask if trust_remote_code only when the model is PLaMo * Remove resolve_trust_remote_code from convert.py and use the latest transformers * Remove code not to add EOS * Update README to fix an example not to use noncommercial version of the model * Remove unused imports * Remove unnecessary description about the instruct model of PLaMo from README * format, nits in README * typo --------- Co-authored-by: Shunta Saito <shunta@mitmul-mbp.local> Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-23 07:17:24 -08:00
Ivan Fioravanti	c45c2311bd	Add colorized output option to generate script (#347 ) * Add colorized output option to generate script Two new functions were added to the script that allow output to be colorized based on the T[0] probability. Changes were made to the `generate_step` function in utils.py to permit colorization. Additionally, an argument for colorization was introduced to the command-line parser. * Rename 'colorize' parameter with 'return_probability' in generate_step	2024-01-23 05:25:44 -08:00
Sugato Ray	a445ac2895	Update docs with `conda` install option (#354 )	2024-01-22 21:14:48 -08:00
Baptiste Canton	42672f5446	add an option to apply the tokenizer chat template (#338 ) * add an option to apply the tokenizer chat template * fix the option to apply the tokenizer chat template * better error messages for chat template issues * apply the chat template by default when possible * nit in comment' * rebase --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-22 19:52:42 -08:00
Anchen	30be4c4734	refactor(qwen): moving qwen into mlx-lm (#312 ) * refactor(qwen): moving qwen into mlx-lm * chore: update doc * chore: fix type hint * add qwen model support in convert * chore: fix doc * chore: only load model in quantize_model * chore: make the convert script only copy tokenizer files instead of load it and save * chore: update docstring * chore: remove unnecessary try catch * chore: clean up for tokenizer and update transformers 4.37 * nits in README --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-22 15:00:07 -08:00
Anchen	527cea4027	chore: fix the convert.py script for weights are not sanitized and support quant for non-32 dimensions (#340 ) * chore: fix convert script for weights not sanitized and support quant for non 32 dim * Update llms/mlx_lm/utils.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * chore: fix typo --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-01-19 21:07:21 -08:00
bojanbabic	61297f547b	Missing requirements needed for convert script (#320 ) * fix requirements and add eos parameter * fix black * address comment * address comments - remove new arg	2024-01-18 19:04:24 -08:00
Awni Hannun	bcc9fc3581	two minor fixes (#335 )	2024-01-18 14:18:13 -08:00
someone	2287294723	fix mlx_lm generator for chinese (#321 ) * fix generator for chinese * add REPLACEMENT_CHAR --------- Co-authored-by: cg <cg@qq.com>	2024-01-16 07:13:33 -08:00
Awni Hannun	b0870ed679	fix response + bump version (#319 )	2024-01-15 11:51:21 -08:00
Anchen	195bec2fa3	feat(mlx_lm): add mixtral support in mlx_lm (#318 ) * feat: add mixtral support in mlx_lm * chore: update doc	2024-01-15 07:18:14 -08:00
Marcel Bischoff	cd3cff0858	Phixtral (#290 ) * initial * file * remove debug * Adding README * typo * simplify readme * nits in readmes --------- Co-authored-by: Marcel Bischoff <marcel.bischoff@awarehq.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-13 08:35:03 -08:00
Anchen	a39b735c3b	chore(mlx-lm): update phi2 model args to sync with hf config format. (#311 ) * chore(mlx-lm): update phi2 model args to sync with hf config format * chore: fix type hint	2024-01-13 07:51:45 -08:00
Pedro Cuenca	ef93979973	Update model card uploaded with converted models (#309 )	2024-01-12 13:03:52 -08:00
Angelos Katharopoulos	1fa40067fe	Change tuple type definitions to use Tuple (#308 )	2024-01-12 11:15:09 -08:00
Awni Hannun	c6440416a2	Mlx llm package (#301 ) * fix converter * add recursive files * remove gitignore * remove gitignore * add packages properly * read me update * remove dup readme * relative * fix convert * fix community name * fix url * version	2024-01-12 10:25:56 -08:00
Anchen	6217d7acd0	Delete llms/hf_llm/models/.gitignore (#300 )	2024-01-11 16:56:50 -08:00
Anchen	a2402116ae	refactor(hf_llm): moving phi2 example into hf_llm (#293 ) * refactor: moving phi2 example into hf_llm * chore: clean up * chore: update phi2 model args so it can load args from config * fix phi2 + nits + readme * allow any HF repo, update README * fix bug in llama --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-11 12:29:12 -08:00
Anchen	7380ebfb0d	fix: undefined hf_path (#292 )	2024-01-11 05:53:52 -08:00
Konstantin Kerekovski	047d4650c4	Add -local flag to llms/hf_llm/convert.py for reading source HF models from filesystem. (#260 ) * * Add --local flag for reading models from filesystem and related code for doing so * Disable uploading to huggingface if --local flag is set * Remove code related to .bin files and merge fetch_from_local and fetch_from_hub into one function. * Update llms/hf_llm/convert.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * format / nits --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-10 19:53:01 -08:00
Alwin Arrasyid	2bbe9d3bd8	fix use of args in generate function (#284 )	2024-01-10 08:09:21 -08:00
Awni Hannun	7b258f33ac	Move lora example to use the same model format / conversion as `hf_llm` (#252 ) * huffing face the lora example to allow more models * fixes * comments * more readme nits * fusion + works better for qlora * nits' * comments	2024-01-09 11:14:52 -08:00
Alwin Arrasyid	6e6eff326e	fix: use of undefined args in generate function in phi-2 example (#265 )	2024-01-09 06:43:59 -08:00
Anchen	6e5b0de4d3	refactor: make the phi2 example can be directly load the model from hf without convert needed (#253 ) * refactor: make the phi2 example can be directly load the model from hf without convert needed * chore: add super().__init__() for all module, otherwise will cause error in lora	2024-01-08 06:01:23 -08:00
Nino Risteski	9742ad0f51	Update README.md (#248 ) fixed a few typos	2024-01-07 20:13:58 -08:00
Nino Risteski	b152d12d7b	Update README.md (#243 ) a few typos	2024-01-06 11:44:49 -08:00
Anchen	758f05c09a	refactor: merge deepseek coder example into hf_llm example (#234 ) * refactor: merge deepseek coder example into hf_llm example * remove deepseek example * chore: fix format in readme * chore: remove default rope_scaling dict and use get to access type and factor to avoid key error * Update llms/hf_llm/models.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * chore: fix lint --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-01-06 07:53:46 -08:00
Awni Hannun	cf0ad26a89	force fp16 for quantized models (#240 )	2024-01-05 21:29:15 -08:00
Awni Hannun	37b41cec60	Qlora (#219 ) qlora	2024-01-04 21:05:59 -08:00
Christian Bieniak	4fa659acbd	Handle receiving 0 tokens gracefully (#231 ) * handle 0 tokens gracefully * Formatting * Move no token check to statistics section	2024-01-04 19:14:13 -08:00
Andy Peatling	12c9bafbf5	Update README.md to fix --hf-model param call. (#229 ) Update `--hf-model` to `--hf-path` since the `--hf-model` param does not exist in convert.py.	2024-01-04 11:53:51 -08:00
Awni Hannun	e14afb3e77	fix to use actual prompt (#227 )	2024-01-04 11:12:05 -08:00
Vaibhav Srivastav	f95cf30a31	Fix upload to hub for HF LLMs conversion script. (#221 ) * Fix upload to hub snippet. * Weights -> model. * reverting last commit.	2024-01-04 06:06:05 -08:00
Awni Hannun	a5d6d0436c	Support Hugging Face models (#215 ) * support hf direct models	2024-01-03 15:13:26 -08:00
Daniel Strobusch	1d09c4fecd	keep dtype on model conversion (#186 )	2024-01-02 11:20:29 -08:00
Daniel Strobusch	85258b2be7	make parameter naming consistent with other examples. (#214 )	2024-01-02 08:18:12 -08:00
Anchen	e632d7aaaa	fix: deepseek coder tokenizer error (#211 )	2024-01-01 06:10:37 -08:00
Anchen	ee3c44d231	chore: make the Deepseek example compatible with Yi models. (#205 ) * Update convert.py * Update convert.py * Update deepseek_coder.py	2023-12-30 06:11:33 -08:00
Anchen	1cdbf9e886	chore: fix the load quantization model for deepseek coder (#203 ) * chore: fix the load quantization model * change to explicitly check for quantization config	2023-12-29 05:25:38 -08:00
Anchen	31ddbd7806	add deepseek coder example (#172 ) * feat: add example for deepseek coder * chore: remove hardcoded rope_scaling_factor * feat: add quantization support * chore: update readme * chore: clean up the rope scaling factor param in create cos sin theta * feat: add repetition_penalty * style /consistency changes to ease future integration * nits in README * one more typo --------- Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-28 21:42:22 -08:00
Benjamin Anderson	09566c7257	add speculative decoding example for llama (#149 ) * speculative decoding * add sample 0 * spec decode gives same results as regular decode * rebase * use accept reject criteria * switch to t5 * update readme * readme nit * nits * nits * nits --------- Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan> Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-28 15:20:43 -08:00
Sunbir Gill	78d207fe27	Fix generate example in README (#197 )	2023-12-27 13:11:10 -08:00
Sushant	a516f4635d	Fixed the return type for the __call__ method in Attention (#190 )	2023-12-26 09:32:43 -08:00
Daniel Strobusch	2bd20ef0e0	shard llama model after conversion and unshard on loading (#174 )	2023-12-25 11:19:43 -08:00
Yifan	738448c2d4	QWEN: Fix unsupported ScalarType BFloat16 (#187 ) Fix unsupported ScalarType BFloat16.	2023-12-25 06:10:01 -08:00
devonthomas35	939086e6a3	Mixtral: Stop at EOS token (#183 ) * Stop at EOS token * Precommit format files * Fix precommit hooks * Fix precommit hooks	2023-12-23 21:25:42 -08:00
Daniel Strobusch	848f118ac5	use non-zero exit code on error (#177 )	2023-12-23 07:10:13 -08:00
Daniel Strobusch	092e87211e	fix bad convert parameter (#178 )	2023-12-23 07:09:49 -08:00
Alvaro Bartolome	f4709cb807	Align CLI args and some smaller fixes (#167 ) * Add `.DS_Store` files to `.gitignore` * Fix variable naming of `config` in `mixtral/convert.py` * Align CLI args and minor fixes * standardize * one more --------- Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-22 14:34:32 -08:00
Vaibhav Srivastav	0eaa323c10	Fix conversion + inference errors. - Mistral (#176 ) * Fix conversion + inference errors. * wire rope_theta through to nn.RoPE --------- Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-22 14:10:25 -08:00
Todsaporn Banjerdkit	7ae445f6c7	feat: add mistral tps (#173 ) * feat: add mistral tps * eval params before timing + format --------- Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-22 07:55:57 -08:00
Awni Hannun	3cf436b529	Quantize example (#162 ) * testing quantization * conversion + quantization working * one config processor * quantization in mistral / nits in llama * args for quantization * llama / mistral conversion in good shape * phi2 quantized * mixtral * qwen conversion	2023-12-21 12:59:37 -08:00
Deven Mistry	6c574dbecf	update path to load weights (#164 )	2023-12-21 06:31:17 -08:00
Daniel Strobusch	43b6522af2	rename --model_path to --model-path (#151 ) use same argument convention for mistral/mixtral as for llama convert.	2023-12-21 06:28:57 -08:00
Deven Mistry	3efb1cc2cc	fix typo in readme (#163 )	2023-12-20 19:47:41 -08:00
Pedro Cuenca	ce30cc3d8f	Use config.json in llama (#159 ) * Use config.json in llama * Fix pop * Fix convert * Typo	2023-12-20 10:34:44 -08:00
Awni Hannun	27c0a8c002	Add llms subdir + update README (#145 ) * add llms subdir + update README * nits * use same pre-commit as mlx * update readmes a bit * format	2023-12-20 10:22:25 -08:00
Junyi Mei	62b455f801	Add Qwen example (#134 ) * Add qwen model draft * Add readme and requirements for qwen example * Add model and tokenizer options * Fix convert and tokenizer * some updates / style consistency * move to llm subdir * readme nit --------- Co-authored-by: Awni Hannun <awni@apple.com>	2023-12-19 13:06:19 -08:00

188 Commits