Commit Graph

91 Commits

Awni Hannun
06ddb8414d
Fix Qwen2 and SD (#441)
* fix qwen2

* version bump

* fix list shape
2024-02-14 13:43:12 -08:00
Chime Ogbuji
e446598f62
Passing parameterized loss and batching to trainer (#391) 2024-02-13 07:03:25 -08:00
Madroid Ma
954aa50c54
LoRA: Improve validation error for LoRA layer count exceeding model layers (#427)
* LoRA: Improve validation error for LoRA layer count exceeding model layers

This commit enhances the error handling when the specified LoRA layer count exceeds the total number of layers in the model. It clarifies the error message to provide actionable feedback for users, guiding them to adjust their input parameters accordingly.

* format + nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-02-13 06:56:27 -08:00
Awni Hannun
d4666615bb
Lazy import + refactor Lora layer addition (#426)
* lazy model import in mlx_lm

* change lora loading

* fix olmo lora

* remove a bunch of unused stuff from plamo

* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00
Ivan Fioravanti
4576946151
Add checkpoints directory for adapter weights (#431)
* Add checkpoints directory for adapter weights

The code was modified to create a checkpoints directory if it doesn't exist yet. Adapter weights are now saved to this checkpoints directory during the training iterations.
Corrected the indentation of the save-adapter-weights code, which was mistakenly nested inside the "if eval" block.

* Fixing a blank added by mistake
2024-02-12 10:50:05 -08:00
Nripesh Niketan
f1ef378a58
Feat: update pre-commit rev (#432) 2024-02-11 07:23:27 -08:00
Awni Hannun
f45a1ab83c
Update a few examples to use compile (#420)
* update a few examples to use compile

* update mnist

* add compile to vae and rename some stuff for simplicity

* update reqs

* use state in eval

* GCN example with RNG + dropout

* add a bit of prefetching
2024-02-08 13:00:41 -08:00
Anchen
da7adae5ec
fix(mlx-m): lazy load hf_olmo (#424) 2024-02-08 09:02:43 -08:00
Markus Enzweiler
9b387007ab
Example of a Convolutional Variational Autoencoder (CVAE) on MNIST (#264)
* initial commit

* style fixes

* update of ACKNOWLEDGMENTS

* fixed comment

* minor refactoring; removed unused imports

* added cifar and cvae to top-level README.md

* removed mention of cuda/mps in argparse

* fixed training status output

* load_weights() with strict=True

* pretrained model update

* fixed imports and style

* requires mlx>=0.0.9

* updated with results using mlx 0.0.9

* removed mention of private repo

* simplify and combine to one file, more consistency with other examples

* few more nits

* nits

* spell

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-02-06 20:02:27 -08:00
Long Sha
8071aacd98
fix-mistral-download-link (#418) 2024-02-06 19:56:56 -08:00
Chris McMaster
2303238e44
Update olmo.py (#419)
exit should be imported outside of interactive mode
2024-02-06 16:16:46 -08:00
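The note above touches a Python detail worth illustrating: the bare `exit` name is injected by the `site` module and is only guaranteed in interactive sessions, so scripts should call `sys.exit` instead. A minimal sketch of the pattern (hypothetical code, not taken from olmo.py):

```python
import sys

def stop(code):
    # sys.exit raises SystemExit and works in any script; the bare
    # builtin `exit` is only guaranteed in interactive sessions,
    # where the site module injects it.
    sys.exit(code)

try:
    stop(2)
except SystemExit as e:
    print(e.code)  # 2
```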
Anchen
8b77677c05
chore(mlx-lm): add model weight index in save_weights (#413)
* chore(mlx-lm): add model weight index in save_weights

* Update llms/mlx_lm/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* Update llms/mlx_lm/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* chore: save total size as param size instead of file size

* chore: clean up format

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-02-06 05:32:15 -08:00
Anchen
a7d139f484
fix(mlx-lm): olmo 1b model (#417) 2024-02-06 05:27:05 -08:00
Awni Hannun
aa7447efa2
Olmo in MLX LM (#415)
* run olmo

* format
2024-02-05 21:13:49 -08:00
Ivan Fioravanti
7fbca214b1
Add max sequence length argument in lora.py (#408)
A new argument "--max_seq_length" has been added to the command-line parser and passed as a parameter to the main function of the lora.py script. This allows users to specify and control the maximum sequence length during training.
2024-02-04 12:28:21 -08:00
Junyang Lin
9d0dd34403
add qwen2 (#411) 2024-02-04 08:31:38 -08:00
Madroid Ma
ba3a9355d1
LoRA: Remove unnecessary model type judgments (#388)
* LoRA: Remove unnecessary model type judgments

1. Supported models are already checked in the load_model function in utils, no need to repeat the check in lora
2. The checks in lora are not synchronized with those in utils

* LoRA: add LoRA supported models in mlx_lm utils
2024-01-31 11:55:27 -08:00
Anchen
0a49ba0697
fix(mlx-lm): apply lora layer doesn't update the lora weights (#396) 2024-01-31 11:51:26 -08:00
Sugato Ray
ab8bde1590
Add py.typed to support PEP-561 (type-hinting) (#389)
This adds support for type-hinting information as laid out in [PEP-561](https://peps.python.org/pep-0561/).
2024-01-30 21:17:38 -08:00
David Koski
f8fadf7a17
Fix token count computation to fix tps measurements (#392) 2024-01-30 11:24:16 -08:00
Anchen
614de6652f
chore(mlx-lm): add reset lora layers helper (#377)
* chore(mlx-lm): add reset lora layers helper

* chore: rename the func

* chore: update docstring

* Update llms/mlx_lm/tuner/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-01-29 20:54:49 -08:00
Ashish
20b969b412
Replace time.time() with time.perf_counter() as it is more suited for benchmarking (#380) 2024-01-26 14:11:38 -08:00
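For context on this change: `time.perf_counter()` reads the highest-resolution monotonic clock available, so short intervals are measured reliably and are unaffected by system clock adjustments, unlike `time.time()`. A minimal sketch of the pattern (hypothetical helper, not code from this repo):

```python
import time

def timed(fn, *args, **kwargs):
    # perf_counter is monotonic and high-resolution, making it the
    # recommended clock for measuring elapsed wall time in benchmarks.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

result, elapsed = timed(sum, range(1000))
```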
Awni Hannun
5aa652d3c2
remove simplify (#379) 2024-01-26 13:54:49 -08:00
Ashish
0b57f0eae6
Add StableLM-2 1.6B (#378)
* init

* stablelm

* add to readme

* bump version

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-26 10:28:00 -08:00
Anchen
854ad8747a
feat(mlx-lm): add de-quant for fuse.py (#365)
* feat(mlx-lm): add de-quant for fuse

* chore: disable quant in to linear when de-quant enabled

* chore: add better error handling for adapter file not found
2024-01-25 18:59:32 -08:00
Anchen
f51e98fcf1
chore(mlx-lm): truncate the input sentence to max seq len in lora iterate_batches (#373)
* chore(mlx-lm): pass max seq len to evaluate in training loop

* chore: make sure the batch seq not exceed max len

* chore: update comment

* chore: add warning before truncate input
2024-01-25 12:38:04 -08:00
Anchen
b1dec281b3
feat(mlx-lm): add lora hyperparameters in lora layer (#366)
* feat(mlx-lm): add lora hyperparameters in lora layer

* chore: address comments
2024-01-24 08:11:25 -08:00
Anchen
5fc8668a53
fix(mlx-lm): handle legacy quant models (#369) 2024-01-24 07:44:05 -08:00
Anchen
ab91ac1075
chore(mlx-lm): add load model with adapter and fix bug in sample (#360)
* chore: add load model with adapter support and fix bug in sample

* chore: ignore temp when calculating prob in sample
2024-01-23 19:47:39 -08:00
Juarez Bochi
f5b80c95fb
Example reading directly from gguf file (#222)
* Draft of tiny llama from gguf

* Transpose all

* No transposition with new layout

* Read config from gguf

* Create tokenizer from gguf

* move gguf and update to be similar to hf_llm

* change model to HF style + updates to README

* nits in README

* nit readme

* only use mlx for metadata

* fix eos/bos tokenizer

* fix tokenization

* quantization runs

* 8-bit works

* tokenizer fix

* bump mlx version

---------

Co-authored-by: Juarez Bochi <juarez.bochi@grammarly.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 15:41:54 -08:00
iLoveBug
40b61c1719
fix the Chinese character generation, same as PR #321 (#342)
* fix the Chinese character generation, same as PR #321

* reuse the generate logic in utils.py

* format

* verbose default

* fix conflicts with colorize and character check

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 12:44:23 -08:00
Awni Hannun
21aa8038fb
MLX LM version bump (#358)
* version bump

* include new package
2024-01-23 09:05:57 -08:00
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00
Shunta Saito
85c1ff8fd6
Add PLaMo-13B model as an LLM example (#303)
* Convert HF weights of PLaMo and load it to a plamo model in mlx

* Fix model inference part

* Add bos at the beginning of the prompt

* Fix convert.py to copy tokenizer.model into the converted dir

* Use the required instruction format in generate.py when "--instruct" option is specified

* Change filenames and update existing scripts

* Add README

* Add requirements.txt

* Fix plamo.py to stop generation when EOS appears

* Add quantization to convert.py

* Use mlx>=0.0.9 for mx.core.outer() in PLaMo model

* Update acknowledgements.md

* Fix card text in upload_to_hub()

* Not use prompt template when --instruct is not specified

* Ask if you trust_remote_code for loading tokenizer of PLaMo

* Check the user trusts the remote code when converting

* Remove plamo directory

* Update README

* Add PLaMo model file

* Fix the handling of cache in PLaMo and update README

* Ask if trust_remote_code only when the model is PLaMo

* Remove resolve_trust_remote_code from convert.py and use the latest transformers

* Remove code not to add EOS

* Update README to fix an example not to use noncommercial version of the model

* Remove unused imports

* Remove unnecessary description about the instruct model of PLaMo from README

* format, nits in README

* typo

---------

Co-authored-by: Shunta Saito <shunta@mitmul-mbp.local>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 07:17:24 -08:00
Ivan Fioravanti
c45c2311bd
Add colorized output option to generate script (#347)
* Add colorized output option to generate script

Two new functions were added to the script that allow output to be colorized based on the T[0] probability. Changes were made to the `generate_step` function in utils.py to permit colorization. Additionally, an argument for colorization was introduced to the command-line parser.

* Rename the 'colorize' parameter to 'return_probability' in generate_step
2024-01-23 05:25:44 -08:00
Sugato Ray
a445ac2895
Update docs with conda install option (#354) 2024-01-22 21:14:48 -08:00
Baptiste Canton
42672f5446
add an option to apply the tokenizer chat template (#338)
* add an option to apply the tokenizer chat template

* fix the option to apply the tokenizer chat template

* better error messages for chat template issues

* apply the chat template by default when possible

* nit in comment

* rebase

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-22 19:52:42 -08:00
Anchen
30be4c4734
refactor(qwen): moving qwen into mlx-lm (#312)
* refactor(qwen): moving qwen into mlx-lm

* chore: update doc

* chore: fix type hint

* add qwen model support in convert

* chore: fix doc

* chore: only load model in quantize_model

* chore: make the convert script only copy tokenizer files instead of loading and saving them

* chore: update docstring

* chore: remove unnecessary try catch

* chore: clean up tokenizer handling and update transformers to 4.37

* nits in README

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-22 15:00:07 -08:00
Anchen
527cea4027
chore: fix the convert.py script for weights that are not sanitized, and support quant for non-32 dimensions (#340)
* chore: fix convert script for unsanitized weights and support quant for non-32 dims

* Update llms/mlx_lm/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* chore: fix typo

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-01-19 21:07:21 -08:00
bojanbabic
61297f547b
Missing requirements needed for convert script (#320)
* fix requirements and add eos parameter

* fix black

* address comment

* address comments - remove new arg
2024-01-18 19:04:24 -08:00
Awni Hannun
bcc9fc3581
two minor fixes (#335) 2024-01-18 14:18:13 -08:00
someone
2287294723
fix mlx_lm generator for chinese (#321)
* fix generator for chinese

* add REPLACEMENT_CHAR

---------

Co-authored-by: cg <cg@qq.com>
2024-01-16 07:13:33 -08:00
Awni Hannun
b0870ed679
fix response + bump version (#319) 2024-01-15 11:51:21 -08:00
Anchen
195bec2fa3
feat(mlx_lm): add mixtral support in mlx_lm (#318)
* feat: add mixtral support in mlx_lm

* chore: update doc
2024-01-15 07:18:14 -08:00
Marcel Bischoff
cd3cff0858
Phixtral (#290)
* initial

* file

* remove debug

* Adding README

* typo

* simplify readme

* nits in readmes

---------

Co-authored-by: Marcel Bischoff <marcel.bischoff@awarehq.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-13 08:35:03 -08:00
Anchen
a39b735c3b
chore(mlx-lm): update phi2 model args to sync with hf config format. (#311)
* chore(mlx-lm): update phi2 model args to sync with hf config format

* chore: fix type hint
2024-01-13 07:51:45 -08:00
Pedro Cuenca
ef93979973
Update model card uploaded with converted models (#309) 2024-01-12 13:03:52 -08:00
Angelos Katharopoulos
1fa40067fe
Change tuple type definitions to use Tuple (#308) 2024-01-12 11:15:09 -08:00
Awni Hannun
c6440416a2
Mlx llm package (#301)
* fix converter

* add recursive files

* remove gitignore

* remove gitignore

* add packages properly

* read me update

* remove dup readme

* relative

* fix convert

* fix community name

* fix url

* version
2024-01-12 10:25:56 -08:00
Anchen
6217d7acd0
Delete llms/hf_llm/models/.gitignore (#300) 2024-01-11 16:56:50 -08:00