Commit Graph

398 Commits

Author SHA1 Message Date
Anchen
e632d7aaaa
fix: deepseek coder tokenizer error (#211) 2024-01-01 06:10:37 -08:00
Anchen
ee3c44d231
chore: make the Deepseek example compatible with Yi models. (#205)
* Update convert.py

* Update convert.py

* Update deepseek_coder.py
2023-12-30 06:11:33 -08:00
bofeng huang
581a5733a1
[Whisper] Load customized MLX model & Quantization (#191)
* Add option to load customized mlx model

* Add quantization

* Apply reviews

* Separate model conversion and loading

* Update test

* Fix benchmark

* Add notes about conversion

* Improve doc
2023-12-29 10:22:15 -08:00
Anchen
1cdbf9e886
chore: fix the load quantization model for deepseek coder (#203)
* chore: fix the load quantization model

* change to explicitly check for quantization config
2023-12-29 05:25:38 -08:00
Anchen
31ddbd7806
add deepseek coder example (#172)
* feat: add example for deepseek coder

* chore: remove hardcoded rope_scaling_factor

* feat: add quantization support

* chore: update readme

* chore: clean up the rope scaling factor param in create cos sin theta

* feat: add repetition_penalty

* style / consistency changes to ease future integration

* nits in README

* one more typo

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 21:42:22 -08:00
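
The repetition_penalty added in #172 follows the usual rule of dampening logits for tokens that have already been generated: positive logits are divided by the penalty, negative ones multiplied. A minimal NumPy sketch of that rule; the function name and the exact handling in the example may differ:

    import numpy as np

    def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
        # Dampen tokens that were already generated: divide positive logits by
        # the penalty, multiply negative ones, so repeats become less likely.
        out = logits.copy()
        for tok in set(generated_ids):
            if out[tok] > 0:
                out[tok] /= penalty
            else:
                out[tok] *= penalty
        return out

    logits = np.array([0.1, -0.3, 2.0, 0.0, 1.5, 3.0])
    print(apply_repetition_penalty(logits, generated_ids=[5, 1]))
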
Angelos Katharopoulos
37fd2464dc
Add an image2image example in the stable diffusion (#198) 2023-12-28 18:31:45 -08:00
Benjamin Anderson
09566c7257
add speculative decoding example for llama (#149)
* speculative decoding

* add sample 0

* spec decode gives same results as regular decode

* rebase

* use accept reject criteria

* switch to t5

* update readme

* readme nit

* nits

* nits

* nits

---------

Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
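
The accept/reject criterion referenced in #149 is the standard speculative-sampling test: a draft token is kept with probability min(1, p_target / p_draft), and on rejection a replacement is drawn from the normalized residual distribution. A schematic NumPy sketch, not the example's actual code:

    import numpy as np

    def accept_draft_token(p_target, p_draft, token, rng):
        # Accept with probability min(1, p_target[token] / p_draft[token]).
        ratio = p_target[token] / max(p_draft[token], 1e-9)
        return rng.uniform() < min(1.0, ratio)

    def resample_on_reject(p_target, p_draft, rng):
        # On rejection, sample from the normalized residual max(p_target - p_draft, 0).
        residual = np.maximum(p_target - p_draft, 0.0)
        residual /= residual.sum()
        return int(rng.choice(len(residual), p=residual))

    rng = np.random.default_rng(0)
    p_t = np.array([0.1, 0.6, 0.3])  # target model's next-token distribution
    p_d = np.array([0.3, 0.3, 0.4])  # draft model's next-token distribution
    draft_token = 1                  # token proposed by the draft model

    if accept_draft_token(p_t, p_d, draft_token, rng):
        next_token = draft_token
    else:
        next_token = resample_on_reject(p_t, p_d, rng)
    print(next_token)
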
Dimo
07c163d9d9
[Whisper] Large-v3 requires 128 Mel frequency bins (#193)
* Large-v3 requires 128 Mel frequency bins

* extract correct model dimensions and use argparse

* format

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 13:50:35 -08:00
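
The fix in #193 reflects that Whisper large-v3 moved from 80 to 128 Mel frequency bins, so the bin count has to be read from the converted model's dimensions rather than hard-coded. A small sketch of that idea; the ModelDimensions class and field here are illustrative stand-ins:

    from dataclasses import dataclass

    @dataclass
    class ModelDimensions:
        n_mels: int = 80  # 80 for large-v2 and earlier, 128 for large-v3

    def mel_bin_count(dims: ModelDimensions) -> int:
        # Read the filterbank size from the converted checkpoint's dimensions
        # instead of assuming the historical 80 bins.
        return dims.n_mels

    print(mel_bin_count(ModelDimensions(n_mels=128)))  # 128 for large-v3
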
bofeng huang
e1e56a625b
Fix benchmark (#200) 2023-12-28 11:29:39 -08:00
Sunbir Gill
78d207fe27
Fix generate example in README (#197) 2023-12-27 13:11:10 -08:00
Jiří Moravčík
50fceb1a28
fix: Add numpy to CIFAR's requirements.txt (#192) 2023-12-26 15:18:59 -08:00
Sushant
a516f4635d
Fixed the return type for the __call__ method in Attention (#190) 2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0
shard llama model after conversion and unshard on loading (#174) 2023-12-25 11:19:43 -08:00
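
Sharding in #174 splits the converted weights across several .npz files so no single file becomes unwieldy, then merges them back when the model is loaded. A rough NumPy sketch of that save/load pattern; file naming and shard size are made up for illustration:

    import glob
    import numpy as np

    def save_sharded(weights, prefix, max_per_shard=2):
        # Write the weight dict as prefix.00.npz, prefix.01.npz, ...
        items = list(weights.items())
        for i in range(0, len(items), max_per_shard):
            shard = dict(items[i : i + max_per_shard])
            np.savez(f"{prefix}.{i // max_per_shard:02d}.npz", **shard)

    def load_unsharded(prefix):
        # Merge every shard back into one dict at load time.
        merged = {}
        for path in sorted(glob.glob(f"{prefix}.*.npz")):
            merged.update(dict(np.load(path)))
        return merged

    save_sharded({"a": np.ones(3), "b": np.zeros(2), "c": np.arange(4)}, "weights")
    print(sorted(load_unsharded("weights").keys()))  # ['a', 'b', 'c']
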
Yifan
738448c2d4
QWEN: Fix unsupported ScalarType BFloat16 (#187)
Fix unsupported ScalarType BFloat16.
2023-12-25 06:10:01 -08:00
Vidyasagar Bhargava
647e48870a
updated README (#184) 2023-12-24 06:19:53 -08:00
devonthomas35
939086e6a3
Mixtral: Stop at EOS token (#183)
* Stop at EOS token

* Precommit format files

* Fix precommit hooks

* Fix precommit hooks
2023-12-23 21:25:42 -08:00
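
Stopping at the EOS token in #183 simply means breaking out of the sampling loop as soon as the end-of-sequence id appears instead of always generating max_tokens. A generic sketch; the sampler and token ids below are stand-ins, not the Mixtral example's objects:

    EOS_TOKEN_ID = 2   # illustrative; the real id comes from the tokenizer
    MAX_TOKENS = 8

    def fake_sampler():
        # Stand-in for the model: yields a few tokens, then EOS, then more.
        yield from [11, 42, 7, EOS_TOKEN_ID, 99, 13]

    generated = []
    for i, token in enumerate(fake_sampler()):
        if token == EOS_TOKEN_ID or i >= MAX_TOKENS:
            break  # stop at EOS instead of always running to max_tokens
        generated.append(token)

    print(generated)  # [11, 42, 7]
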
Kashif Rasul
0371d90ccb
fashion-mnist example (#180)
* fashion mnist example

* fix from review
2023-12-23 07:34:45 -08:00
Daniel Strobusch
848f118ac5
use non-zero exit code on error (#177) 2023-12-23 07:10:13 -08:00
Daniel Strobusch
092e87211e
fix bad convert parameter (#178) 2023-12-23 07:09:49 -08:00
Alvaro Bartolome
f4709cb807
Align CLI args and some smaller fixes (#167)
* Add `.DS_Store` files to `.gitignore`

* Fix variable naming of `config` in `mixtral/convert.py`

* Align CLI args and minor fixes

* standardize

* one more

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors. - Mistral (#176)
* Fix conversion + inference errors.

* wire rope_theta through to nn.RoPE

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
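
Wiring rope_theta through in #176 means reading the value from the model config and passing it as the rotary embedding's base frequency instead of relying on the 10000 default. A sketch using MLX's nn.RoPE; the config dictionary and the traditional=True choice are illustrative assumptions:

    import mlx.nn as nn

    # Illustrative config values; a real Mistral config.json carries rope_theta.
    config = {"hidden_size": 4096, "num_attention_heads": 32, "rope_theta": 1000000.0}
    head_dim = config["hidden_size"] // config["num_attention_heads"]

    # Pass rope_theta through as the RoPE base frequency instead of the 10000 default.
    rope = nn.RoPE(head_dim, traditional=True, base=config.get("rope_theta", 10000.0))
    print(rope)
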
Todsaporn Banjerdkit
7ae445f6c7
feat: add mistral tps (#173)
* feat: add mistral tps

* eval params before timing + format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 07:55:57 -08:00
Daniel Strobusch
188a91074b
fix typo (#169) 2023-12-21 14:17:11 -08:00
Awni Hannun
3cf436b529
Quantize example (#162)
* testing quantization

* conversion + quantization working

* one config processor

* quantization in mistral / nits in llama

* args for quantization

* llama / mistral conversion in good shape

* phi2 quantized

* mixtral

* qwen conversion
2023-12-21 12:59:37 -08:00
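
The quantization support in #162 swaps the float linear layers for grouped low-bit layers and records the settings so loading can reproduce them. A minimal sketch written against the present-day MLX call nn.quantize; the original commits may have used a different API, and the group size and bit width below are just example values:

    import mlx.nn as nn

    class TinyMLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(64, 64)
            self.fc2 = nn.Linear(64, 64)

        def __call__(self, x):
            return self.fc2(nn.relu(self.fc1(x)))

    model = TinyMLP()

    # Replace the Linear layers with grouped 4-bit quantized equivalents.
    nn.quantize(model, group_size=32, bits=4)

    # Record the settings so loading knows to quantize before applying weights.
    quant_config = {"group_size": 32, "bits": 4}
    print(quant_config, type(model.fc1).__name__)  # ... QuantizedLinear
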
Juarez Bochi
4c9db80ed2
Add support for byt5 models (#161)
* Add support for byt5 models

* Remove unused import
2023-12-21 08:46:36 -08:00
Deven Mistry
6c574dbecf
update path to load weights (#164) 2023-12-21 06:31:17 -08:00
Sarthak Yadav
4addd02988
updated results (#165) 2023-12-21 06:30:17 -08:00
wyanzhao
22620de3ee
Add user warning for sequences over 2048 tokens in iterate_batches. (#166) 2023-12-21 06:29:31 -08:00
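
The warning in #166 guards the fine-tuning example's iterate_batches against training sequences longer than the 2048-token context it assumes. A hedged, simplified sketch of such a check; this is not the example's actual implementation:

    import warnings

    MAX_SEQ_LEN = 2048  # context length the example assumes

    def iterate_batches(token_sequences, batch_size=4):
        # Simplified stand-in: warn when a batch holds an over-long sequence.
        for i in range(0, len(token_sequences), batch_size):
            batch = token_sequences[i : i + batch_size]
            if any(len(seq) > MAX_SEQ_LEN for seq in batch):
                warnings.warn(
                    f"Some sequences are longer than {MAX_SEQ_LEN} tokens and will be truncated."
                )
            yield [seq[:MAX_SEQ_LEN] for seq in batch]

    for batch in iterate_batches([[1] * 10, [2] * 3000]):
        print([len(s) for s in batch])
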
Daniel Strobusch
43b6522af2
rename --model_path to --model-path (#151)
use the same argument convention for mistral/mixtral as for the llama convert script.
2023-12-21 06:28:57 -08:00
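
The rename in #151 only changes the CLI spelling: argparse maps dashes in option names to underscores on the parsed namespace, so --model-path is still read as args.model_path inside the scripts. A small illustration; the default value is made up:

    import argparse

    parser = argparse.ArgumentParser(description="convert")
    # Dashes in the flag name; argparse exposes it as args.model_path.
    parser.add_argument("--model-path", default="weights/", help="Path to the model weights")

    args = parser.parse_args(["--model-path", "mlx_model/"])
    print(args.model_path)  # mlx_model/
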
Deven Mistry
3efb1cc2cc
fix typo in readme (#163) 2023-12-20 19:47:41 -08:00
Pedro Cuenca
ce30cc3d8f
Use config.json in llama (#159)
* Use config.json in llama

* Fix pop

* Fix convert

* Typo
2023-12-20 10:34:44 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README (#145)
* add llms subdir + update README

* nits

* use same pre-commit as mlx

* update readmes a bit

* format
2023-12-20 10:22:25 -08:00
Vaibhav Srivastav
aed14618ca
Add config.json to Mixtral. (#158)
* Add config.json to Mixtral.

* Update mixtral/mixtral.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-12-20 09:47:23 -08:00
Pedro Cuenca
730c50d00a
Use config.json, add model_type (#157)
* Use config.json, add model_type

* Update convert to generate config.json
2023-12-20 08:39:37 -08:00
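
Using config.json in #157 and #159 means the convert script writes the hyperparameters, including a model_type field, next to the weights, and the load path reads them back instead of hard-coding values. A generic sketch with made-up field names and values:

    import json
    from pathlib import Path

    def save_config(params, path):
        # Convert writes the hyperparameters (plus model_type) next to the weights.
        params = {"model_type": "llama", **params}
        with open(path / "config.json", "w") as f:
            json.dump(params, f, indent=2)

    def load_config(path):
        with open(path / "config.json") as f:
            return json.load(f)

    out_dir = Path("mlx_model")
    out_dir.mkdir(exist_ok=True)
    save_config({"dim": 4096, "n_layers": 32, "n_heads": 32}, out_dir)
    print(load_config(out_dir)["model_type"])  # llama
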
Vaibhav Srivastav
4b7e11bd31
Add URLs to HF MLX-Community org. (#153)
* up

* Add ref to MLX org on the README.

* nit: language.

* Standardise org name.
2023-12-20 06:57:13 -08:00
Pedro Cuenca
d8e14c858e
Add --model_path to phi-2 example script (#152) 2023-12-20 06:14:35 -08:00
Sarthak Yadav
b6e62caf2e
Added Keyword Spotting Transformer + SpeechCommands example (#123)
* Added Keyword Transformer + SpeechCommands

* minor fixes in README

* some updates / simplifications

* nits

* fixed kwt skip connections

* readme + format

* updated acknowledgements

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 14:17:48 -08:00
Juarez Bochi
ebbb7083cc
T5: Change default dtype to bfloat16 (#147)
* T5: Change default to bfloat16

* Add myself to contributors

* t5: Change convert.py default to float32
2023-12-19 13:44:36 -08:00
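
The dtype change in #147 keeps the converted checkpoint in float32 but casts to bfloat16 when generating, so precision is only reduced at load time. A rough sketch of that cast with mlx.core; the weight names are illustrative:

    import mlx.core as mx

    # Pretend these are weights loaded from the converted float32 .npz file.
    weights = {"encoder.wq": mx.zeros((8, 8)), "lm_head": mx.zeros((16, 8))}

    # Cast at load time so the checkpoint on disk stays float32.
    weights = {k: v.astype(mx.bfloat16) for k, v in weights.items()}
    print({k: v.dtype for k, v in weights.items()})
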
Junyi Mei
62b455f801
Add Qwen example (#134)
* Add qwen model draft

* Add readme and requirements for qwen example

* Add model and tokenizer options

* Fix convert and tokenizer

* some updates / style consistency

* move to llm subdir

* readme nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 13:06:19 -08:00
Juarez Bochi
10a7b99e83
Add T5 and Flan-T5 example (#113)
* Add skeleton

* Load all encoder weights

* Pass config to all modules, fix ln

* Load position bias embeddings

* Load decoder weights

* Move position biases to attention module

* translate pytorch to mx

* Fix default prompt

* Fix relative_attention_max_distance config

* No scaling, no encoder mask

* LM head

* Decode (broken after 1st token)

* Use position bias in all layers

* Utils to compare encoder output

* Fix layer norm

* Fix decoder mask

* Use position bias in decoder

* Concatenate tokens

* Remove prints

* Stop on eos

* Measure tokens/s

* with cache

* bug fix with bidirectional only for encoder, add offset to position bias

* format

* Fix T5.__call__

* Stream output

* Add argument to generate float16 npz

* Load config from HF to support any model

* Uncomment bidirectional param

* Add gitignore

* Add readme.md for t5

* Fix relative position scale

* Fix --encode-only

* Run hf_t5 with any model

* Add hf generation for comparison

* Fix type for attention mask

* Increase hf max_length

* Rescale output before projecting on vocab

* readme updates

* nits

* Pass ln2 to cross attention

* Fix example

* Fix attention for 3b model

* fp16, abstract tokenizer a bit, format

* clamp for low precision

* higher clipping, remove non-helpful casts

* default to fp32 for now

* Adds support for flan-t5

* Update t5 docs on variant support

* readme flan

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-18 20:25:34 -08:00
Awni Hannun
1e7f4a5921
fix use for llama 2 from meta (#144) 2023-12-18 19:33:17 -08:00
Daniel Strobusch
1d62b3ecc1
Pass few-shot file name to --few-shot arg (#141) 2023-12-18 13:30:04 -08:00
Awni Hannun
517f5808fc
Citation + contributor acknowledgments section (#136)
* citation + acks section

* nits
2023-12-18 10:12:35 -08:00
Daniel Strobusch
f0e14b6341
fix renamed arg (#140) 2023-12-18 10:11:51 -08:00
Awni Hannun
44b546d446
support for tiny llama (#129) 2023-12-18 07:47:55 -08:00
Awni Hannun
08e862336a
Rope theta to support Code Llama (#121)
* rope theta for llama model

* llama chat/code

* nit
2023-12-15 19:51:51 -08:00
Awni Hannun
db134d976d
Merge pull request #115 from ml-explore/lora_custom
Customize dataset with lora
2023-12-15 13:54:58 -08:00
Awni Hannun
8df211869e
minimum version 2023-12-15 13:54:31 -08:00
Pawel Kowalski
fc1495abaa
Stable diffusion - check model weights shape and support int for "attention_head_dim" (#85)
* Allow integer as attention_head_dim
* Reshape downloaded weights to match model if there is a mismatch
2023-12-15 13:01:02 -08:00
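
The stable diffusion change in #85 lets attention_head_dim arrive either as a single int or as a per-block sequence (and separately reshapes mismatched downloaded weights). A sketch of just the first part; the helper name is hypothetical:

    from typing import Sequence, Union

    def head_dims_per_block(attention_head_dim: Union[int, Sequence[int]], num_blocks: int):
        # Accept either one int applied to every block or an explicit per-block list.
        if isinstance(attention_head_dim, int):
            return [attention_head_dim] * num_blocks
        return list(attention_head_dim)

    print(head_dims_per_block(8, 4))                # [8, 8, 8, 8]
    print(head_dims_per_block([5, 10, 20, 20], 4))  # [5, 10, 20, 20]
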
Awni Hannun
86cae9ba57
Merge pull request #116 from idoru/fix-phi-2-temp-arg
phi-2: fix --temp/--seed arguments.
2023-12-15 12:29:19 -08:00