Angelos Katharopoulos
37fd2464dc
Add an image2image example to stable diffusion ( #198 )
2023-12-28 18:31:45 -08:00
Benjamin Anderson
09566c7257
add speculative decoding example for llama ( #149 )
...
* speculative decoding
* add sample 0
* spec decode gives same results as regular decode
* rebase
* use accept reject criteria
* switch to t5
* update readme
* readme nit
* nits
* nits
* nits
---------
Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
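The "accept reject criteria" above refers to the standard speculative-decoding acceptance rule; a minimal sketch of that rule follows, with hypothetical names for illustration rather than the actual code added in #149.

```python
# Minimal sketch of the standard speculative-decoding accept/reject rule.
# Names and shapes are assumptions for illustration, not code from #149.
import numpy as np

def accept_draft_token(p_target, p_draft, token, rng):
    """Accept the draft model's token with probability min(1, p_target[token] / p_draft[token])."""
    ratio = p_target[token] / max(p_draft[token], 1e-12)
    return rng.random() < min(1.0, ratio)

def resample_on_reject(p_target, p_draft, rng):
    """On rejection, sample from the residual distribution max(0, p_target - p_draft)."""
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(residual), p=residual))
```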
Dimo
07c163d9d9
[Whisper] Large-v3 requires 128 Mel frequency bins ( #193 )
...
* Large-v3 requires 128 Mel frequency bins
* extract correct model dimensions and use argparse
* format
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 13:50:35 -08:00
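The large-v3 fix above comes down to the model dimensions: earlier Whisper checkpoints use 80 mel bins while large-v3 uses 128, so the filterbank size should be read from the checkpoint rather than hard-coded. A hedged sketch, where the attribute name is an assumption and not necessarily the one used in the whisper example:

```python
# Sketch: choose the mel filterbank size from the loaded model's dimensions
# instead of assuming 80 (large-v3 expects 128). The attribute name is an
# assumption for illustration.
def num_mel_bins(model_dims) -> int:
    return getattr(model_dims, "n_mels", 80)
```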
bofeng huang
e1e56a625b
Fix benchmark ( #200 )
2023-12-28 11:29:39 -08:00
Sunbir Gill
78d207fe27
Fix generate example in README ( #197 )
2023-12-27 13:11:10 -08:00
Jiří Moravčík
50fceb1a28
fix: Add numpy to CIFAR's requirements.txt ( #192 )
2023-12-26 15:18:59 -08:00
Sushant
a516f4635d
Fixed the return type for the __call__ method in Attention ( #190 )
2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0
shard llama model after conversion and unshard on loading ( #174 )
2023-12-25 11:19:43 -08:00
Yifan
738448c2d4
QWEN: Fix unsupported ScalarType BFloat16 ( #187 )
...
Fix unsupported ScalarType BFloat16.
2023-12-25 06:10:01 -08:00
Vidyasagar Bhargava
647e48870a
updated README ( #184 )
2023-12-24 06:19:53 -08:00
devonthomas35
939086e6a3
Mixtral: Stop at EOS token ( #183 )
...
* Stop at EOS token
* Precommit format files
* Fix precommit hooks
* Fix precommit hooks
2023-12-23 21:25:42 -08:00
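Stopping at the EOS token is the usual guard in a generation loop; a minimal sketch under assumed names (the token iterator and EOS id are passed in and purely illustrative, not the Mixtral example's actual loop):

```python
def generate_until_eos(generate_step, prompt_tokens, eos_id, max_tokens=256):
    """Collect sampled token ids, stopping when the model emits the EOS id.

    `generate_step` is an assumed iterator over sampled token ids; this is an
    illustrative sketch only.
    """
    out = []
    for _, token in zip(range(max_tokens), generate_step(prompt_tokens)):
        if token == eos_id:
            break
        out.append(token)
    return out
```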
Kashif Rasul
0371d90ccb
fashion-mnist example ( #180 )
...
* fashion mnist example
* fix from review
2023-12-23 07:34:45 -08:00
Daniel Strobusch
848f118ac5
use non-zero exit code on error ( #177 )
2023-12-23 07:10:13 -08:00
Daniel Strobusch
092e87211e
fix bad convert parameter ( #178 )
2023-12-23 07:09:49 -08:00
Alvaro Bartolome
f4709cb807
Align CLI args and some smaller fixes ( #167 )
...
* Add `.DS_Store` files to `.gitignore`
* Fix variable naming of `config` in `mixtral/convert.py`
* Align CLI args and minor fixes
* standardize
* one more
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors. - Mistral ( #176 )
...
* Fix conversion + inference errors.
* wire rope_theta through to nn.RoPE
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
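Wiring rope_theta through typically means passing the config value as the RoPE base frequency; a minimal sketch assuming MLX's nn.RoPE signature, with the surrounding names treated as assumptions:

```python
# Sketch: pass the config's rope_theta through as the RoPE base frequency.
# mlx.nn.RoPE exposes a `base` argument; the wrapper below is illustrative.
import mlx.nn as nn

def make_rope(head_dim: int, rope_theta: float = 10000.0):
    return nn.RoPE(head_dim, traditional=True, base=rope_theta)
```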
Todsaporn Banjerdkit
7ae445f6c7
feat: add mistral tps ( #173 )
...
* feat: add mistral tps
* eval params before timing + format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 07:55:57 -08:00
Daniel Strobusch
188a91074b
fix typo ( #169 )
2023-12-21 14:17:11 -08:00
Awni Hannun
3cf436b529
Quantize example ( #162 )
...
* testing quantization
* conversion + quantization working
* one config processor
* quantization in mistral / nits in llama
* args for quantization
* llama / mistral conversion in good shape
* phi2 quantized
* mixtral
* qwen conversion
2023-12-21 12:59:37 -08:00
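The quantized conversions above follow the usual MLX pattern of quantizing the linear layers in place before saving the weights; a hedged sketch using the current mlx.nn quantization helper (the convert scripts in #162 may have used a different helper and defaults at the time):

```python
# Hedged sketch of quantizing an MLX model's linear layers before saving.
# Uses mlx.nn.quantize; group size and bit width shown are common defaults,
# not necessarily those in the example's convert scripts.
import mlx.core as mx
import mlx.nn as nn
from mlx.utils import tree_flatten

def quantize_and_save(model: nn.Module, path: str, group_size: int = 64, bits: int = 4):
    nn.quantize(model, group_size=group_size, bits=bits)
    mx.savez(path, **dict(tree_flatten(model.parameters())))
```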
Juarez Bochi
4c9db80ed2
Add support for byt5 models ( #161 )
...
* Add support for byt5 models
* Remove unused import
2023-12-21 08:46:36 -08:00
Deven Mistry
6c574dbecf
update path to load weights ( #164 )
2023-12-21 06:31:17 -08:00
Sarthak Yadav
4addd02988
updated results ( #165 )
2023-12-21 06:30:17 -08:00
wyanzhao
22620de3ee
1. Add user warning for sequences over 2048 tokens in iterate_batches. ( #166 )
2023-12-21 06:29:31 -08:00
Daniel Strobusch
43b6522af2
rename --model_path to --model-path ( #151 )
...
use same argument convention for mistral/mixtral as for llama convert.
2023-12-21 06:28:57 -08:00
Deven Mistry
3efb1cc2cc
fix typo in readme ( #163 )
2023-12-20 19:47:41 -08:00
Pedro Cuenca
ce30cc3d8f
Use config.json in llama ( #159 )
...
* Use config.json in llama
* Fix pop
* Fix convert
* Typo
2023-12-20 10:34:44 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README ( #145 )
...
* add llms subdir + update README
* nits
* use same pre-commit as mlx
* update readmes a bit
* format
2023-12-20 10:22:25 -08:00
Vaibhav Srivastav
aed14618ca
Add config.json to Mixtral. ( #158 )
...
* Add config.json to Mixtral.
* Update mixtral/mixtral.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-12-20 09:47:23 -08:00
Pedro Cuenca
730c50d00a
Use config.json, add model_type ( #157 )
...
* Use config.json, add model_type
* Update convert to generate config.json
2023-12-20 08:39:37 -08:00
Vaibhav Srivastav
4b7e11bd31
Add URLs to HF MLX-Community org. ( #153 )
...
* up
* Add ref to MLX org on the README.
* nit: language.
* Standardise org name.
2023-12-20 06:57:13 -08:00
Pedro Cuenca
d8e14c858e
Add --model_path to phi-2 example script ( #152 )
2023-12-20 06:14:35 -08:00
Sarthak Yadav
b6e62caf2e
Added Keyword Spotting Transformer + SpeechCommands example ( #123 )
...
* Added Keyword Transformer + SpeechCommands
* minor fixes in README
* some updates / simplifications
* nits
* fixed kwt skip connections
* readme + format
* updated acknowledgements
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 14:17:48 -08:00
Juarez Bochi
ebbb7083cc
T5: Change default dtype to bfloat16 ( #147 )
...
* T5: Change default to bfloat16
* Add myself to contributors
* t5: Change convert.py default to float32
2023-12-19 13:44:36 -08:00
Junyi Mei
62b455f801
Add Qwen example ( #134 )
...
* Add qwen model draft
* Add readme and requirements for qwen example
* Add model and tokenizer options
* Fix convert and tokenizer
* some updates / style consistency
* move to llm subdir
* readme nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 13:06:19 -08:00
Juarez Bochi
10a7b99e83
Add T5 and Flan-T5 example ( #113 )
...
* Add skeleton
* Load all encoder weights
* Pass config to all modules, fix ln
* Load position bias embeddings
* Load decoder weights
* Move position biases to attention module
* translate pytorch to mx
* Fix default prompt
* Fix relative_attention_max_distance config
* No scaling, no encoder mask
* LM head
* Decode (broken after 1st token)
* Use position bias in all layers
* Utils to compare encoder output
* Fix layer norm
* Fix decoder mask
* Use position bias in decoder
* Concatenate tokens
* Remove prints
* Stop on eos
* Measure tokens/s
* with cache
* bug fix with bidirectional only for encoder, add offset to position bias
* format
* Fix T5.__call__
* Stream output
* Add argument to generate float16 npz
* Load config from HF to support any model
* Uncomment bidirectional param
* Add gitignore
* Add readme.md for t5
* Fix relative position scale
* Fix --encode-only
* Run hf_t5 with any model
* Add hf generation for comparison
* Fix type for attention mask
* Increase hf max_length
* Rescale output before projecting on vocab
* readme updates
* nits
* Pass ln2 to cross attention
* Fix example
* Fix attention for 3b model
* fp16, abstract tokenizer a bit, format
* clamp for low precision
* higher clipping, remove non-helpful casts
* default to fp32 for now
* Adds support for flan-t5
* Update t5 docs on variant support
* readme flan
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-18 20:25:34 -08:00
Awni Hannun
1e7f4a5921
fix use for llama 2 from meta ( #144 )
2023-12-18 19:33:17 -08:00
Daniel Strobusch
1d62b3ecc1
Pass few shot file name to --few-shot arg ( #141 )
2023-12-18 13:30:04 -08:00
Awni Hannun
517f5808fc
Citation + contributor acknowledgments section ( #136 )
...
* citation + acks section
* nits
2023-12-18 10:12:35 -08:00
Daniel Strobusch
f0e14b6341
fix renamed arg ( #140 )
2023-12-18 10:11:51 -08:00
Awni Hannun
44b546d446
support for tiny llama ( #129 )
2023-12-18 07:47:55 -08:00
Awni Hannun
08e862336a
Rope theta to support Code Llama ( #121 )
...
* rope theta for llama model
* llama chat/code
* nit
2023-12-15 19:51:51 -08:00
Awni Hannun
db134d976d
Merge pull request #115 from ml-explore/lora_custom
...
Customize dataset with lora
2023-12-15 13:54:58 -08:00
Awni Hannun
8df211869e
minimum version
2023-12-15 13:54:31 -08:00
Pawel Kowalski
fc1495abaa
Stable diffusion - check model weights shape and support int for "attention_head_dim" ( #85 )
...
* Allow integer as attention_head_dim
* Reshape downloaded weights to match model if there is a mismatch
2023-12-15 13:01:02 -08:00
Awni Hannun
86cae9ba57
Merge pull request #116 from idoru/fix-phi-2-temp-arg
...
phi-2: fix --temp/--seed arguments.
2023-12-15 12:29:19 -08:00
Awni Hannun
ff0f172363
32 GB example
2023-12-15 12:20:15 -08:00
Awni Hannun
ee2ee0f8e5
32 GB example
2023-12-15 12:18:29 -08:00
Sam Coward
877f88dfea
Pass along temp argument to generate()
2023-12-15 15:16:41 -05:00
Awni Hannun
8c8f9d6440
keep base weights in fp16
2023-12-15 10:42:18 -08:00
Awni Hannun
84f02ef58b
use lower precision base weights
2023-12-15 10:29:42 -08:00