673 Commits

Author SHA1 Message Date
Anchen
7cfda327fd fix(lora): tokenizer return incompatible mx array (#271)
* fix(lora): tokenizer return incompatible encoding mx array

* add readme nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-09 19:46:38 -08:00
Awni Hannun
7b258f33ac Move lora example to use the same model format / conversion as hf_llm (#252)
* huffing face the lora example to allow more models

* fixes

* comments

* more readme nits

* fusion + works better for qlora

* nits

* comments
2024-01-09 11:14:52 -08:00
Awni Hannun
bbd7172eef Some fixes / cleanup for BERT example (#269)
* some fixes/cleaning for bert + test

* nit
2024-01-09 08:44:51 -08:00
Awni Hannun
6759dfddf1 Fix SD image conversion (#266) 2024-01-09 08:41:31 -08:00
Alwin Arrasyid
6e6eff326e fix: use of undefined args in generate function in phi-2 example (#265) 2024-01-09 06:43:59 -08:00
Vaibhav Srivastav
bb35e878cb [Whisper] Add load from Hub. (#255)
* Add load from Hub.

* Up.
2024-01-08 06:20:00 -08:00
Vaibhav Srivastav
d4c3a9cb54 [Whisper] Add HF Hub upload option. (#254)
* Add HF Hub upload option.

* up.

* Add missing requirements.
2024-01-08 06:18:24 -08:00
Anchen
6e5b0de4d3 refactor: make the phi2 example load the model directly from hf without conversion (#253)
* refactor: make the phi2 example load the model directly from hf without conversion

* chore: add super().__init__() for all modules, otherwise it will cause an error in lora
2024-01-08 06:01:23 -08:00
Nino Risteski
9742ad0f51 Update README.md (#248)
fixed a few typos
2024-01-07 20:13:58 -08:00
Awni Hannun
485fb9ac0f quantize linear (#250) 2024-01-07 18:48:59 -08:00
Ikko Eltociear Ashimine
737b4c81a3 Update README.md (#251)
minor fix
2024-01-07 11:35:39 -08:00
bofeng huang
bf9926489e [Whisper] Add word timestamps and confidence scores (#201)
* Add word timestamps and confidence scores

* Create a separate forward_with_cross_qk function

* Move multiple ops from np to mlx, clean comments

* Save alignment_heads

* Cast qk to fp32

* Add test for word-level timestamps and confidence scores

* format + readme

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-07 10:01:29 -08:00
mc0ps
25ebd36112 Fix typo in lora convert.py (#245) 2024-01-07 03:30:30 -08:00
Nino Risteski
b152d12d7b Update README.md (#243)
a few typos
2024-01-06 11:44:49 -08:00
Anchen
758f05c09a refactor: merge deepseek coder example into hf_llm example (#234)
* refactor: merge deepseek coder example into hf_llm example

* remove deepseek example

* chore: fix format in readme

* chore: remove default rope_scaling dict and use get to access type and factor to avoid key error

* Update llms/hf_llm/models.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* chore: fix lint

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-01-06 07:53:46 -08:00
Awni Hannun
cf0ad26a89 force fp16 for quantized models (#240) 2024-01-05 21:29:15 -08:00
Lawrence Wu
37856f70a8 add numpy as a requirement to run lora.py (#238)
* add numpy as a requirement to run lora.py

* removed unused imports
2024-01-05 16:16:28 -08:00
Awni Hannun
37b41cec60 Qlora (#219)
qlora
2024-01-04 21:05:59 -08:00
Christian Bieniak
4fa659acbd Handle receiving 0 tokens gracefully (#231)
* handle 0 tokens gracefully

* Formatting

* Move no token check to statistics section
2024-01-04 19:14:13 -08:00
Andy Peatling
12c9bafbf5 Update README.md to fix --hf-model param call. (#229)
Update `--hf-model` to `--hf-path` since the `--hf-model` param does not exist in convert.py.
2024-01-04 11:53:51 -08:00
Awni Hannun
e14afb3e77 fix to use actual prompt (#227) 2024-01-04 11:12:05 -08:00
Vaibhav Srivastav
f95cf30a31 Fix upload to hub for HF LLMs conversion script. (#221)
* Fix upload to hub snippet.

* Weights -> model.

* reverting last commit.
2024-01-04 06:06:05 -08:00
Awni Hannun
a5d6d0436c Support Hugging Face models (#215)
* support hf direct models
2024-01-03 15:13:26 -08:00
Daniel Strobusch
1d09c4fecd keep dtype on model conversion (#186) 2024-01-02 11:20:29 -08:00
Daniel Strobusch
85258b2be7 make parameter naming consistent with other examples. (#214) 2024-01-02 08:18:12 -08:00
Anchen
e632d7aaaa fix: deepseek coder tokenizer error (#211) 2024-01-01 06:10:37 -08:00
Anchen
ee3c44d231 chore: make the Deepseek example compatible with Yi models. (#205)
* Update convert.py

* Update convert.py

* Update deepseek_coder.py
2023-12-30 06:11:33 -08:00
bofeng huang
581a5733a1 [Whisper] Load customized MLX model & Quantization (#191)
* Add option to load customized mlx model

* Add quantization

* Apply reviews

* Separate model conversion and loading

* Update test

* Fix benchmark

* Add notes about conversion

* Improve doc
2023-12-29 10:22:15 -08:00
Anchen
1cdbf9e886 chore: fix the load quantization model for deepseek coder (#203)
* chore: fix the load quantization model

* change to explicitly check for quantization config
2023-12-29 05:25:38 -08:00
Anchen
31ddbd7806 add deepseek coder example (#172)
* feat: add example for deepseek coder

* chore: remove hardcoded rope_scaling_factor

* feat: add quantization support

* chore: update readme

* chore: clean up the rope scaling factor param in create cos sin theta

* feat: add repetition_penalty

* style /consistency changes to ease future integration

* nits in README

* one more typo

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 21:42:22 -08:00
Angelos Katharopoulos
37fd2464dc Add an image2image example in the stable diffusion (#198) 2023-12-28 18:31:45 -08:00
Benjamin Anderson
09566c7257 add speculative decoding example for llama (#149)
* speculative decoding

* add sample 0

* spec decode gives same results as regular decode

* rebase

* use accept reject criteria

* switch to t5

* update readme

* readme nit

* nits

* nits

* nits

---------

Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
Dimo
07c163d9d9 [Whisper] Large-v3 requires 128 Mel frequency bins (#193)
* Large-v3 requires 128 Mel frequency bins

* extract correct model dimensions and use argparse

* format

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 13:50:35 -08:00
bofeng huang
e1e56a625b Fix benchmark (#200) 2023-12-28 11:29:39 -08:00
Sunbir Gill
78d207fe27 Fix generate example in README (#197) 2023-12-27 13:11:10 -08:00
Jiří Moravčík
50fceb1a28 fix: Add numpy to CIFAR's requirements.txt (#192) 2023-12-26 15:18:59 -08:00
Sushant
a516f4635d Fixed the return type for the __call__ method in Attention (#190) 2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0 shard llama model after conversion and unshard on loading (#174) 2023-12-25 11:19:43 -08:00
Yifan
738448c2d4 QWEN: Fix unsupported ScalarType BFloat16 (#187)
Fix unsupported ScalarType BFloat16.
2023-12-25 06:10:01 -08:00
Vidyasagar Bhargava
647e48870a updated README (#184) 2023-12-24 06:19:53 -08:00
devonthomas35
939086e6a3 Mixtral: Stop at EOS token (#183)
* Stop at EOS token

* Precommit format files

* Fix precommit hooks

* Fix precommit hooks
2023-12-23 21:25:42 -08:00
Kashif Rasul
0371d90ccb fashion-mnist example (#180)
* fashion mnist example

* fix from review
2023-12-23 07:34:45 -08:00
Daniel Strobusch
848f118ac5 use non-zero exit code on error (#177) 2023-12-23 07:10:13 -08:00
Daniel Strobusch
092e87211e fix bad convert parameter (#178) 2023-12-23 07:09:49 -08:00
Alvaro Bartolome
f4709cb807 Align CLI args and some smaller fixes (#167)
* Add `.DS_Store` files to `.gitignore`

* Fix variable naming of `config` in `mixtral/convert.py`

* Align CLI args and minor fixes

* standardize

* one more

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10 Fix conversion + inference errors. - Mistral (#176)
* Fix conversion + inference errors.

* wire rope_theta through to nn.RoPE

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
Todsaporn Banjerdkit
7ae445f6c7 feat: add mistral tps (#173)
* feat: add mistral tps

* eval params before timing + format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 07:55:57 -08:00
Daniel Strobusch
188a91074b fix typo (#169) 2023-12-21 14:17:11 -08:00
Awni Hannun
3cf436b529 Quantize example (#162)
* testing quantization

* conversion + quantization working

* one config processor

* quantization in mistral / nits in llama

* args for quantization

* llama / mistral conversion in good shape

* phi2 quantized

* mixtral

* qwen conversion
2023-12-21 12:59:37 -08:00
Juarez Bochi
4c9db80ed2 Add support for byt5 models (#161)
* Add support for byt5 models

* Remove unused import
2023-12-21 08:46:36 -08:00