31 Commits

Author SHA1 Message Date
Awni Hannun
cf0ad26a89
force fp16 for quantized models (#240) 2024-01-05 21:29:15 -08:00
Awni Hannun
37b41cec60
Qlora (#219)
qlora
2024-01-04 21:05:59 -08:00
Christian Bieniak
4fa659acbd
Handle receiving 0 tokens gracefully (#231)
* handle 0 tokens gracefully

* Formatting

* Move no token check to statistics section
2024-01-04 19:14:13 -08:00
Andy Peatling
12c9bafbf5
Update README.md to fix --hf-model param call. (#229)
Update `--hf-model` to `--hf-path` since the `--hf-model` param does not exist in convert.py.
2024-01-04 11:53:51 -08:00
Awni Hannun
e14afb3e77
fix to use actual prompt (#227) 2024-01-04 11:12:05 -08:00
Vaibhav Srivastav
f95cf30a31
Fix upload to hub for HF LLMs conversion script. (#221)
* Fix upload to hub snippet.

* Weights -> model.

* reverting last commit.
2024-01-04 06:06:05 -08:00
Awni Hannun
a5d6d0436c
Support Hugging Face models (#215)
* support hf direct models
2024-01-03 15:13:26 -08:00
Daniel Strobusch
1d09c4fecd
keep dtype on model conversion (#186) 2024-01-02 11:20:29 -08:00
Daniel Strobusch
85258b2be7
make parameter naming consistent with other examples. (#214) 2024-01-02 08:18:12 -08:00
Anchen
e632d7aaaa
fix: deepseek coder tokenizer error (#211) 2024-01-01 06:10:37 -08:00
Anchen
ee3c44d231
chore: make the Deepseek example compatible with Yi models. (#205)
* Update convert.py

* Update convert.py

* Update deepseek_coder.py
2023-12-30 06:11:33 -08:00
Anchen
1cdbf9e886
chore: fix the load quantization model for deepseek coder (#203)
* chore: fix the load quantization model

* change to explicitly check for quantization config
2023-12-29 05:25:38 -08:00
Anchen
31ddbd7806
add deepseek coder example (#172)
* feat: add example for deepseek coder

* chore: remove hardcoded rope_scaling_factor

* feat: add quantization support

* chore: update readme

* chore: clean up the rope scaling factor param in create cos sin theta

* feat: add repetition_penalty

* style/consistency changes to ease future integration

* nits in README

* one more typo

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 21:42:22 -08:00
Benjamin Anderson
09566c7257
add speculative decoding example for llama (#149)
* speculative decoding

* add sample 0

* spec decode gives same results as regular decode

* rebase

* use accept reject criteria

* switch to t5

* update readme

* readme nit

* nits

* nits

* nits

---------

Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
Sunbir Gill
78d207fe27
Fix generate example in README (#197) 2023-12-27 13:11:10 -08:00
Sushant
a516f4635d
Fixed the return type for the __call__ method in Attention (#190) 2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0
shard llama model after conversion and unshard on loading (#174) 2023-12-25 11:19:43 -08:00
Yifan
738448c2d4
QWEN: Fix unsupported ScalarType BFloat16 (#187)
Fix unsupported ScalarType BFloat16.
2023-12-25 06:10:01 -08:00
devonthomas35
939086e6a3
Mixtral: Stop at EOS token (#183)
* Stop at EOS token

* Precommit format files

* Fix precommit hooks

* Fix precommit hooks
2023-12-23 21:25:42 -08:00
Daniel Strobusch
848f118ac5
use non-zero exit code on error (#177) 2023-12-23 07:10:13 -08:00
Daniel Strobusch
092e87211e
fix bad convert parameter (#178) 2023-12-23 07:09:49 -08:00
Alvaro Bartolome
f4709cb807
Align CLI args and some smaller fixes (#167)
* Add `.DS_Store` files to `.gitignore`

* Fix variable naming of `config` in `mixtral/convert.py`

* Align CLI args and minor fixes

* standardize

* one more

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors. - Mistral (#176)
* Fix conversion + inference errors.

* wire rope_theta through to nn.RoPE

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
Todsaporn Banjerdkit
7ae445f6c7
feat: add mistral tps (#173)
* feat: add mistral tps

* eval params before timing + format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 07:55:57 -08:00
Awni Hannun
3cf436b529
Quantize example (#162)
* testing quantization

* conversion + quantization working

* one config processor

* quantization in mistral / nits in llama

* args for quantization

* llama / mistral conversion in good shape

* phi2 quantized

* mixtral

* qwen conversion
2023-12-21 12:59:37 -08:00
Deven Mistry
6c574dbecf
update path to load weights (#164) 2023-12-21 06:31:17 -08:00
Daniel Strobusch
43b6522af2
rename --model_path to --model-path (#151)
use same argument convention for mistral/mixtral as for llama convert.
2023-12-21 06:28:57 -08:00
Deven Mistry
3efb1cc2cc
fix typo in readme (#163) 2023-12-20 19:47:41 -08:00
Pedro Cuenca
ce30cc3d8f
Use config.json in llama (#159)
* Use config.json in llama

* Fix pop

* Fix convert

* Typo
2023-12-20 10:34:44 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README (#145)
* add llms subdir + update README

* nits

* use same pre-commit as mlx

* update readmes a bit

* format
2023-12-20 10:22:25 -08:00
Junyi Mei
62b455f801
Add Qwen example (#134)
* Add qwen model draft

* Add readme and requirements for qwen example

* Add model and tokenizer options

* Fix convert and tokenizer

* some updates / style consistency

* move to llm subdir

* readme nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 13:06:19 -08:00