Pedro Cuenca
ef93979973
Update model card uploaded with converted models ( #309 )
2024-01-12 13:03:52 -08:00
Angelos Katharopoulos
1fa40067fe
Change tuple type definitions to use Tuple ( #308 )
2024-01-12 11:15:09 -08:00
Awni Hannun
c6440416a2
Mlx llm package ( #301 )
...
* fix converter
* add recursive files
* remove gitignore
* remove gitignore
* add packages properly
* readme update
* remove dup readme
* relative
* fix convert
* fix community name
* fix url
* version
2024-01-12 10:25:56 -08:00
Anchen
6217d7acd0
Delete llms/hf_llm/models/.gitignore ( #300 )
2024-01-11 16:56:50 -08:00
Anchen
a2402116ae
refactor(hf_llm): moving phi2 example into hf_llm ( #293 )
...
* refactor: moving phi2 example into hf_llm
* chore: clean up
* chore: update phi2 model args so it can load args from config
* fix phi2 + nits + readme
* allow any HF repo, update README
* fix bug in llama
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-11 12:29:12 -08:00
Anchen
7380ebfb0d
fix: undefined hf_path ( #292 )
2024-01-11 05:53:52 -08:00
Konstantin Kerekovski
047d4650c4
Add --local flag to llms/hf_llm/convert.py for reading source HF models from the filesystem. ( #260 )
...
* Add --local flag for reading models from the filesystem and related code for doing so
* Disable uploading to huggingface if --local flag is set
* Remove code related to .bin files and merge fetch_from_local and fetch_from_hub into one function.
* Update llms/hf_llm/convert.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* format / nits
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-10 19:53:01 -08:00
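The `--local` switch described in this entry follows a common argparse pattern; the sketch below is illustrative (the flag names mirror the commit message, but the parsing code is an assumption, not the repo's actual `convert.py`):

```python
import argparse
from pathlib import Path

# Minimal sketch of a convert-style CLI with a --local switch.
parser = argparse.ArgumentParser(description="Convert an HF model")
parser.add_argument("--hf-path", required=True,
                    help="Hugging Face repo id, or a filesystem path with --local")
parser.add_argument("--local", action="store_true",
                    help="read the source model from the filesystem and skip uploading")

args = parser.parse_args(["--hf-path", "./my-model", "--local"])
# With --local set, treat the argument as a local directory instead of a repo id.
source = Path(args.hf_path) if args.local else args.hf_path
```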
Alwin Arrasyid
2bbe9d3bd8
fix use of args in generate function ( #284 )
2024-01-10 08:09:21 -08:00
Awni Hannun
7b258f33ac
Move lora example to use the same model format / conversion as hf_llm ( #252 )
...
* Hugging Face-ify the lora example to allow more models
* fixes
* comments
* more readme nits
* fusion + works better for qlora
* nits
* comments
2024-01-09 11:14:52 -08:00
Alwin Arrasyid
6e6eff326e
fix: use of undefined args in generate function in phi-2 example ( #265 )
2024-01-09 06:43:59 -08:00
Anchen
6e5b0de4d3
refactor: make the phi2 example load the model directly from HF without conversion ( #253 )
...
* refactor: make the phi2 example load the model directly from HF without conversion
* chore: add super().__init__() for all modules; otherwise it causes an error in LoRA
2024-01-08 06:01:23 -08:00
Nino Risteski
9742ad0f51
Update README.md ( #248 )
...
fixed a few typos
2024-01-07 20:13:58 -08:00
Nino Risteski
b152d12d7b
Update README.md ( #243 )
...
a few typos
2024-01-06 11:44:49 -08:00
Anchen
758f05c09a
refactor: merge deepseek coder example into hf_llm example ( #234 )
...
* refactor: merge deepseek coder example into hf_llm example
* remove deepseek example
* chore: fix format in readme
* chore: remove the default rope_scaling dict and use .get to access type and factor, avoiding a KeyError
* Update llms/hf_llm/models.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* chore: fix lint
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-01-06 07:53:46 -08:00
Awni Hannun
cf0ad26a89
force fp16 for quantized models ( #240 )
2024-01-05 21:29:15 -08:00
Awni Hannun
37b41cec60
Qlora ( #219 )
...
qlora
2024-01-04 21:05:59 -08:00
Christian Bieniak
4fa659acbd
Handle receiving 0 tokens gracefully ( #231 )
...
* handle 0 tokens gracefully
* Formatting
* Move no token check to statistics section
2024-01-04 19:14:13 -08:00
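Guarding generation statistics against an empty output, as this fix does, comes down to checking the token count before dividing; the reporting function below is a hypothetical sketch, not the example's actual code:

```python
def report_stats(num_tokens, elapsed_s):
    """Return a stats line, handling the zero-token case gracefully
    instead of dividing by zero in the tokens-per-second computation."""
    if num_tokens == 0:
        return "No tokens generated for this prompt."
    return f"{num_tokens} tokens, {num_tokens / elapsed_s:.2f} tokens/sec"
```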
Andy Peatling
12c9bafbf5
Update README.md to fix --hf-model param call. ( #229 )
...
Update `--hf-model` to `--hf-path` since the `--hf-model` param does not exist in convert.py.
2024-01-04 11:53:51 -08:00
Awni Hannun
e14afb3e77
fix to use actual prompt ( #227 )
2024-01-04 11:12:05 -08:00
Vaibhav Srivastav
f95cf30a31
Fix upload to hub for HF LLMs conversion script. ( #221 )
...
* Fix upload to hub snippet.
* Weights -> model.
* reverting last commit.
2024-01-04 06:06:05 -08:00
Awni Hannun
a5d6d0436c
Support Hugging Face models ( #215 )
...
* support hf direct models
2024-01-03 15:13:26 -08:00
Daniel Strobusch
1d09c4fecd
keep dtype on model conversion ( #186 )
2024-01-02 11:20:29 -08:00
Daniel Strobusch
85258b2be7
make parameter naming consistent with other examples. ( #214 )
2024-01-02 08:18:12 -08:00
Anchen
e632d7aaaa
fix: deepseek coder tokenizer error ( #211 )
2024-01-01 06:10:37 -08:00
Anchen
ee3c44d231
chore: make the Deepseek example compatible with Yi models. ( #205 )
...
* Update convert.py
* Update convert.py
* Update deepseek_coder.py
2023-12-30 06:11:33 -08:00
Anchen
1cdbf9e886
chore: fix the load quantization model for deepseek coder ( #203 )
...
* chore: fix the load quantization model
* change to explicitly check for quantization config
2023-12-29 05:25:38 -08:00
Anchen
31ddbd7806
add deepseek coder example ( #172 )
...
* feat: add example for deepseek coder
* chore: remove hardcoded rope_scaling_factor
* feat: add quantization support
* chore: update readme
* chore: clean up the rope scaling factor param in create cos sin theta
* feat: add repetition_penalty
* style /consistency changes to ease future integration
* nits in README
* one more typo
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 21:42:22 -08:00
Benjamin Anderson
09566c7257
add speculative decoding example for llama ( #149 )
...
* speculative decoding
* add sample 0
* spec decode gives same results as regular decode
* rebase
* use accept reject criteria
* switch to t5
* update readme
* readme nit
* nits
* nits
* nits
---------
Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
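The accept/reject criterion this entry refers to can be sketched in a few lines; the function below is an illustrative stand-in (not the example's actual code), assuming per-token probability vectors from the draft and target models:

```python
import numpy as np

def accept_reject(draft_token, draft_probs, target_probs, rng):
    """Speculative decoding acceptance test: accept the draft model's
    token with probability min(1, p_target / p_draft); on rejection,
    resample from the normalized residual max(0, p_target - p_draft)."""
    p_draft = draft_probs[draft_token]
    p_target = target_probs[draft_token]
    if rng.random() < min(1.0, p_target / p_draft):
        return draft_token  # accepted as-is
    # Rejected: sample a replacement from the residual distribution.
    residual = np.maximum(target_probs - draft_probs, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(residual), p=residual))
```

This accept/reject rule is what makes speculative decoding give the same output distribution as decoding from the target model directly.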
Sunbir Gill
78d207fe27
Fix generate example in README ( #197 )
2023-12-27 13:11:10 -08:00
Sushant
a516f4635d
Fixed the return type for the __call__ method in Attention ( #190 )
2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0
shard llama model after conversion and unshard on loading ( #174 )
2023-12-25 11:19:43 -08:00
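The shard-on-convert / unshard-on-load scheme from this entry can be illustrated with a minimal sketch; `shard_weights` and `unshard_weights` here are hypothetical helpers, not the repo's actual API:

```python
import numpy as np

def shard_weights(weights, max_bytes):
    """Split a dict of arrays into shards no larger than max_bytes
    (a single oversized array still gets its own shard)."""
    shards, current, size = [], {}, 0
    for name, w in weights.items():
        if current and size + w.nbytes > max_bytes:
            shards.append(current)
            current, size = {}, 0
        current[name] = w
        size += w.nbytes
    if current:
        shards.append(current)
    return shards

def unshard_weights(shards):
    """Merge shards back into a single flat dict on load."""
    merged = {}
    for shard in shards:
        merged.update(shard)
    return merged
```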
Yifan
738448c2d4
QWEN: Fix unsupported ScalarType BFloat16 ( #187 )
...
Fix unsupported ScalarType BFloat16.
2023-12-25 06:10:01 -08:00
devonthomas35
939086e6a3
Mixtral: Stop at EOS token ( #183 )
...
* Stop at EOS token
* Precommit format files
* Fix precommit hooks
* Fix precommit hooks
2023-12-23 21:25:42 -08:00
Daniel Strobusch
848f118ac5
use non-zero exit code on error ( #177 )
2023-12-23 07:10:13 -08:00
Daniel Strobusch
092e87211e
fix bad convert parameter ( #178 )
2023-12-23 07:09:49 -08:00
Alvaro Bartolome
f4709cb807
Align CLI args and some smaller fixes ( #167 )
...
* Add `.DS_Store` files to `.gitignore`
* Fix variable naming of `config` in `mixtral/convert.py`
* Align CLI args and minor fixes
* standardize
* one more
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors. - Mistral ( #176 )
...
* Fix conversion + inference errors.
* wire rope_theta through to nn.RoPE
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
Todsaporn Banjerdkit
7ae445f6c7
feat: add mistral tps ( #173 )
...
* feat: add mistral tps
* eval params before timing + format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 07:55:57 -08:00
Awni Hannun
3cf436b529
Quantize example ( #162 )
...
* testing quantization
* conversion + quantization working
* one config processor
* quantization in mistral / nits in llama
* args for quantization
* llama / mistral conversion in good shape
* phi2 quantized
* mixtral
* qwen conversion
2023-12-21 12:59:37 -08:00
Deven Mistry
6c574dbecf
update path to load weights ( #164 )
2023-12-21 06:31:17 -08:00
Daniel Strobusch
43b6522af2
rename --model_path to --model-path ( #151 )
...
use the same argument convention for mistral/mixtral as for llama convert.
2023-12-21 06:28:57 -08:00
Deven Mistry
3efb1cc2cc
fix typo in readme ( #163 )
2023-12-20 19:47:41 -08:00
Pedro Cuenca
ce30cc3d8f
Use config.json in llama ( #159 )
...
* Use config.json in llama
* Fix pop
* Fix convert
* Typo
2023-12-20 10:34:44 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README ( #145 )
...
* add llms subdir + update README
* nits
* use same pre-commit as mlx
* update readmes a bit
* format
2023-12-20 10:22:25 -08:00
Junyi Mei
62b455f801
Add Qwen example ( #134 )
...
* Add qwen model draft
* Add readme and requirements for qwen example
* Add model and tokenizer options
* Fix convert and tokenizer
* some updates / style consistency
* move to llm subdir
* readme nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 13:06:19 -08:00