Param Thakkar
4c9f9f9be7
Made llama and mistral files mypy compatible (#1359)
* Made mypy compatible
* reformatted
* Added more fixes
* Added fixes to speculative-decoding
* Fixes
* fix CircleCI
* revert some stuff
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-04-23 14:23:46 -07:00
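For context on what work like #1359 involves, here is a minimal illustrative sketch, not the actual diff, of an MLX module carrying the explicit annotations mypy checks; the MLP module and its names are hypothetical:

```python
import mlx.core as mx
import mlx.nn as nn


class MLP(nn.Module):
    """A toy module with the explicit annotations mypy checks."""

    def __init__(self, dims: int, hidden_dims: int) -> None:
        super().__init__()
        self.up = nn.Linear(dims, hidden_dims, bias=False)
        self.down = nn.Linear(hidden_dims, dims, bias=False)

    def __call__(self, x: mx.array) -> mx.array:
        # Annotated parameters and return type let mypy verify every
        # call site; mx.array is the concrete type MLX layers pass around.
        return self.down(nn.silu(self.up(x)))
```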
Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API (#680)
* more async eval
* quantize embedding / update quantize api
* more updates for quantize
* update for quantize embeddings
* update sd quant API
* update sdxl quants
* error for datasets < batch_size
* async
* fix config loading
* fix quant
* fix tests
* fix req
* remove lm head if tie weights is true
* fix test
2024-04-18 18:16:10 -07:00
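A minimal sketch of the flow after this change, assuming the mlx.nn.quantize API; the two-layer toy model is illustrative:

```python
import mlx.nn as nn

# Toy stand-in for an LLM: token embedding plus an output projection.
model = nn.Sequential(
    nn.Embedding(1000, 64),
    nn.Linear(64, 1000, bias=False),
)

# One call quantizes compatible layers in place: the embedding becomes
# an nn.QuantizedEmbedding and the linear an nn.QuantizedLinear.
nn.quantize(model, group_size=64, bits=4)
```

nn.quantize also takes a class_predicate callable to exempt specific layers from quantization.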
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm (#603)
* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
2024-03-23 07:13:51 -07:00
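Before/after of the norm switch in a self-contained check; the hand-rolled version stands in for what model files computed previously:

```python
import mlx.core as mx

x = mx.random.normal((4, 512))
weight = mx.ones((512,))
eps = 1e-5

# What model files computed by hand before the switch:
y_ref = x * mx.rsqrt(mx.mean(x * x, axis=-1, keepdims=True) + eps) * weight

# The fused kernel the commit moves to:
y_fast = mx.fast.rms_norm(x, weight, eps)

assert mx.allclose(y_ref, y_fast, atol=1e-4)
```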
Awni Hannun
5aa652d3c2
remove simplify (#379)
2024-01-26 13:54:49 -08:00
bojanbabic
61297f547b
Add missing requirements for the convert script (#320)
* fix requirements and add eos parameter
* fix black
* address comment
* address comments - remove new arg
2024-01-18 19:04:24 -08:00
Awni Hannun
37b41cec60
QLoRA (#219)
qlora
2024-01-04 21:05:59 -08:00
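QLoRA trains low-rank adapters on top of a frozen, quantized base model. A hypothetical sketch of the idea, not the code from #219:

```python
import math

import mlx.core as mx
import mlx.nn as nn


class LoRALinear(nn.Module):
    """Low-rank adapter over a frozen base projection (names hypothetical)."""

    def __init__(self, base: nn.Module, in_dims: int, out_dims: int, rank: int = 8):
        super().__init__()
        self.base = base  # typically an nn.QuantizedLinear, kept frozen
        scale = 1.0 / math.sqrt(in_dims)
        self.lora_a = mx.random.uniform(-scale, scale, (in_dims, rank))
        self.lora_b = mx.zeros((rank, out_dims))

    def __call__(self, x: mx.array) -> mx.array:
        # Frozen quantized pass plus the trainable low-rank correction.
        return self.base(x) + (x @ self.lora_a) @ self.lora_b


layer = LoRALinear(nn.QuantizedLinear(512, 512, bias=False), 512, 512)
```

During training the base weights stay frozen and only lora_a / lora_b receive gradients.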
Awni Hannun
a5d6d0436c
Support Hugging Face models (#215)
* support hf direct models
2024-01-03 15:13:26 -08:00
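What direct Hugging Face support boils down to, sketched with huggingface_hub.snapshot_download; the repo id is only an example:

```python
import glob
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# Fetch (or reuse from the local cache) a model repo.
path = Path(snapshot_download(repo_id="mistralai/Mistral-7B-v0.1"))

config = json.loads((path / "config.json").read_text())
weight_files = sorted(glob.glob(str(path / "*.safetensors")))
```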
Daniel Strobusch
1d09c4fecd
keep dtype on model conversion (#186)
2024-01-02 11:20:29 -08:00
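The gist of the dtype fix, as a self-contained check rather than the commit's code: conversion should carry the checkpoint's dtype through instead of upcasting.

```python
import mlx.core as mx
import numpy as np

checkpoint = np.zeros((4, 4), dtype=np.float16)

# mx.array preserves the numpy dtype, so a float16 checkpoint stays
# float16 through conversion instead of being upcast to float32.
converted = mx.array(checkpoint)
assert converted.dtype == mx.float16
```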
Anchen
31ddbd7806
add deepseek coder example (#172)
* feat: add example for deepseek coder
* chore: remove hardcoded rope_scaling_factor
* feat: add quantization support
* chore: update readme
* chore: clean up the rope scaling factor param in create cos sin theta
* feat: add repetition_penalty
* style/consistency changes to ease future integration
* nits in README
* one more typo
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 21:42:22 -08:00
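The repetition_penalty bullet refers to the common CTRL-style scheme; a sketch of how such a penalty is typically applied to logits (function name and signature are illustrative):

```python
import mlx.core as mx


def apply_repetition_penalty(
    logits: mx.array, generated: list, penalty: float
) -> mx.array:
    # Scale down the logits of every token already generated: positive
    # logits are divided by the penalty, negative ones multiplied.
    if generated:
        idx = mx.array(generated)
        selected = logits[:, idx]
        logits[:, idx] = mx.where(
            selected > 0, selected / penalty, selected * penalty
        )
    return logits
```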
Sushant
a516f4635d
Fixed the return type for the __call__ method in Attention (#190)
2023-12-26 09:32:43 -08:00
Daniel Strobusch
2bd20ef0e0
shard llama model after conversion and unshard on loading (#174)
2023-12-25 11:19:43 -08:00
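The shard/unshard idea, sketched with mx.savez and mx.load; a real converter would typically shard by file size, while this toy version shards by tensor count:

```python
import glob

import mlx.core as mx


def save_sharded(weights: dict, per_shard: int = 100) -> None:
    # Split the weight dict across several .npz files at convert time.
    items = list(weights.items())
    for n, i in enumerate(range(0, len(items), per_shard)):
        mx.savez(f"weights.{n:02d}.npz", **dict(items[i : i + per_shard]))


def load_unsharded(pattern: str = "weights.*.npz") -> dict:
    # Merge all shards back into one dict when loading the model.
    weights = {}
    for f in sorted(glob.glob(pattern)):
        weights.update(mx.load(f))
    return weights
```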
Daniel Strobusch
848f118ac5
use non-zero exit code on error (#177)
2023-12-23 07:10:13 -08:00
Alvaro Bartolome
f4709cb807
Align CLI args and some smaller fixes (#167)
* Add `.DS_Store` files to `.gitignore`
* Fix variable naming of `config` in `mixtral/convert.py`
* Align CLI args and minor fixes
* standardize
* one more
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:34:32 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors for Mistral (#176)
* Fix conversion + inference errors.
* wire rope_theta through to nn.RoPE
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
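The second bullet passes the model's rope_theta from config.json into nn.RoPE rather than relying on the default base; a minimal sketch with an illustrative value:

```python
import mlx.nn as nn

rope_theta = 1000000.0  # read from config.json; value here is illustrative
rope = nn.RoPE(dims=128, traditional=False, base=rope_theta)
```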
Awni Hannun
3cf436b529
Quantize example (#162)
* testing quantization
* conversion + quantization working
* one config processor
* quantization in mistral / nits in llama
* args for quantization
* llama / mistral conversion in good shape
* phi2 quantized
* mixtral
* qwen conversion
2023-12-21 12:59:37 -08:00
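Underneath the example sit the core primitives mx.quantize and mx.dequantize; a quick round-trip sanity check:

```python
import mlx.core as mx

w = mx.random.normal((64, 128))
w_q, scales, biases = mx.quantize(w, group_size=64, bits=4)
w_hat = mx.dequantize(w_q, scales, biases, group_size=64, bits=4)

# The reconstruction error should be small relative to the weights.
print(mx.abs(w - w_hat).max())
```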
Pedro Cuenca
ce30cc3d8f
Use config.json in llama (#159)
* Use config.json in llama
* Fix pop
* Fix convert
* Typo
2023-12-20 10:34:44 -08:00
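Reading hyperparameters from config.json, with a pop-with-default of the kind the "Fix pop" bullet suggests; the keys shown are standard llama config fields:

```python
import json

with open("config.json") as f:
    config = json.load(f)

# pop with a default tolerates configs that lack the key entirely.
config.pop("_name_or_path", None)
n_heads = config["num_attention_heads"]
n_layers = config["num_hidden_layers"]
```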
Awni Hannun
27c0a8c002
Add llms subdir + update README (#145)
* add llms subdir + update README
* nits
* use same pre-commit as mlx
* update readmes a bit
* format
2023-12-20 10:22:25 -08:00