Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API ( #680 )
...
* more async eval
* quantize embedding / update quantize api
* more updates for quantize
* update for quantize embeddings
* update sd quant API
* update sdxl quants
* error for datasets < batch_size
* async
* fix config loading
* fix quant
* fix tests
* fix req
* remove lm head if tie weights is true
* fix test
2024-04-18 18:16:10 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm ( #603 )
...
* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
2024-03-23 07:13:51 -07:00
bojanbabic
61297f547b
Missing requirements needed for convert script ( #320 )
...
* fix requirements and add eos parameter
* fix black
* address comment
* address comments - remove new arg
2024-01-18 19:04:24 -08:00
Vaibhav Srivastav
0eaa323c10
Fix conversion + inference errors. - Mistral ( #176 )
...
* Fix conversion + inference errors.
* wire rope_theta throuugh to nn.RoPE
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-22 14:10:25 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README ( #145 )
...
* add llms subdir + update README
* nits
* use same pre-commit as mlx
* update readmes a bit
* format
2023-12-20 10:22:25 -08:00