9 Commits

Author SHA1 Message Date
Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API (#680)
* more async eval

* quantize embedding / update quantize api

* more updates for quantize

* update for quantize embeddings

* update sd quant API

* update sdxl quants

* error for datasets < batch_size

* async

* fix config loading

* fix quant

* fix tests

* fix req

* remove lm head if tie weights is true

* fix test
2024-04-18 18:16:10 -07:00
Awni Hannun
9c5554d8ee
Use async eval (#670)
* Use async eval

* bump

* bump

* remove workaround for bfloat cumsum
2024-04-11 13:18:23 -07:00
Awni Hannun
c68aa3c7c3
Stable lm 2 (#666)
* stable lm 2

* test and lora

* version bump

* merge stable models
2024-04-08 14:18:55 -07:00
Awni Hannun
c386dd5f5a
Fix for cohere plus (#650)
* fix for cohere plus

* version bump
2024-04-05 14:11:24 -07:00
Awni Hannun
2bd64b78cf
Save lora config (#636)
* lora config

* comments

* version bump
2024-04-02 13:52:53 -07:00
Anchen
fe96ef342f
feat(mlx-lm): export the GGUF (fp16) format model weights from fuse.py (#555)
* wip

* wip

* feat: convert mlx model to gguf f16

* chore: convert norm layer to float32 to avoid overflow issue

* chore: add support for mixtral

* chore: clean up

* chore: remove unused import statement

* chore: clean up weight name mapping

* version and readme

* actual version bump

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-21 10:34:11 -07:00
Awni Hannun
14fe868825
version (#570)
2024-03-13 10:09:36 -07:00
Awni Hannun
8b05bb6d18
[mlx-lm] Use sdpa in llama / mistral model (#515)
* use sdpa

* update a few more models

* version

* fix stablelm type
2024-03-07 17:41:23 -08:00
Awni Hannun
95f82e67a2
Fix import warning (#479)
* fix import warning

* fix version import

* remove api, move convert to utils

* also update circle to run external PRs
2024-02-27 08:47:56 -08:00