Commit Graph

8 Commits

Author SHA1 Message Date
Awni Hannun
ee60e2a9d5
Kv cache (#643)
* in place kv_cache

* fix

* fix kv cache size

* partially fix kv cache dtype

* step kv cache

* multiple of step size

* more teests + kv cache

* more kv cache

* udpate all models to use kv cache
2024-05-08 08:18:13 -07:00
Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API (#680)
* more async eval

* quantize embedding / update quantize api

* more updates for quantize

* update for quantize embeddings

* update sd quant API

* update sdxl quants

* error for datasets < batch_size

* async

* fix config loading

* fix quant

* fix tests

* fix req

* remove lm head if tie weights is true

* fix test
2024-04-18 18:16:10 -07:00
Chime Ogbuji
f6283ef7ce
Configurable LR schedulers (#604)
* Initial config handler and test

* Added means to run from CLI

* Update lora config loading and tests

* Constrain scheduler config (warmup and minimum LR) for each kind

* Update reference to moved schedule_config module

* Minor fix

* Fix typos

* Moved build_schedule and tests

* nits in schedule config

* flake

* fix path

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-29 13:41:10 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm (#603)
* use nn.RMSNorm, use sdpa, cleanup

* bump mlx versions

* minor update

* use fast layer norm

* version bump

* update requirement for whisper

* update requirement for gguf
2024-03-23 07:13:51 -07:00
Awni Hannun
8b05bb6d18
[mlx-lm] Use sdpa in llama / mistral model (#515)
* use sdpa

* update a few more models

* version

* fix stablelm type
2024-03-07 17:41:23 -08:00
Awni Hannun
7cdd1b69ac
Enable unit testing in Circle and start some MLX LM tests (#545)
* add a few tests for mlx lm

* add a few tests for mlx lm

* add a few tests for mlx lm

* more tests / cleanup
2024-03-07 09:31:57 -08:00
Awni Hannun
f24edfa9dc
[mlx-lm] Add precompiled normalizations (#451)
* add precompiled normalizations

* nits
2024-02-22 12:40:55 -08:00
Awni Hannun
ab9172baac
Gemma support (#474)
* gemma support

* format

* lora support for gemma
2024-02-21 08:47:13 -08:00