Awni Hannun
e4b19bb9e1
Make attention faster for some models (#574)
...
* make attention faster for a couple models
* remove unused generation flags
* add comment on lora
* include text files as well
2024-03-14 21:35:54 -07:00
Awni Hannun
7cdd1b69ac
Enable unit testing in Circle and start some MLX LM tests (#545)
...
* add a few tests for mlx lm
* more tests / cleanup
2024-03-07 09:31:57 -08:00
Awni Hannun
f24edfa9dc
[mlx-lm] Add precompiled normalizations (#451)
...
* add precompiled normalizations
* nits
2024-02-22 12:40:55 -08:00
Awni Hannun
8fd953ee2b
Support for slerp merging models (#455)
...
* support for slerp merging models
* docs
* update docs
* format
2024-02-19 20:37:15 -08:00
Angelos Katharopoulos
f71e965d57
Change gqa to use repeat instead of concatenate (#443)
2024-02-14 17:40:11 -08:00
Awni Hannun
d4666615bb
Lazy import + refactor LoRA layer addition (#426)
...
* lazy model import in mlx_lm
* change lora loading
* fix olmo lora
* remove a bunch of unused stuff from plamo
* move phixtral to mlx-lm and out of llms/
2024-02-12 10:51:02 -08:00