mlx-examples/llms/mlx_lm/models
Awni Hannun e4b19bb9e1
Make attention faster for a some models (#574)
* make attention faster for a couple models

* remove unused generation flags

* add comment on lora

* include text files as well
2024-03-14 21:35:54 -07:00
__init__.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
base.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
cohere.py Add support for Cohere's Command-R (#565) 2024-03-13 07:03:36 -07:00
gemma.py [mlx-lm] Use sdpa in llama / mistral model (#515) 2024-03-07 17:41:23 -08:00
layers.py Add support for Cohere's Command-R (#565) 2024-03-13 07:03:36 -07:00
llama.py chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566) 2024-03-12 21:34:32 -07:00
mixtral.py Make attention faster for a some models (#574) 2024-03-14 21:35:54 -07:00
olmo.py Enable unit testing in Circle and start some MLX LM tests (#545) 2024-03-07 09:31:57 -08:00
phi.py Make attention faster for a some models (#574) 2024-03-14 21:35:54 -07:00
phixtral.py Make attention faster for a some models (#574) 2024-03-14 21:35:54 -07:00
plamo.py Enable unit testing in Circle and start some MLX LM tests (#545) 2024-03-07 09:31:57 -08:00
qwen2.py chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566) 2024-03-12 21:34:32 -07:00
qwen.py Enable unit testing in Circle and start some MLX LM tests (#545) 2024-03-07 09:31:57 -08:00
stablelm.py [mlx-lm] Use sdpa in llama / mistral model (#515) 2024-03-07 17:41:23 -08:00
starcoder2.py chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566) 2024-03-12 21:34:32 -07:00