mlx-examples/llms/mlx_lm/models
Prince Canuma d661440dbb
Add support for qwen2moe (#640)
* add sparsemoe block and update decoder logic

* update file name to match HF

* update name

* Code formatting

* update gates calculation

* add support for Qwen2MoE.

* fix pytest

* code formatting and fix missing comma in utils

* Remove decoder sparse step.

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>

* remove gate layer anti-quantisation

* remove unused argument

2024-04-02 11:33:29 -07:00
__init__.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
base.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
cohere.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
dbrx.py DBRX (#628) 2024-03-28 21:03:53 -07:00
gemma.py Configurable LR schedulers (#604) 2024-03-29 13:41:10 -07:00
llama.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
mixtral.py DBRX (#628) 2024-03-28 21:03:53 -07:00
olmo.py Configurable LR schedulers (#604) 2024-03-29 13:41:10 -07:00
phi.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
phixtral.py DBRX (#628) 2024-03-28 21:03:53 -07:00
plamo.py Configurable LR schedulers (#604) 2024-03-29 13:41:10 -07:00
qwen2_moe.py Add support for qwen2moe (#640) 2024-04-02 11:33:29 -07:00
qwen2.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
qwen.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
stablelm.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00
starcoder2.py Switch to fast RMS/LN Norm (#603) 2024-03-23 07:13:51 -07:00