mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-16 02:08:55 +08:00

Files

Angelos Katharopoulos 9f671228cd Block sparse MM MoEs (#782)

- Adds SwitchLinear
- Adds QuantizedSwitchLinear

2024-05-21 15:58:08 -07:00

__init__.py

Mlx llm package (#301)

2024-01-12 10:25:56 -08:00

base.py

Support non incremental kv cache growth (#766)

2024-05-15 12:56:24 -07:00

cohere.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

dbrx.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

gemma.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

llama.py

Fix llama cache check (#763)

2024-05-08 08:35:54 -07:00

minicpm.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

mixtral.py

Block sparse MM MoEs (#782)

2024-05-21 15:58:08 -07:00

olmo.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

openelm.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

phi3.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

phi.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

phixtral.py

Block sparse MM MoEs (#782)

2024-05-21 15:58:08 -07:00

plamo.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

qwen2_moe.py

Block sparse MM MoEs (#782)

2024-05-21 15:58:08 -07:00

qwen2.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

qwen.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

stablelm.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

starcoder2.py

Kv cache (#643)

2024-05-08 08:18:13 -07:00

switch_layers.py

Block sparse MM MoEs (#782)

2024-05-21 15:58:08 -07:00