mlx-examples

zhangyiss/mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-16 02:08:55 +08:00

Goekdeniz-Guelmez 313d4a2ac9 summarize segsum

2025-02-28 15:04:03 +01:00

__init__.py

Mlx llm package (#301 )

2024-01-12 10:25:56 -08:00

base.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

cache.py

Fixed streaming generation and got rid of generating gibberish, but is still a little slow: 0.222 tokens-per-sec

2024-11-21 22:01:28 +01:00

cohere2.py

Fix Cohere2: mask shape error (long context) (#1202 )

2025-01-12 12:58:08 -08:00

cohere.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

dbrx.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

deepseek_v2.py

fix sharding for more even number of layers (#1276 )

2025-02-11 16:26:59 -08:00

deepseek_v3.py

fix sharding for more even number of layers (#1276 )

2025-02-11 16:26:59 -08:00

deepseek.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

exaone.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

gemma2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

gemma.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

gpt2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

gpt_bigcode.py

fix gpt bigcode (#1204 )

2025-01-13 10:22:32 -08:00

gpt_neox.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

granite.py

Add IBM granite model (#1265 )

2025-02-08 15:46:15 -08:00

helium.py

Optimizations for mamba1 (#1213 )

2025-02-03 13:36:08 -08:00

hunyuan.py

support hunyuan 7b (#1263 )

2025-02-08 15:46:47 -08:00

internlm2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

internlm3.py

add internlm3 (#1206 )

2025-01-15 14:55:41 -08:00

llama.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

mamba2_pytorch.py

correct segsum function

2025-02-26 14:46:46 +01:00

mamba2.py

summarize segsum

2025-02-28 15:04:03 +01:00

mamba.py

Optimizations for mamba1 (#1213 )

2025-02-03 13:36:08 -08:00

minicpm.py

Optimizations for mamba1 (#1213 )

2025-02-03 13:36:08 -08:00

mixtral.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

nemotron.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

olmo2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

olmo.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

openelm.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

phi3.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

phi3small.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

phi.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

phimoe.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

phixtral.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

plamo2.py

Add plamo-2-1b model (#1283 )

2025-02-24 19:24:43 -08:00

plamo.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

qwen2_moe.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

qwen2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

qwen.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

recurrent_gemma.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

rope_utils.py

Adds EXAONE architecture. (#1145 )

2024-12-09 07:58:25 -08:00

stablelm.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

starcoder2.py

Length masking for batch inputs (#1173 )

2024-12-18 19:43:52 -08:00

su_rope.py

Add Phi-3.5-MoE (#946 )

2024-08-24 06:52:33 -07:00

switch_layers.py

Handle longer prompt/generation (#931 )

2024-08-16 15:28:39 -07:00