mlx-examples/llms/mlx_lm/models
Goekdeniz-Guelmez 5a6ada2df0 getting really close:
python -m mlx_lm.generate --model /Users/gokdenizgulmez/Desktop/Mamba-Codestral-7B-v0.1-4bit --prompt "# A function that computes fibonacci
def fibonacci(" -m 64
==========
n):
    print(f"{os.path.abspath(".")/data/data/data/com.android.launcher.png)

## 🙌🏼 🙌🙌🙌🙌🙌🙌

class _State(Enum):
    def __init__ (self
==========
Prompt: 16 tokens, 84.547 tokens-per-sec
Generation: 64 tokens, 13.774 tokens-per-sec
Peak memory: 4.139 GB
2025-01-21 20:44:51 +01:00
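
For reference, the same generation can be reproduced through the Python API rather than the CLI call quoted in the commit above. This is a minimal sketch, assuming mlx_lm's load/generate helpers and the local checkpoint path shown in the commit message:

# Minimal sketch: Python-API equivalent of the CLI call above.
# Assumes mlx_lm exposes load/generate and that the local 4-bit
# Mamba-Codestral conversion exists at this path.
from mlx_lm import load, generate

model, tokenizer = load("/Users/gokdenizgulmez/Desktop/Mamba-Codestral-7B-v0.1-4bit")
prompt = "# A function that computes fibonacci\ndef fibonacci("
output = generate(model, tokenizer, prompt=prompt, max_tokens=64, verbose=True)
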
__init__.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
base.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
cache.py Fixed streaming generation and got rid of generating gibberish, but it is still a little slow: 0.222 tokens-per-sec 2024-11-21 22:01:28 +01:00
cohere2.py Fix Cohere2: mask shape error (long context) (#1202) 2025-01-12 12:58:08 -08:00
cohere.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
dbrx.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
deepseek_v2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
deepseek_v3.py deepseek v3 model with pipeline parallelism (#1191) 2025-01-09 15:55:53 -08:00
deepseek.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
exaone.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
gemma2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
gemma.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
gpt2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
gpt_bigcode.py fix gpt bigcode (#1204) 2025-01-13 10:22:32 -08:00
gpt_neox.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
hunyuan.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
internlm2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
internlm3.py add internlm3 (#1206) 2025-01-15 14:55:41 -08:00
llama.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
mamba2.py getting really close: 2025-01-21 20:44:51 +01:00
mamba.py Add support for falcon-mamba (#1074) 2024-11-04 12:23:30 -08:00
minicpm.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
mixtral.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
nemotron.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
olmo2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
olmo.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
openelm.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
phi3.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
phi3small.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
phi.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
phimoe.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
phixtral.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
plamo.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
qwen2_moe.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
qwen2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
qwen.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
recurrent_gemma.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
rope_utils.py Adds EXAONE architecture. (#1145) 2024-12-09 07:58:25 -08:00
stablelm.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
starcoder2.py Length masking for batch inputs (#1173) 2024-12-18 19:43:52 -08:00
su_rope.py Add Phi-3.5-MoE (#946) 2024-08-24 06:52:33 -07:00
switch_layers.py Handle longer prompt/generation (#931) 2024-08-16 15:28:39 -07:00
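
Each architecture file above is a self-contained module, and a loader can pick one by matching the model_type field of a checkpoint's config.json to a module name. A minimal sketch of that dispatch pattern (not necessarily the exact mlx_lm loader), assuming each module exposes Model and ModelArgs classes:

# Hypothetical sketch of model_type -> module dispatch for this directory.
# Assumes each module (llama.py, mamba2.py, ...) defines Model and ModelArgs.
import importlib

def resolve_architecture(model_type: str):
    # e.g. "llama" -> mlx_lm.models.llama, "mamba2" -> mlx_lm.models.mamba2
    module = importlib.import_module(f"mlx_lm.models.{model_type}")
    return module.Model, module.ModelArgs
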