# Llama

An example of generating text with Llama (1 or 2) using MLX.

Llama is a set of open source language models from Meta AI Research[^1][^2] ranging from 7B to 70B parameters.

## Setup

Install the dependencies:

```
pip install -r requirements.txt
```

Next, download and convert the model. If you do not have access to the model weights, you will need to request access from Meta.

Alternatively, you can download select converted checkpoints from the mlx-llama community organisation on Hugging Face and skip the conversion step.
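
If you go that route, one way to fetch a checkpoint is with the `huggingface-cli` tool from the `huggingface_hub` package. The repository name below is hypothetical; substitute one that actually exists in the mlx-llama organisation:

```
# Hypothetical repository name; pick a real one from the mlx-llama organisation.
huggingface-cli download mlx-llama/Llama-2-7b-mlx --local-dir Llama-2-7b-mlx
```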

Convert the weights with:

```
python convert.py --model_path <path_to_torch_model>
```

The conversion script will save the converted weights in the same location.
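
For intuition, the conversion is essentially loading the PyTorch checkpoint and re-saving its arrays in a format MLX can read. The sketch below is a simplified illustration, not the script's exact logic; in particular it assumes a single-file checkpoint, while `convert.py` also handles the sharded weights of the larger models:

```python
# Simplified sketch of weight conversion; convert.py is the real implementation.
import numpy as np
import torch

# Llama checkpoints ship as consolidated.*.pth files of torch tensors.
state = torch.load("consolidated.00.pth", map_location="cpu")

# Cast to float16 and write a NumPy archive that MLX can load.
np.savez(
    "weights.npz",
    **{name: param.to(torch.float16).numpy() for name, param in state.items()},
)
```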

## Run

Once you've converted the weights to MLX format, you can interact with the Llama model:

```
python llama.py <path_to_model> <path_to_tokenizer.model> "hello"
```

Run `python llama.py --help` for more details.
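
Under the hood, generation is an autoregressive loop: the model maps the tokens so far to next-token logits, one token is picked, and the process repeats. A minimal greedy version looks roughly like the sketch below, where `model` and `tokenizer` stand in for the objects `llama.py` builds; the names and signatures here are assumptions, not the script's actual API:

```python
# Greedy decoding sketch. Assumes `model` maps a (1, T) token array to
# (1, T, vocab) logits and `tokenizer` is a SentencePiece processor.
import mlx.core as mx

def generate(model, tokenizer, prompt, max_tokens=64):
    tokens = [tokenizer.bos_id()] + tokenizer.encode(prompt)
    for _ in range(max_tokens):
        logits = model(mx.array(tokens)[None])        # forward pass over the prefix
        next_token = mx.argmax(logits[0, -1]).item()  # greedy: take the top logit
        if next_token == tokenizer.eos_id():
            break
        tokens.append(next_token)
    return tokenizer.decode(tokens)
```

A real implementation typically also caches attention keys and values so each step processes only the newest token rather than the whole prefix.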


[^1]: For Llama v1, refer to the arXiv paper and blog post for more details.
[^2]: For Llama v2, refer to the blog post.