# Llama

An example of generating text with Llama (1 or 2) using MLX.

Llama is a set of open source language models from Meta AI Research[^1][^2]
ranging from 7B to 70B parameters.

### Setup

Install the dependencies:

```
pip install -r requirements.txt
```

Next, download and convert the model. If you do not have access to the model
weights you will need to [request
access](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
from Meta.

Alternatively, you can download select converted checkpoints from the
[mlx-llama](https://huggingface.co/mlx-llama) community organisation on Hugging
Face and skip the conversion step.

Convert the weights with:

```
python convert.py --model_path <path_to_torch_model>
```

The conversion script saves the converted weights in the same location as the
original weights.
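
If you prefer to drive the conversion from Python, the following is a minimal
sketch rather than part of this example. It assumes it is run from the directory
that contains `convert.py` and that `<path_to_torch_model>` is a directory. The
helper names `build_convert_command` and `convert_and_report` are hypothetical.
Because the names of the converted files are not stated here, the sketch only
reports whatever new files appear after the script finishes.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(model_path):
    """Return the argv list for the conversion command shown above."""
    return [sys.executable, "convert.py", "--model_path", str(model_path)]


def convert_and_report(model_path):
    """Run convert.py and return the names of files it added to model_path.

    Assumption: model_path is a directory and the current working directory
    contains convert.py.
    """
    model_dir = Path(model_path)
    before = {p.name for p in model_dir.iterdir()}
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(model_dir), check=True)
    after = {p.name for p in model_dir.iterdir()}
    return sorted(after - before)
```

Calling `convert_and_report("<path_to_torch_model>")` with your own path returns
the list of newly created file names, which is a quick way to confirm where the
converted weights ended up.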

### Run

Once you've converted the weights to MLX format, you can interact with the
Llama model:

```
python llama.py <path_to_model> <path_to_tokenizer.model> "hello"
```

Run `python llama.py --help` for more details.
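
To call the script from another Python program, one possible approach is a small
wrapper around the command above. This is a sketch under assumptions, not part of
the example: `build_generate_command` and `generate` are hypothetical helper
names, and the wrapper assumes it is run from the directory that contains
`llama.py`. It uses only the three positional arguments shown above and checks
that both paths exist before launching the script.

```python
import subprocess
import sys
from pathlib import Path


def build_generate_command(model_path, tokenizer_path, prompt):
    """Return the argv list matching the run command shown above."""
    return [sys.executable, "llama.py", str(model_path), str(tokenizer_path), prompt]


def generate(model_path, tokenizer_path, prompt):
    """Validate the inputs, then run llama.py and stream its output."""
    for path in (model_path, tokenizer_path):
        if not Path(path).exists():
            raise FileNotFoundError(f"Path does not exist: {path}")
    # Output goes straight to the terminal; check=True raises on failure.
    subprocess.run(
        build_generate_command(model_path, tokenizer_path, prompt),
        check=True,
    )
```

Failing early on a missing path gives a clearer error than whatever the script
would report after it has started loading.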

[^1]: For Llama v1 refer to the [arXiv paper](https://arxiv.org/abs/2302.13971) and [blog post](https://ai.meta.com/blog/large-language-model-llama-meta-ai/) for more details.
[^2]: For Llama v2 refer to the [blog post](https://ai.meta.com/llama/).