# Mistral 

An example of generating text with Mistral using MLX.

Mistral 7B is one of the strongest large language models in its size class.
Its weights are released under the permissive Apache 2.0 license[^1].

### Setup

Install the dependencies:

```
pip install -r requirements.txt
```

Next, download the model and tokenizer:

```
curl -O https://files.mistral-7b-v0-1.mistral.ai/mistral-7B-v0.1.tar
tar -xf mistral-7B-v0.1.tar
```

Then, convert the weights with:

```
python convert.py --torch-path <path_to_torch>
```

To generate a 4-bit quantized model, use `-q`. For a full list of options:

```
python convert.py --help
```

By default, the conversion script creates the `mlx_model` directory and saves
the converted `weights.npz`, `tokenizer.model`, and `config.json` there.
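
To confirm the conversion finished, you can check that the output directory
holds the three files listed above. The snippet below is a minimal sketch, not
part of this example's scripts: the helper `missing_model_files` is a
hypothetical name, and it assumes the default `mlx_model` directory.

```python
from pathlib import Path

# File names come from the paragraph above; the helper itself is hypothetical.
EXPECTED_FILES = ("weights.npz", "tokenizer.model", "config.json")


def missing_model_files(model_dir="mlx_model"):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from mlx_model:", ", ".join(missing))
    else:
        print("mlx_model looks complete.")
```

If you passed a different output directory to `convert.py`, pass the same path
to `missing_model_files`.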

> [!TIP]
> Alternatively, you can download a few converted checkpoints from the
> [MLX Community](https://huggingface.co/mlx-community) organization on Hugging
> Face and skip the conversion step.


### Run

Once you've converted the weights to MLX format, you can generate text with
the Mistral model:

```
python mistral.py --prompt "It is a truth universally acknowledged,"
```

Run `python mistral.py --help` for more details.
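
If you want to run several prompts from a script, one option is to call
`mistral.py` through `subprocess`. This is a rough sketch under a few
assumptions: it is run from this example's directory, the generated text is
written to standard output, and `build_command` and `generate` are hypothetical
helper names. Only the `--prompt` flag shown above is used.

```python
import subprocess
import sys


def build_command(prompt, script="mistral.py"):
    """Build the argument list for one generation run."""
    # sys.executable reuses the interpreter that has the dependencies installed.
    return [sys.executable, script, "--prompt", prompt]


def generate(prompt):
    """Run the script once and return whatever it prints to stdout."""
    result = subprocess.run(
        build_command(prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return result.stdout
```

Each call starts a new process and reloads the weights, so this is only
practical for a handful of prompts.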

[^1]: Refer to the [blog post](https://mistral.ai/news/announcing-mistral-7b/)
and [GitHub repository](https://github.com/mistralai/mistral-src) for more
details.