## Mixtral 8x7B
Run the Mixtral[^mixtral] 8x7B mixture-of-experts (MoE) model in MLX on Apple silicon.
This example also supports the instruction fine-tuned Mixtral model.[^instruct]
Note that, in 16-bit precision, this model needs a machine with substantial RAM (~100GB) to run.
### Setup
Install [Git Large File
Storage](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage).
For example with Homebrew:
```
brew install git-lfs
```
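If this is your first time using Git LFS, you may also need to set it up once for your user account:
```
git lfs install
```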
Download the models from Hugging Face:
For the base model use:
```
export MIXTRAL_MODEL=Mixtral-8x7B-v0.1
```
For the instruction fine-tuned model use:
```
export MIXTRAL_MODEL=Mixtral-8x7B-Instruct-v0.1
```
Then run:
```
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/mistralai/${MIXTRAL_MODEL}/
cd $MIXTRAL_MODEL/ && \
git lfs pull --include "consolidated.*.pt" && \
git lfs pull --include "tokenizer.model"
```
Now from `mlx-examples/mixtral` convert and save the weights as NumPy arrays so
MLX can read them:
```
python convert.py --model_path $MIXTRAL_MODEL/
```
The conversion script will save the converted weights in the same location.
### Generate
As easy as:
```
python mixtral.py --model_path $MIXTRAL_MODEL/
```
For more options, including how to prompt the model, run:
```
python mixtral.py --help
```
For the instruction fine-tuned model, make sure to follow the prompt format:
```
[INST] Instruction prompt [/INST]
```
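For example, assuming the script accepts the prompt via a `--prompt` flag (run `python mixtral.py --help` to confirm the exact option names), a command for the instruction fine-tuned model could look like:
```
python mixtral.py --model_path $MIXTRAL_MODEL/ \
  --prompt "[INST] Write a haiku about Apple silicon. [/INST]"
```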
[^mixtral]: Refer to Mistral's [blog post](https://mistral.ai/news/mixtral-of-experts/) and the [Hugging Face blog post](https://huggingface.co/blog/mixtral) for more details.
[^instruct]: Refer to the [Hugging Face repo](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) for more details.