## Mixtral 8x7B
Run the Mixtral[^mixtral] 8x7B mixture-of-experts (MoE) model in MLX on Apple silicon.
Note: to run this model in 16-bit precision you need a machine with substantial RAM (~100 GB).
### Setup
Install [Git Large File
Storage](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage).
For example with Homebrew:
```
brew install git-lfs
```
Download the models from Hugging Face:
```
git-lfs clone https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen
```
After that's done, combine the files:
```
cd mixtral-8x7b-32kseqlen/
cat consolidated.00.pth-split0 consolidated.00.pth-split1 consolidated.00.pth-split2 consolidated.00.pth-split3 consolidated.00.pth-split4 consolidated.00.pth-split5 consolidated.00.pth-split6 consolidated.00.pth-split7 consolidated.00.pth-split8 consolidated.00.pth-split9 consolidated.00.pth-split10 > consolidated.00.pth
```
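If you prefer to do this from Python (for example, to check that all eleven parts are present before combining them), the same concatenation can be sketched as below. This is just an equivalent of the `cat` command above, not part of the repo's tooling:

```python
# Sketch: concatenate the split checkpoint parts in order, equivalent to the
# `cat` command above. Assumes the -split0 ... -split10 files all exist.
from pathlib import Path

model_dir = Path("mixtral-8x7b-32kseqlen")
parts = [model_dir / f"consolidated.00.pth-split{i}" for i in range(11)]
assert all(p.exists() for p in parts), "missing a split file"

with open(model_dir / "consolidated.00.pth", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            # Copy in chunks so a multi-GB part is never held in memory at once.
            while chunk := f.read(1 << 24):
                out.write(chunk)
```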
Now from `mlx-examples/mixtral` convert and save the weights as NumPy arrays so
MLX can read them:
```
python convert.py --model_path mixtral-8x7b-32kseqlen/
```
The conversion script will save the converted weights in the same location.
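If you're curious what a conversion step like this involves, the sketch below shows the general idea: load the PyTorch checkpoint, cast the tensors, and write them out as a NumPy `.npz` archive that MLX can read. The output filename and dtype here are illustrative assumptions, not the exact layout `convert.py` uses:

```python
# Sketch of a .pth -> .npz conversion, assuming the checkpoint is a flat
# dict mapping parameter names to tensors. Not the exact logic of convert.py.
import numpy as np
import torch

state = torch.load("mixtral-8x7b-32kseqlen/consolidated.00.pth", map_location="cpu")
# Cast to float16 so every tensor has a NumPy-compatible dtype, then save.
weights = {k: v.to(torch.float16).numpy() for k, v in state.items()}
np.savez("mixtral-8x7b-32kseqlen/weights.npz", **weights)
```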
After that's done, you can remove the original PyTorch checkpoint to free up disk space:
```
rm mixtral-8x7b-32kseqlen/*.pth*
```
### Generate
As easy as:
```
python mixtral.py --model_path mixtral-8x7b-32kseqlen/
```
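The generation script reads the converted weights directly. If you want to poke at them yourself, here is a minimal sketch using MLX's archive loader; the archive name is an assumption, so check what `convert.py` actually wrote in the model directory:

```python
# Minimal sketch: load the converted weights with MLX and inspect a few of them.
# The .npz filename below is assumed, not guaranteed to match convert.py's output.
import mlx.core as mx

weights = mx.load("mixtral-8x7b-32kseqlen/weights.npz")
for name, w in list(weights.items())[:5]:
    print(name, w.shape, w.dtype)
```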
[^mixtral]: Refer to Mistral's [blog post](https://mistral.ai/news/mixtral-of-experts/) for more details.