mirror of
https://github.com/ml-explore/mlx-examples.git
synced 2025-07-26 14:24:11 +08:00
45 lines
1.1 KiB
Markdown
## Mixtral 8x7B

Run the Mixtral[^mixtral] 8x7B mixture-of-experts (MoE) model in MLX on Apple silicon.

Note: at 16-bit precision, this model needs a machine with substantial RAM (~100 GB) to run.
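
The RAM figure follows from simple arithmetic. Here is a rough sketch, assuming the roughly 46.7 billion total parameters that Mistral reports for Mixtral 8x7B (that count comes from Mistral's announcement, not from this README) and counting only the weights, not activations or other runtime overhead:

```python
# Rough weight-memory estimate. The parameter count is an assumption taken
# from Mistral's announcement (about 46.7 billion total parameters).
TOTAL_PARAMS = 46.7e9


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    bytes_per_param = bits_per_param / 8
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"16-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, 16):.0f} GB")
```

Under these assumptions the weights alone take about 93 GB, which is consistent with needing a machine with around 100 GB of RAM once working memory is included.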

### Setup

Install [Git Large File Storage](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage).
For example, with Homebrew:

```
brew install git-lfs
```

Download the model weights and tokenizer from Hugging Face:

```
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/mistralai/Mixtral-8x7B-v0.1/
cd Mixtral-8x7B-v0.1/ && \
  git lfs pull --include "consolidated.*.pt" && \
  git lfs pull --include "tokenizer.model"
```
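
Because the clone skips LFS content, a file that was never pulled stays behind as a small text pointer rather than the real data. The following is a minimal sketch of a sanity check to run before converting. The helper name `check_download` is hypothetical and not part of this repository; it only checks the file patterns named in the commands and the standard Git LFS pointer header:

```python
from pathlib import Path

# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"
# File patterns taken from the `git lfs pull --include` commands.
EXPECTED_PATTERNS = ("consolidated.*.pt", "tokenizer.model")


def check_download(model_dir: str) -> list[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    root = Path(model_dir)
    problems = []
    for pattern in EXPECTED_PATTERNS:
        matches = sorted(root.glob(pattern))
        if not matches:
            problems.append(f"no file matches {pattern}")
        for path in matches:
            # Only read the first few bytes; the real files are very large.
            with path.open("rb") as f:
                head = f.read(len(LFS_POINTER_PREFIX))
            if head == LFS_POINTER_PREFIX:
                problems.append(f"{path.name} is still an LFS pointer")
    return problems


if __name__ == "__main__":
    for problem in check_download("Mixtral-8x7B-v0.1/"):
        print(problem)
```

If the script prints nothing, every expected file is present and none of them is an unresolved pointer. It does not verify checksums or file sizes.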

Now, from `mlx-examples/mixtral`, convert and save the weights as NumPy arrays so
MLX can read them:

```
python convert.py --model_path Mixtral-8x7B-v0.1/
```

The conversion script saves the converted weights in the same location.
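
To confirm what the conversion produced without loading tens of gigabytes into memory, one option is to list the array names in the output file. This sketch assumes the script writes a NumPy `.npz` archive (the README only says "NumPy arrays") and uses a hypothetical file name, `weights.npz`; check the model directory for the actual output. It relies on the fact that an `.npz` file is a zip archive in which each array is stored as a `<name>.npy` member:

```python
import zipfile


def list_npz_arrays(npz_path: str) -> list[str]:
    """List the array names stored in a NumPy .npz archive without loading them."""
    with zipfile.ZipFile(npz_path) as archive:
        # np.savez stores each array as a member named "<key>.npy".
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]


if __name__ == "__main__":
    # Hypothetical file name; adjust to whatever the conversion script wrote.
    for name in list_npz_arrays("Mixtral-8x7B-v0.1/weights.npz"):
        print(name)
```

Only the zip directory is read, so this runs quickly even on very large archives.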

### Generate

As easy as:

```
python mixtral.py --model_path Mixtral-8x7B-v0.1/
```

[^mixtral]: Refer to Mistral's [blog post](https://mistral.ai/news/mixtral-of-experts/) for more details.