Mixtral 8x7B
Run the Mixtral 8x7B mixture-of-experts (MoE) model in MLX on Apple silicon.
Note: at 16-bit precision this model needs a machine with substantial RAM (~100 GB) to run.
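For context, a mixture-of-experts layer replaces a single feed-forward block with several expert networks plus a router that sends each token to only a few of them, so only a fraction of the total parameters is active per token. Below is a minimal NumPy sketch of top-2 routing; the shapes, names, and linear experts are illustrative assumptions, not the actual Mixtral implementation.

```python
import numpy as np

def moe_layer(x, gate_w, experts, top_k=2):
    """Toy sparse MoE layer: route each token to its top-k experts.

    x       : (tokens, dim) activations
    gate_w  : (dim, num_experts) router weights
    experts : list of callables, one feed-forward net per expert
    """
    logits = x @ gate_w                               # (tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]     # indices of the top-k experts
    sel = np.take_along_axis(logits, top, axis=-1)    # their logits
    weights = np.exp(sel - sel.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)         # softmax over selected experts only

    out = np.zeros_like(x)
    for t in range(x.shape[0]):                       # plain loops for clarity, not speed
        for k in range(top_k):
            e = top[t, k]
            out[t] += weights[t, k] * experts[e](x[t])
    return out

# Tiny usage example: 4 tokens, dim 8, 4 experts that are simple linear maps.
dim, n_experts = 8, 4
x = np.random.randn(4, dim)
gate_w = np.random.randn(dim, n_experts)
experts = [lambda v, W=np.random.randn(dim, dim): v @ W for _ in range(n_experts)]
print(moe_layer(x, gate_w, experts).shape)  # (4, 8)
```

Because only the selected experts run for each token, the number of active parameters per token is much smaller than the model's total parameter count.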
Setup
Install Git Large File Storage. For example with Homebrew:
brew install git-lfs
Download the models from Hugging Face:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/mistralai/Mixtral-8x7B-v0.1/
cd Mixtral-8x7B-v0.1/ && \
git lfs pull --include "consolidated.*.pt" && \
git lfs pull --include "tokenizer.model"
Now, from mlx-examples/mixtral, convert and save the weights as NumPy arrays so MLX can read them:
python convert.py --model_path Mixtral-8x7B-v0.1/
The conversion script will save the converted weights in the same location.
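For a rough idea of what the conversion does, the sketch below loads each PyTorch checkpoint shard and re-saves its tensors as float16 NumPy arrays in an .npz file. The file naming, dtype handling, and output layout are assumptions for illustration; see convert.py for the actual logic.

```python
# Simplified sketch of the conversion step (not the real convert.py).
import glob
import numpy as np
import torch

model_path = "Mixtral-8x7B-v0.1/"
for shard in sorted(glob.glob(model_path + "consolidated.*.pt")):
    state = torch.load(shard, map_location="cpu")
    # Cast each tensor to float16 and convert it to a NumPy array.
    arrays = {k: v.to(torch.float16).numpy() for k, v in state.items()}
    np.savez(shard.replace(".pt", ".npz"), **arrays)
```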
Generate
As easy as:
python mixtral.py --model_path Mixtral-8x7B-v0.1/
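Under the hood, generation is the usual autoregressive loop: sample a token from the model's next-token distribution, append it to the sequence, and repeat. The toy sketch below shows that loop with temperature sampling; the random `toy_model` is a stand-in placeholder, not the script's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 32000  # size of Mixtral's SentencePiece vocabulary

def toy_model(tokens):
    # Stand-in for the real network: returns random next-token logits.
    return rng.normal(size=VOCAB)

def generate(prompt_tokens, max_tokens=8, temp=0.7):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        logits = toy_model(tokens) / temp
        # Temperature sampling: softmax over scaled logits, then draw a token.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        tokens.append(int(rng.choice(VOCAB, p=probs)))
    return tokens

print(generate([1, 42, 7]))
```

The script itself exposes additional options such as the prompt and sampling settings; run python mixtral.py --help to list them.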