Wan2.1

Quickstart

Installation

Install dependencies:

pip install -r requirements.txt

Model Download

| Models | Download Link | Notes |
|---|---|---|
| T2V-14B | 🤗 Huggingface 🤖 ModelScope | Supports both 480P and 720P |
| I2V-14B-720P | 🤗 Huggingface 🤖 ModelScope | Supports 720P |
| I2V-14B-480P | 🤗 Huggingface 🤖 ModelScope | Supports 480P |
| T2V-1.3B | 🤗 Huggingface 🤖 ModelScope | Supports 480P |

💡 Note: The 1.3B model can generate videos at 720P, but because it received limited training at that resolution, the results are generally less stable than at 480P. For best results, we recommend 480P. The MLX port currently supports only T2V, so the I2V checkpoints listed above cannot be used with it yet.

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
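
The same download can also be scripted from Python. The sketch below is a minimal example, not part of this repository: it assumes the `huggingface_hub` package installed by the command above, and it assumes the 1.3B checkpoint is published under the same `Wan-AI/` namespace as the 14B one. The helper names `repo_id_for` and `download_checkpoint` are hypothetical.

```python
from pathlib import Path

# Model names taken from the table above. Using the "Wan-AI/" prefix for the
# 1.3B model is an assumption based on the 14B repository ID in this README.
HF_NAMESPACE = "Wan-AI"
T2V_MODELS = ("Wan2.1-T2V-14B", "Wan2.1-T2V-1.3B")


def repo_id_for(model_name: str) -> str:
    """Return the Hugging Face repository ID for a supported T2V model."""
    if model_name not in T2V_MODELS:
        raise ValueError(f"unsupported model for the MLX port: {model_name!r}")
    return f"{HF_NAMESPACE}/{model_name}"


def download_checkpoint(model_name: str, root: str = ".") -> Path:
    """Download a checkpoint into <root>/<model_name> and return that path."""
    # Imported lazily so repo_id_for() works without the dependency installed.
    from huggingface_hub import snapshot_download

    local_dir = Path(root) / model_name
    snapshot_download(repo_id=repo_id_for(model_name), local_dir=str(local_dir))
    return local_dir


# Example (requires network access):
# download_checkpoint("Wan2.1-T2V-1.3B")
```

Calling `download_checkpoint("Wan2.1-T2V-1.3B")` should leave the files in `./Wan2.1-T2V-1.3B`, the directory passed as `--ckpt_dir` in the generation example.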

Download models using modelscope-cli:

pip install modelscope
modelscope download Wan-AI/Wan2.1-T2V-14B --local_dir ./Wan2.1-T2V-14B

Run Text-to-Video Generation

This repository currently supports two Text-to-Video models (1.3B and 14B) and two resolutions (480P and 720P). The parameters and configurations for these models are as follows:

| Task | 480P | 720P | Model |
|---|---|---|---|
| t2v-14B | ✔️ | ✔️ | Wan2.1-T2V-14B |
| t2v-1.3B | ✔️ | | Wan2.1-T2V-1.3B |

(1) Example:
python generate.py --task t2v-1.3B --size "480*832" --frame_num 16 --sample_steps 25 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --prompt "Lion running under snow in Samarkand" --save_file output_video_mlx.mp4
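
When generation is driven from a script, the command above can be assembled programmatically. The following is a sketch under stated assumptions: `TASK_TO_MODEL`, `parse_size`, and `build_command` are hypothetical helpers that do not exist in this repository, only the flags shown in the example are used, and the defaults simply mirror that example.

```python
import shlex

# Task name -> checkpoint directory name, taken from the table above.
TASK_TO_MODEL = {
    "t2v-14B": "Wan2.1-T2V-14B",
    "t2v-1.3B": "Wan2.1-T2V-1.3B",
}


def parse_size(size: str) -> tuple[int, int]:
    """Split a size string such as "480*832" into two positive integers."""
    try:
        first, second = (int(part) for part in size.split("*"))
    except ValueError:
        raise ValueError(f"size must look like '480*832', got {size!r}") from None
    if first <= 0 or second <= 0:
        raise ValueError(f"size values must be positive, got {size!r}")
    return first, second


def build_command(task, prompt, size="480*832", frame_num=16, sample_steps=25,
                  save_file="output_video_mlx.mp4", ckpt_dir=None):
    """Build the argument list for generate.py without running it."""
    if task not in TASK_TO_MODEL:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_TO_MODEL)}")
    parse_size(size)  # fail early on a malformed size string
    if ckpt_dir is None:
        # Assumes the checkpoint was downloaded next to generate.py.
        ckpt_dir = f"./{TASK_TO_MODEL[task]}"
    return [
        "python", "generate.py",
        "--task", task,
        "--size", size,
        "--frame_num", str(frame_num),
        "--sample_steps", str(sample_steps),
        "--ckpt_dir", ckpt_dir,
        "--offload_model", "True",
        "--prompt", prompt,
        "--save_file", save_file,
    ]


cmd = build_command("t2v-1.3B", "Lion running under snow in Samarkand")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `generate.py`. Passing a list rather than a single string avoids shell-quoting problems with the `*` in the size and with spaces in the prompt.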

Citation

Credits to the Wan Team for the original PyTorch implementation.

@article{wan2.1,
    title   = {Wan: Open and Advanced Large-Scale Video Generative Models},
    author  = {Wan Team},
    journal = {},
    year    = {2025}
}