# Wan2.1
## Quickstart

#### Installation

Install dependencies:

```
pip install -r requirements.txt
```
#### Model Download

| Models       | Download Link | Notes |
|--------------|---------------|-------|
| T2V-14B      | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B) | Supports both 480P and 720P |
| I2V-14B-720P | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P) | Supports 720P |
| I2V-14B-480P | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-480P) | Supports 480P |
| T2V-1.3B     | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) | Supports 480P |

> 💡 Note: The 1.3B model can generate videos at 720P, but because it received limited training at that resolution, results are generally less stable than at 480P. For best results, we recommend 480P. Also note that the MLX port currently supports only T2V.
Download models using huggingface-cli:

```
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
```

Download models using modelscope-cli:

```
pip install modelscope
modelscope download Wan-AI/Wan2.1-T2V-14B --local_dir ./Wan2.1-T2V-14B
```
#### Run Text-to-Video Generation

This repository currently supports two Text-to-Video models (1.3B and 14B) and two resolutions (480P and 720P). The parameters and configurations for these models are as follows:
<table>
<thead>
<tr>
<th rowspan="2">Task</th>
<th colspan="2">Resolution</th>
<th rowspan="2">Model</th>
</tr>
<tr>
<th>480P</th>
<th>720P</th>
</tr>
</thead>
<tbody>
<tr>
<td>t2v-14B</td>
<td style="color: green;">✔️</td>
<td style="color: green;">✔️</td>
<td>Wan2.1-T2V-14B</td>
</tr>
<tr>
<td>t2v-1.3B</td>
<td style="color: green;">✔️</td>
<td style="color: red;">❌</td>
<td>Wan2.1-T2V-1.3B</td>
</tr>
</tbody>
</table>
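The support matrix above can also be expressed as a small lookup that validates a task/resolution pair before launching a run. This is a hypothetical helper for illustration only; `SUPPORTED` and `check_support` are not part of `generate.py`:

```python
# Hypothetical helper mirroring the support matrix above (not part of generate.py).
SUPPORTED = {
    "t2v-14B": {"checkpoint": "Wan2.1-T2V-14B", "resolutions": {"480P", "720P"}},
    "t2v-1.3B": {"checkpoint": "Wan2.1-T2V-1.3B", "resolutions": {"480P"}},
}

def check_support(task: str, resolution: str) -> str:
    """Return the checkpoint directory name for a supported task/resolution pair."""
    entry = SUPPORTED.get(task)
    if entry is None:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(SUPPORTED)}")
    if resolution not in entry["resolutions"]:
        raise ValueError(f"{task} does not support {resolution}")
    return entry["checkpoint"]
```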
##### (1) Example:

```
python generate.py --task t2v-1.3B --size "480*832" --frame_num 16 --sample_steps 25 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --prompt "Lion running under snow in Samarkand" --save_file output_video_mlx.mp4
```
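The `--size` flag packs both frame dimensions into a single `*`-separated string (e.g. `"480*832"`). A minimal sketch of how such a value can be split into integers; the actual parsing inside `generate.py` may differ:

```python
def parse_size(size: str) -> tuple[int, int]:
    """Split a size string such as "480*832" into a pair of ints.

    Hypothetical helper for illustration; generate.py's parsing may differ.
    """
    first, _, second = size.partition("*")
    return int(first), int(second)
```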
## Citation

Credits to the Wan Team for the original PyTorch implementation.

```
@article{wan2.1,
    title = {Wan: Open and Advanced Large-Scale Video Generative Models},
    author = {Wan Team},
    journal = {},
    year = {2025}
}
```