Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-09-01 12:49:50 +08:00)
[Whisper] Load customized MLX model & Quantization (#191)
* Add option to load customized mlx model
* Add quantization
* Apply reviews
* Separate model conversion and loading
* Update test
* Fix benchmark
* Add notes about conversion
* Improve doc
@@ -6,7 +6,7 @@ parameters[^1].
### Setup
First, install the dependencies:
```
pip install -r requirements.txt
@@ -19,6 +19,28 @@ Install [`ffmpeg`](https://ffmpeg.org/):
brew install ffmpeg
```
Next, download the Whisper PyTorch checkpoint and convert the weights to the MLX format. For example, to convert the `tiny` model, run:
```
python convert.py --torch-name-or-path tiny --mlx-path mlx_models/tiny
```
Note that you can also convert a local PyTorch checkpoint, as long as it is in the original OpenAI format.
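For example, converting a local OpenAI-format checkpoint might look like the following (a sketch; the checkpoint path and output directory are placeholders, and only the `--torch-name-or-path` and `--mlx-path` flags are confirmed above):

```shell
# Convert a local OpenAI-format checkpoint (both paths are placeholders).
python convert.py --torch-name-or-path ~/checkpoints/tiny.pt --mlx-path mlx_models/tiny_local
```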
To generate a 4-bit quantized model, use `-q`. For a full list of options, run:
```
python convert.py --help
```
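As a sketch, conversion and quantization can be combined in one invocation (only the `-q` flag is confirmed above; the output directory name here is illustrative):

```shell
# Convert the tiny model and 4-bit quantize it; output path is illustrative.
python convert.py --torch-name-or-path tiny -q --mlx-path mlx_models/tiny_q4
```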
By default, the conversion script creates the directory `mlx_models/tiny` and saves the converted `weights.npz` and `config.json` there.
> [!TIP]
> Alternatively, you can download a few converted checkpoints from the
> [MLX Community](https://huggingface.co/mlx-community) organization on Hugging
> Face and skip the conversion step.
### Run
Transcribe audio with:
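The diff view is truncated at this point, before the transcription snippet. As a rough, hedged sketch of what that step might look like (the `whisper` module name and `transcribe` function are assumptions based on this package, and `audio.mp3` is a placeholder file):

```shell
# Hedged sketch, not from the diff: transcribe a local audio file
# using the package's Python API from the shell.
python -c 'import whisper; print(whisper.transcribe("audio.mp3")["text"])'
```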