mlx-examples/whisper/mlx_whisper/load_models.py

# Copyright © 2023 Apple Inc.

import json
from pathlib import Path

import mlx.core as mx
import mlx.nn as nn
from huggingface_hub import snapshot_download
from mlx.utils import tree_unflatten

from . import whisper


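def load_model(
    path_or_hf_repo: str,
    dtype: mx.Dtype = mx.float32,
) -> whisper.Whisper:
    """Load a Whisper model from a local directory or a Hugging Face Hub repo.

    The directory must contain ``config.json`` and either
    ``weights.safetensors`` or ``weights.npz``.
    """
    # Treat the argument as a local path first; fall back to the Hub.
    model_path = Path(path_or_hf_repo)
    if not model_path.exists():
        model_path = Path(snapshot_download(repo_id=path_or_hf_repo))

    with open(model_path / "config.json", "r") as f:
        config = json.load(f)
    # Remove the keys that are not passed on to ModelDimensions.
    config.pop("model_type", None)
    quantization = config.pop("quantization", None)

    model_args = whisper.ModelDimensions(**config)

    # Prefer the safetensors file; fall back to the npz file.
    wf = model_path / "weights.safetensors"
    if not wf.exists():
        wf = model_path / "weights.npz"
    weights = mx.load(str(wf))

    model = whisper.Whisper(model_args, dtype)

    if quantization is not None:
        # Quantize only the Linear/Embedding layers that have saved scales.
        class_predicate = (
            lambda p, m: isinstance(m, (nn.Linear, nn.Embedding))
            and f"{p}.scales" in weights
        )
        nn.quantize(model, **quantization, class_predicate=class_predicate)

    # Rebuild the nested parameter tree from the flat "a.b.c" keys.
    weights = tree_unflatten(list(weights.items()))
    model.update(weights)
    # MLX is lazy, so force the parameters to be materialized now.
    mx.eval(model.parameters())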
    return model
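
The config and weight-file handling in `load_model` can be tried without MLX installed. The following is a minimal sketch that uses only the standard library. `read_config` and `find_weights` are hypothetical helper names that do not exist in this module, and the config values are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def read_config(model_path: Path):
    """Hypothetical helper: split config.json into model args and quantization."""
    with open(model_path / "config.json", "r") as f:
        config = json.load(f)
    config.pop("model_type", None)
    quantization = config.pop("quantization", None)
    return config, quantization


def find_weights(model_path: Path) -> Path:
    """Hypothetical helper: prefer weights.safetensors, else weights.npz."""
    wf = model_path / "weights.safetensors"
    if not wf.exists():
        wf = model_path / "weights.npz"
    return wf


# Build a throwaway model directory with illustrative (made-up) values.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    (model_dir / "config.json").write_text(
        json.dumps(
            {
                "model_type": "whisper",
                "n_mels": 80,
                "quantization": {"group_size": 64, "bits": 4},
            }
        )
    )
    (model_dir / "weights.npz").touch()

    model_args, quantization = read_config(model_dir)
    weights_name = find_weights(model_dir).name

print(model_args)
print(quantization)
print(weights_name)
```

Because only `weights.npz` exists in the temporary directory, the fallback branch is the one exercised here. What remains in `model_args` after the two `pop` calls is what `load_model` would pass to `whisper.ModelDimensions`.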