mlx-examples/llms/mlx_lm/__init__.py
Cavit Erginsoy 7ee76a32a4 Add memory estimation tool for MLX language models
This commit introduces a comprehensive memory estimation utility for MLX language models, supporting:
- Dynamic parameter calculation across diverse model architectures
- Handling of quantized and standard models
- Estimation of model weights, KV cache, and overhead memory
- Support for bounded and unbounded KV cache modes
- Flexible configuration via command-line arguments

The new tool provides detailed memory usage insights for different model configurations and generation scenarios.
2025-03-10 03:03:01 +00:00
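
The commit message above names three components of the estimate: model weights, KV cache, and overhead. The following is a minimal back-of-the-envelope sketch of how such an estimate can be put together. It is not the code from `estimate_memory.py`. Every name here (`ModelShape`, `weight_bytes`, `kv_cache_bytes`, `estimate_total_bytes`, `max_kv_size`, `overhead_fraction`) is hypothetical, and the overhead is modelled as a flat fraction purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelShape:
    # Hypothetical fields for illustration; not taken from the commit.
    num_params: int
    num_layers: int
    num_kv_heads: int
    head_dim: int


def weight_bytes(num_params: int, bits_per_weight: float = 16.0) -> int:
    """Weights: parameter count times storage width (fewer bits when quantized)."""
    return int(num_params * bits_per_weight / 8)


def kv_cache_bytes(
    shape: ModelShape,
    tokens: int,
    bytes_per_elem: int = 2,
    max_kv_size: Optional[int] = None,
) -> int:
    """KV cache: one key and one value vector per layer, per KV head, per token."""
    if max_kv_size is not None:
        # Bounded mode: the cache never holds more than max_kv_size tokens.
        tokens = min(tokens, max_kv_size)
    return (
        2 * shape.num_layers * shape.num_kv_heads * shape.head_dim
        * tokens * bytes_per_elem
    )


def estimate_total_bytes(
    shape: ModelShape,
    tokens: int,
    bits_per_weight: float = 16.0,
    max_kv_size: Optional[int] = None,
    overhead_fraction: float = 0.1,  # assumed flat overhead, illustration only
) -> int:
    """Sum weights and KV cache, then add a fractional overhead."""
    base = weight_bytes(shape.num_params, bits_per_weight) + kv_cache_bytes(
        shape, tokens, max_kv_size=max_kv_size
    )
    return int(base * (1 + overhead_fraction))
```

In this sketch the unbounded mode grows linearly with the number of tokens, while passing `max_kv_size` caps the KV term, which is the distinction the commit message draws between bounded and unbounded KV cache modes.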


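# Copyright © 2023-2024 Apple Inc.

import os

from ._version import __version__

# Suppress transformers advisory warnings; set before .utils is imported.
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"

from .utils import convert, generate, load, stream_generate


def get_estimate_mem():
    """Return estimate_mem, importing .estimate_memory only when called."""
    from .estimate_memory import estimate_mem
    return estimate_mem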
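
`get_estimate_mem` places its import inside the function body, so `.estimate_memory` is loaded only when the accessor is called rather than when the package is imported. The sketch below shows the same deferred-import pattern with a standard-library module so it can run on its own. The accessor name `get_fraction_class` is hypothetical and is not part of `mlx_lm`.

```python
def get_fraction_class():
    """Return fractions.Fraction, importing the module only when called."""
    from fractions import Fraction  # deferred import, runs at call time

    return Fraction


# The caller obtains the object through the accessor, then uses it normally.
Fraction = get_fraction_class()
half = Fraction(1, 2)
total = half + half
```

By analogy, a caller of this package would presumably write `estimate_mem = get_estimate_mem()` and then call the returned function. Its arguments are not shown in this file, so they are left out here.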