# Generate Text with MLX and 🤗 Hugging Face

This is an example of large language model text generation that can pull models from the Hugging Face Hub.

### Setup

Install the dependencies:

```shell
pip install -r requirements.txt
```

### Run

```shell
python generate.py --model <model_path> --prompt "hello"
```

For example:

```shell
python generate.py --model mistralai/Mistral-7B-v0.1 --prompt "hello"
```

will download the Mistral 7B model and generate text using the given prompt.
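Conceptually, the script tokenizes the prompt and then samples one token at a time until it hits an end-of-sequence token or the token limit. Here is a simplified sketch of that loop; the `model` and `tokenizer` objects and the greedy sampling are illustrative assumptions, not the example's exact code (the real `generate.py` works on MLX arrays and supports temperature sampling):

```python
# Simplified sketch of a token-by-token generation loop.
def generate(model, tokenizer, prompt: str, max_tokens: int = 100) -> str:
    tokens = tokenizer.encode(prompt)
    for _ in range(max_tokens):
        logits = model(tokens)                 # next-token logits for the sequence
        next_token = int(logits[-1].argmax())  # greedy pick (temperature 0)
        if next_token == tokenizer.eos_token_id:
            break
        tokens.append(next_token)
    return tokenizer.decode(tokens)
```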

The `<model_path>` should be either a path to a local directory or a Hugging Face repo with weights stored in `safetensors` format. If you use a repo from the Hugging Face Hub, then the model will be downloaded and cached the first time you run it. See the [Models](#models) section for a full list of supported models.
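The example resolves Hub repos for you, but if you want to locate the cached files yourself (for example, to pass the directory as `<model_path>` later), `huggingface_hub` provides `snapshot_download`. A minimal sketch:

```python
from huggingface_hub import snapshot_download

# Downloads the repo on the first call; subsequent calls return the
# cached local directory without re-downloading.
local_dir = snapshot_download(repo_id="mistralai/Mistral-7B-v0.1")
print(local_dir)  # this path also works as <model_path>
```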

Run `python generate.py --help` to see all the options.

### Models

The example supports Hugging Face format Mistral, Llama, and Phi-2 style models. If the model you want to run is not supported, file an issue or, better yet, submit a pull request.

Most Mistral, Llama, and Phi-2 style models on the Hugging Face Hub should work out of the box; `mistralai/Mistral-7B-v0.1` and `microsoft/phi-2` are two examples.
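One quick way to see which architecture family a checkpoint belongs to is the `model_type` field in its `config.json`. A hypothetical helper (the `SUPPORTED` set here is illustrative, not the example's actual dispatch logic):

```python
import json
from pathlib import Path

SUPPORTED = {"mistral", "llama", "phi"}  # illustrative, not the example's list

def model_type(model_dir: str) -> str:
    """Read the architecture family from a local checkpoint's config.json."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    kind = config.get("model_type", "unknown")
    if kind not in SUPPORTED:
        raise ValueError(f"model_type {kind!r} is not supported by this example")
    return kind
```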

### Convert new models

You can convert (change the data type or quantize) models using the `convert.py` script. This script takes a Hugging Face repo as input and outputs a model directory (which you can optionally also upload to Hugging Face).

For example, to make a 4-bit quantized model, run:

```shell
python convert.py --hf-path <hf_repo> -q
```
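For intuition, 4-bit quantization stores each group of weights as small integers plus a per-group scale and offset. A toy NumPy sketch of the idea (MLX's actual quantization used by `convert.py` is implemented in the framework, not like this):

```python
import numpy as np

def quantize_group(w: np.ndarray, bits: int = 4):
    """Affine-quantize one weight group to `bits`-bit integers."""
    levels = 2**bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / levels or 1.0  # avoid divide-by-zero
    q = np.round((w - w_min) / scale).astype(np.uint8)  # values in [0, 15]
    return q, scale, w_min

def dequantize_group(q: np.ndarray, scale: float, w_min: float) -> np.ndarray:
    return q.astype(np.float32) * scale + w_min

w = np.random.randn(64).astype(np.float32)  # one group of 64 weights
q, scale, offset = quantize_group(w)
print(np.abs(dequantize_group(q, scale, offset) - w).max())  # small error
```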

For more options, run:

```shell
python convert.py --help
```

You can upload new models to Hugging Face by specifying `--upload-repo` to `convert.py`. For example, to upload a quantized Mistral-7B model to the MLX Hugging Face community you can do:

```shell
python convert.py \
    --hf-path mistralai/Mistral-7B-v0.1 \
    -q \
    --upload-repo mlx-community/my-4bit-mistral
```
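If you prefer to push a converted model directory to the Hub yourself, the equivalent with `huggingface_hub` looks roughly like this (the `mlx_model` output path is illustrative):

```python
from huggingface_hub import HfApi

api = HfApi()
# Create the target repo if it does not exist yet, then upload the
# converted model directory to it.
api.create_repo("mlx-community/my-4bit-mistral", exist_ok=True)
api.upload_folder(folder_path="mlx_model", repo_id="mlx-community/my-4bit-mistral")
```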