mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-06-24 09:21:18 +08:00

2023-12-20 10:22:25 -08:00

1.8 KiB

Raw Blame History

Llama

An example of generating text with Llama (1 or 2) using MLX.

Llama is a set of open source language models from Meta AI Research¹² ranging from 7B to 70B parameters. This example also supports Meta's Llama Chat and Code Llama models, as well as the 1.1B TinyLlama models from SUTD.³

Setup

Install the dependencies:

pip install -r requirements.txt

Next, download and convert the model. If you do not have access to the model weights, you will need to request access from Meta.

[!TIP] Alternatively, you can download a few converted checkpoints from the MLX Community organization on Hugging Face and skip the conversion step.

You can download the TinyLlama models directly from Hugging Face.

Convert the weights with:

python convert.py --model-path <path_to_torch_model>

For TinyLlama, use:

python convert.py --model-path <path_to_torch_model> --model-name tiny_llama

The conversion script will save the converted weights in the same location.
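When several checkpoints need converting, the two commands above can be assembled from a script. The following is a minimal sketch and not part of the example itself: the helper name `convert_command` is hypothetical, and it relies only on the `--model-path` and `--model-name` flags shown above.

```python
import sys
from typing import List, Optional


def convert_command(model_path: str, model_name: Optional[str] = None) -> List[str]:
    """Build the argument list for convert.py (hypothetical helper).

    Only flags that appear in this README are used: --model-path always,
    and --model-name when a value such as "tiny_llama" is supplied.
    """
    argv = [sys.executable, "convert.py", "--model-path", model_path]
    if model_name is not None:
        # TinyLlama checkpoints need the extra flag, per the command above.
        argv += ["--model-name", model_name]
    return argv
```

Passing the returned list to `subprocess.run(argv, check=True)` from the directory that contains `convert.py` should run the conversion and raise an error if the script exits with a non-zero status.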

Run

Once you've converted the weights to MLX format, you can interact with the Llama model:

python llama.py <path_to_model> <path_to_tokenizer.model> --prompt "hello"

Run python llama.py --help for more details.
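To try several prompts in a row, the command above can be wrapped in a short script. This is a sketch under stated assumptions: `generate_command` and `run_prompts` are hypothetical names, the paths are whatever you would pass on the command line, and only the positional arguments and the `--prompt` flag shown above are used.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def generate_command(model: str, tokenizer: str, prompt: str) -> List[str]:
    """Build the argument list for one llama.py invocation (hypothetical helper)."""
    return [sys.executable, "llama.py", model, tokenizer, "--prompt", prompt]


def run_prompts(
    model: str,
    tokenizer: str,
    prompts: Iterable[str],
    runner: Callable = subprocess.run,
) -> None:
    """Run llama.py once per prompt, stopping at the first failure.

    `runner` defaults to subprocess.run; it is a parameter only so the
    loop can be exercised without actually launching the model.
    """
    for prompt in prompts:
        # check=True raises CalledProcessError on a non-zero exit status.
        runner(generate_command(model, tokenizer, prompt), check=True)
```

Each prompt starts a separate process, so the weights are loaded again every time. That is acceptable for a quick comparison of a few prompts but slow for larger batches.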

¹ For Llama v1, refer to the arXiv paper and blog post for more details.
² For Llama v2, refer to the blog post.
³ For TinyLlama, refer to the GitHub repository.
