support for tiny llama (#129)

Awni Hannun
2023-12-18 07:47:55 -08:00
committed by GitHub
parent 08e862336a
commit 44b546d446
3 changed files with 140 additions and 45 deletions


@@ -3,8 +3,8 @@
An example of generating text with Llama (1 or 2) using MLX.
Llama is a set of open source language models from Meta AI Research[^1][^2]
ranging from 7B to 70B parameters. This example also supports Llama Chat and
Code Llama.
ranging from 7B to 70B parameters. This example also supports Meta's Llama Chat
and Code Llama models, as well as the 1.1B TinyLlama models from SUTD.[^3]
### Setup
@@ -25,10 +25,19 @@ Alternatively, you can also download a selection of converted checkpoints from the
[mlx-llama](https://huggingface.co/mlx-llama) community organisation on Hugging
Face and skip the conversion step.
You can download the TinyLlama models directly from [Hugging
Face](https://huggingface.co/TinyLlama).
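One way to fetch a checkpoint, for example, is with Git LFS. The repository name
below is only an illustration; substitute whichever TinyLlama variant you want:

```
git lfs install
git clone https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
```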
Convert the weights with:
```
python convert.py --model_path <path_to_torch_model>
python convert.py --model-path <path_to_torch_model>
```
For TinyLlama, use:
```
python convert.py --model-path <path_to_torch_model> --model-name tiny_llama
```
The conversion script will save the converted weights in the same location.
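For instance, assuming the checkpoint was cloned into a local directory named
`TinyLlama-1.1B-Chat-v1.0` (the path here is illustrative), the conversion call
would be:

```
python convert.py --model-path TinyLlama-1.1B-Chat-v1.0 --model-name tiny_llama
```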
@@ -39,10 +48,11 @@ Once you've converted the weights to MLX format, you can interact with the
Llama model:
```
python llama.py <path_to_model> <path_to_tokenizer.model> "hello"
python llama.py <path_to_model> <path_to_tokenizer.model> --prompt "hello"
```
Run `python llama.py --help` for more details.
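As a concrete sketch, assuming the converted weights and the tokenizer live at
the paths shown (these file names are placeholders, not names produced by the
scripts), a generation run could look like:

```
python llama.py tiny_llama_mlx_weights.npz tokenizer.model --prompt "Write a short poem about the ocean"
```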
[^1]: For Llama v1 refer to the [arXiv paper](https://arxiv.org/abs/2302.13971) and [blog post](https://ai.meta.com/blog/large-language-model-llama-meta-ai/) for more details.
[^2]: For Llama v2 refer to the [blog post](https://ai.meta.com/llama/).
[^3]: For TinyLlama refer to the [GitHub repository](https://github.com/jzhang38/TinyLlama?tab=readme-ov-file).