llama v2 with sharded weights

2025-09-01 12:49:50 +08:00 · 2023-12-12 12:48:15 -08:00
parent 03a408fa2e
commit c18b990f08
5 changed files with 189 additions and 123 deletions
--- a/llama/README.md
+++ b/llama/README.md
@@ -1,8 +1,9 @@
-# LLaMA
+# Llama

-An example of generating text with LLaMA using MLX.
+An example of generating text with Llama (1 or 2) using MLX.

-LLaMA is a set of open source language models from Meta AI Research[^1] ranging from 7B to 65B parameters.
+Llama is a set of open source language models from Meta AI Research[^1][^2]
+ranging from 7B to 70B parameters.

 ### Setup

@@ -14,27 +15,31 @@ pip install -r requirements.txt

 Next, download and convert the model. If you do not have access to the model
 weights you will need to [request
-access](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform)
+access](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
 from Meta.

-
-Alternatively, you can also download a select converted checkpoints from the [mlx-llama](https://huggingface.co/mlx-llama) community organisation on Hugging Face and skip the conversion step.
+Alternatively, you can download select converted checkpoints from the
+[mlx-llama](https://huggingface.co/mlx-llama) community organisation on Hugging
+Face and skip the conversion step.

 Convert the weights with:

 ```
-python convert.py <path_to_torch_weights> <path_to_mlx_llama_weights.npz>
+python convert.py --model_path <path_to_torch_model>
 ```

+The conversion script will save the converted weights in the same location.
+
 ### Run

 Once you've converted the weights to MLX format, you can interact with the
-LLaMA model:
+Llama model:

 ```
-python llama.py <path_to_mlx_llama_weights.npz> <path_to_tokenizer.model> "hello"
+python llama.py <path_to_model> <path_to_tokenizer.model> "hello"
 ```

 Run `python llama.py --help` for more details.

-[^1]: Refer to the [arXiv paper](https://arxiv.org/abs/2302.13971) and [blog post](https://ai.meta.com/blog/large-language-model-llama-meta-ai/) for more details.
+[^1]: For Llama v1 refer to the [arXiv paper](https://arxiv.org/abs/2302.13971) and [blog post](https://ai.meta.com/blog/large-language-model-llama-meta-ai/) for more details.
+[^2]: For Llama v2 refer to the [blog post](https://ai.meta.com/llama/).
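
Reviewer note: the commit title mentions sharded weights, but this excerpt does not show how `convert.py` writes them. The sketch below is only an illustration of the general idea, splitting a name-to-array mapping across several `.npz` files and merging them back on load. The helper names (`save_sharded`, `load_sharded`), the `weights.NN.npz` file naming, and the `max_bytes` threshold are all assumptions made for this example and are not taken from the commit.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_sharded(weights, out_dir, max_bytes):
    """Write `weights` (name -> array) to numbered .npz shards in `out_dir`.

    Hypothetical helper: a new shard is started whenever adding the next
    array would push the current shard past `max_bytes`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shards, current, size = [], {}, 0
    for name, array in weights.items():
        if current and size + array.nbytes > max_bytes:
            shards.append(current)
            current, size = {}, 0
        current[name] = array
        size += array.nbytes
    if current:
        shards.append(current)
    for i, shard in enumerate(shards):
        # Hypothetical naming scheme: weights.00.npz, weights.01.npz, ...
        np.savez(out_dir / f"weights.{i:02d}.npz", **shard)
    return len(shards)


def load_sharded(model_dir):
    """Merge every weights.*.npz shard in `model_dir` into one dict."""
    weights = {}
    for path in sorted(Path(model_dir).glob("weights.*.npz")):
        with np.load(path) as shard:
            weights.update({name: shard[name] for name in shard.files})
    return weights


# Tiny stand-in tensors; real checkpoints are far larger.
demo = {
    "tok_embeddings.weight": np.ones((4, 8), dtype=np.float32),
    "layers.0.attention.wq.weight": np.zeros((8, 8), dtype=np.float32),
    "output.weight": np.full((4, 8), 2.0, dtype=np.float32),
}
with tempfile.TemporaryDirectory() as tmp:
    n_shards = save_sharded(demo, tmp, max_bytes=300)
    restored = load_sharded(tmp)
```

Splitting by a byte budget keeps each file bounded in size, and loading in sorted filename order makes the merge deterministic. Whether the actual script shards by size, by layer, or mirrors the original checkpoint layout cannot be determined from the README hunks alone.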