few more nits

Awni Hannun 2023-12-09 14:20:19 -08:00
parent 98f4346c81
commit 036090f508
2 changed files with 4 additions and 5 deletions


@@ -27,10 +27,10 @@ If you do not have access to the Llama weights you will need to [request
 access](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform)
 from Meta.
-Convert the weights with:
+Convert the model with:
 ```
-python convert.py <path_to_torch_weights> <path_to_mlx_weights.npz>
+python convert.py <path_to_torch_model> <path_to_mlx_model>
 ```
 ## Run
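For orientation, a concrete run of the updated conversion step might look like the sketch below; the checkpoint paths are placeholders chosen for illustration, not values from the README.

```
# hypothetical locations for the original torch checkpoint and the converted MLX model
python convert.py ~/models/llama-7b ~/models/mlx-llama-7b
```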
@@ -52,7 +52,7 @@ python lora.py --model <path_to_model> \
 ```
 Note, the model path should have the MLX weights, the tokenizer, and the
-`params.json` configuration which will all be output by the `conver.py` script.
+`params.json` configuration which will all be output by the `convert.py` script.
 By default, the adapter weights are saved in `adapters.npz`. You can specify
 the output location with `--adapter_file`.
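A run that redirects the adapter weights away from the default `adapters.npz` could look roughly like the following; only `--model` and `--adapter_file` are documented in this hunk, and the `--train` flag is an assumption based on the truncated `lora.py` command in the hunk header above.

```
# sketch: fine-tune and write adapters to a custom file
# --train is assumed; --model and --adapter_file are documented above
python lora.py --model <path_to_mlx_model> \
               --train \
               --adapter_file my_adapters.npz
```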
@@ -96,8 +96,6 @@ training and validation loss at a few points over the course of training.
 | 800  | 1.017 | 1.255 |
 | 1000 | 1.070 | 1.230 |
-After training for 1000 iterations, the validation perplexity reduces to XX.
 The model trains at around 475 tokens per second on an M2 Ultra.
 [^lora]: Refer to the [arXiv paper](https://arxiv.org/abs/2106.09685) for more details on LoRA.


@@ -1,2 +1,3 @@
+mlx
 sentencepiece
 torch
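With `mlx` added alongside `sentencepiece` and `torch`, the example's dependencies install in one step with standard pip usage:

```
pip install -r requirements.txt
```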