readme updates

Awni Hannun 2023-12-18 10:58:43 -08:00
parent 36fd88509e
commit 29e642a482

@@ -1,20 +1,33 @@
 # T5

-[T5](https://arxiv.org/pdf/1910.10683.pdf) are encoder-decoder models pre-trained on a multi-task mixture of unsupervised and supervised tasks. T5 works well on a variety of tasks out-of-the-box by prepending a different prefix to the input corresponding to each task, e.g.: `translate English to German: …`, `summarize: ….`
+The T5 models are encoder-decoder models pre-trained on a mixture of
+unsupervised and supervised tasks.[^1] These models work well on a variety of
+tasks by prepending task-specific prefixes to the input, e.g.:
+`translate English to German: …`, `summarize: ….`, etc.

 ## Setup

 Download and convert the model:

 ```sh
-python convert.py --model t5-small
+python convert.py --model <model>
 ```

-This will make the `{model}.npz` file which MLX can read.
+This will make the `<model>.npz` file which MLX can read.
+
+The `<model>` can be any of the following:
+
+| Model Name | Model Size |
+| ---------- | ---------- |
+| t5-small | 60 million |
+| t5-base | 220 million |
+| t5-large | 770 million |
+| t5-3b | 3 billion |
+| t5-11b | 11 billion |

 ## Generate

-To run the model, use the `t5.py` script:
+To generate text with the model, use the `t5.py` script:

 ```sh
 python t5.py --model t5-small --prompt "translate English to German: A tasty apple"
@@ -27,3 +40,6 @@ To see a list of options run:
 ```sh
 python t5.py --help
 ```
+
+[^1]: For more information on T5 see the [original paper](https://arxiv.org/abs/1910.10683)
+or the [Hugging Face page](https://huggingface.co/docs/transformers/model_doc/t5).
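
As a quick sanity check on the conversion step in the Setup section, the sketch below lists the array names stored in a converted `<model>.npz` file. It is not part of this commit or of the repository. It assumes the converter writes a standard NumPy `.npz` archive (a zip file with one `.npy` member per array), and the helper name `list_npz_arrays` is hypothetical.

```python
import zipfile


def list_npz_arrays(path):
    """Return the sorted names of the arrays stored in an .npz archive.

    Assumption: the file is a standard NumPy .npz archive, i.e. a zip file
    holding one `<name>.npy` member per saved array.
    """
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name[: -len(".npy")]  # strip the member suffix to get the array name
            for name in archive.namelist()
            if name.endswith(".npy")  # ignore any non-array members
        )
```

For example, `list_npz_arrays("t5-small.npz")` would return the parameter names saved by the conversion. The names themselves depend on the converter and are not shown here.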