From 29e642a4826e57e15eea8b836927eaed76962da6 Mon Sep 17 00:00:00 2001
From: Awni Hannun
Date: Mon, 18 Dec 2023 10:58:43 -0800
Subject: [PATCH] readme updates

---
 t5/README.md | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/t5/README.md b/t5/README.md
index 5b173d42..89654a37 100644
--- a/t5/README.md
+++ b/t5/README.md
@@ -1,20 +1,33 @@
 # T5
 
-[T5](https://arxiv.org/pdf/1910.10683.pdf) are encoder-decoder models pre-trained on a multi-task mixture of unsupervised and supervised tasks. T5 works well on a variety of tasks out-of-the-box by prepending a different prefix to the input corresponding to each task, e.g.: `translate English to German: …`, `summarize: ….`
+The T5 models are encoder-decoder models pre-trained on a mixture of
+unsupervised and supervised tasks.[^1] These models work well on a variety of
+tasks by prepending task-specific prefixes to the input, e.g.:
+`translate English to German: …`, `summarize: …`, etc.
 
 ## Setup
 
 Download and convert the model:
 
 ```sh
-python convert.py --model t5-small
+python convert.py --model <model>
 ```
 
-This will make the `{model}.npz` file which MLX can read.
+This will make the `<model>.npz` file which MLX can read.
+
+The `<model>` can be any of the following:
+
+| Model Name | Model Size  |
+| ---------- | ----------- |
+| t5-small   | 60 million  |
+| t5-base    | 220 million |
+| t5-large   | 770 million |
+| t5-3b      | 3 billion   |
+| t5-11b     | 11 billion  |
 
 ## Generate
 
-To run the model, use the `t5.py` script:
+To generate text with the model, use the `t5.py` script:
 
 ```sh
 python t5.py --model t5-small --prompt "translate English to German: A tasty apple"
@@ -27,3 +40,6 @@ To see a list of options run:
 ```sh
 python t5.py --help
 ```
+
+[^1]: For more information on T5 see the [original paper](https://arxiv.org/abs/1910.10683)
+  or the [Hugging Face page](https://huggingface.co/docs/transformers/model_doc/t5).