From c3931495604f9de8499d6e9ace0ffdbc95eddf58 Mon Sep 17 00:00:00 2001
From: Awni Hannun
Date: Mon, 18 Dec 2023 20:18:42 -0800
Subject: [PATCH] readme flan

---
 t5/README.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/t5/README.md b/t5/README.md
index 289c7010..3ff76c57 100644
--- a/t5/README.md
+++ b/t5/README.md
@@ -5,6 +5,8 @@ unsupervised and supervised tasks.[^1] These models work well on a variety of
 tasks by prepending task-specific prefixes to the input, e.g.:
 `translate English to German: …`, `summarize: ….`, etc.
 
+This example also supports the FLAN-T5 models variants.[^2]
+
 ## Setup
 
 Download and convert the model:
@@ -25,7 +27,10 @@ The `` can be any of the following:
 | t5-3b | 3 billion |
 | t5-11b | 11 billion |
 
-It also supports t5 variants, such as `google/flan-t5-small`, `google/flan-t5-base`, etc.
+The FLAN variants can be specified with the e.g. `google/flan-t5-small`,
+`google/flan-t5-base`, etc. See the [Hugging Face
+page](https://huggingface.co/docs/transformers/model_doc/flan-t5) for a
+complete list of models.
 
 ## Generate
 
@@ -45,3 +50,4 @@ python t5.py --help
 
 [^1]: For more information on T5 see the [original paper](https://arxiv.org/abs/1910.10683)
 or the [Hugging Face page](https://huggingface.co/docs/transformers/model_doc/t5).
+[^2]: For more information on FLAN-T5 see the [original paper](https://arxiv.org/abs/2210.11416).