# T5
The T5 models are encoder-decoder models pre-trained on a mixture of
unsupervised and supervised tasks.[^1] These models work well on a variety of
tasks by prepending task-specific prefixes to the input, e.g.
`translate English to German: …`, `summarize: …`, etc.
This example also supports the FLAN-T5 model variants.[^2]
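
As a rough illustration of the prefix convention, here is a minimal sketch that prepends a task prefix and tokenizes the prompt with Hugging Face's `AutoTokenizer`. The tokenizer choice and the `t5-small` model name are assumptions made for this snippet; the scripts in this example handle tokenization themselves.

```python
# Minimal sketch: prepend a task-specific prefix before tokenizing.
# Uses Hugging Face's AutoTokenizer purely for illustration; the
# example's own scripts may handle tokenization differently.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

prefix = "translate English to German: "
text = "A tasty apple"
inputs = tokenizer(prefix + text, return_tensors="np")

print(inputs["input_ids"])  # token ids fed to the encoder
```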
## Setup
Download and convert the model:

```
python convert.py --model <model>
```

This will make the `<model>.npz` file which MLX can read.

The `<model>` can be any of the following:
| Model Name | Model Size  |
| ---------- | ----------- |
| t5-small   | 60 million  |
| t5-base    | 220 million |
| t5-large   | 770 million |
| t5-3b      | 3 billion   |
| t5-11b     | 11 billion  |
The FLAN variants can be specified with `google/flan-t5-small`,
`google/flan-t5-base`, etc. See the Hugging Face page for a
complete list of models.
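
If you want to sanity-check the conversion, the resulting `.npz` file can be loaded directly with MLX. The following is a minimal sketch assuming the default output name for `t5-small`; the actual parameter names and shapes depend on the converter.

```python
# Minimal sketch: confirm that the converted weights load with MLX.
# Assumes convert.py was run as `python convert.py --model t5-small`,
# producing t5-small.npz in the current directory.
import mlx.core as mx

weights = mx.load("t5-small.npz")  # dict of parameter name -> mx.array
print(f"{len(weights)} arrays loaded")
for name, w in list(weights.items())[:5]:
    print(name, w.shape, w.dtype)
```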
## Generate
Generate text with:

```
python t5.py --model t5-small --prompt "translate English to German: A tasty apple"
```

This should give the output: `Ein leckerer Apfel`

To see a list of options run:

```
python t5.py --help
```
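
To compare against a reference implementation, the same prompt can be run through the Hugging Face `transformers` version of T5. This is a generic sketch, not this example's own comparison script; the model name and generation settings here are illustrative.

```python
# Minimal sketch: reproduce the prompt with the reference Hugging Face
# implementation to sanity-check the MLX output.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: A tasty apple", return_tensors="pt"
)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```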
[^1]: For more information on T5 see the original paper or the Hugging Face page.

[^2]: For more information on FLAN-T5 see the original paper.