# T5
The T5 models are encoder-decoder models pre-trained on a mixture of
unsupervised and supervised tasks.[^1] These models work well on a variety of
tasks by prepending task-specific prefixes to the input, e.g.
`translate English to German: …`, `summarize: …`, etc.
This example also supports the FLAN-T5 model variants.[^2]
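
As a rough illustration of the prefix convention, here is a minimal sketch that prepends a task prefix and tokenizes the prompt with Hugging Face's `AutoTokenizer`. The tokenizer choice and the `t5-small` model name are assumptions made for this snippet; the scripts in this example handle tokenization themselves.

```python
# Minimal sketch: prepend a task-specific prefix before tokenizing.
# Uses Hugging Face's AutoTokenizer purely for illustration; the
# example's own scripts may handle tokenization differently.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

prefix = "translate English to German: "
text = "A tasty apple"
inputs = tokenizer(prefix + text, return_tensors="np")

print(inputs["input_ids"])  # token ids fed to the encoder
```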
## Setup
Download and convert the model:

```
python convert.py --model <model>
```

This will make the `<model>.npz` file which MLX can read.

The `<model>` can be any of the following:
| Model Name | Model Size  |
| ---------- | ----------- |
| t5-small   | 60 million  |
| t5-base    | 220 million |
| t5-large   | 770 million |
| t5-3b      | 3 billion   |
| t5-11b     | 11 billion  |
The FLAN variants can be specified with `google/flan-t5-small`,
`google/flan-t5-base`, etc. See the Hugging Face page for a
complete list of models.
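
If you want to sanity-check the conversion, the resulting `.npz` file can be loaded directly with MLX. The following is a minimal sketch assuming the default output name for `t5-small`; the actual parameter names and shapes depend on the converter.

```python
# Minimal sketch: confirm that the converted weights load with MLX.
# Assumes convert.py was run as `python convert.py --model t5-small`,
# producing t5-small.npz in the current directory.
import mlx.core as mx

weights = mx.load("t5-small.npz")  # dict of parameter name -> mx.array
print(f"{len(weights)} arrays loaded")
for name, w in list(weights.items())[:5]:
    print(name, w.shape, w.dtype)
```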
## Generate
Generate text with:

```
python t5.py --model t5-small --prompt "translate English to German: A tasty apple"
```

This should give the output: `Ein leckerer Apfel`

To see a list of options run:

```
python t5.py --help
```
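
To compare against a reference implementation, the same prompt can be run through the Hugging Face `transformers` version of T5. This is a generic sketch, not this example's own comparison script; the model name and generation settings here are illustrative.

```python
# Minimal sketch: reproduce the prompt with the reference Hugging Face
# implementation to sanity-check the MLX output.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: A tasty apple", return_tensors="pt"
)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```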
[^1]: For more information on T5 see the original paper or the Hugging Face page.

[^2]: For more information on FLAN-T5 see the original paper.