Mirror of https://github.com/ml-explore/mlx-examples.git, synced 2025-06-24 17:31:18 +08:00

54 lines
1.4 KiB
Markdown
# T5
The T5 models are encoder-decoder models pre-trained on a mixture of
unsupervised and supervised tasks.[^1] These models work well on a variety of
tasks by prepending task-specific prefixes to the input, e.g.:
`translate English to German: …`, `summarize: …`, etc.

This example also supports the FLAN-T5 model variants.[^2]
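As a minimal illustration of the prefix mechanism, the task is selected purely through the text prepended to the input. The helper below is hypothetical (not part of this example's code) and just shows the string form the model expects:

```python
# Hypothetical sketch: T5 selects its task purely via a text prefix,
# so building a prompt is plain string concatenation.
def build_prompt(task_prefix: str, text: str) -> str:
    return f"{task_prefix}: {text}"

print(build_prompt("translate English to German", "A tasty apple"))
# translate English to German: A tasty apple
```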
## Setup
Download and convert the model:
```sh
python convert.py --model <model>
```
This will create a `<model>.npz` file which MLX can read.
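If you want to sanity-check a converted checkpoint, the `.npz` archive is a flat mapping from parameter names to weight arrays and can be opened with NumPy. The file name, parameter names, and tiny arrays below are fabricated for illustration; a real checkpoint holds the full T5 weights:

```python
import numpy as np

# Fabricated stand-in for a converted checkpoint: the conversion step
# writes a flat .npz archive mapping parameter names to weight arrays.
weights = {
    "encoder.ln.weight": np.ones(4, dtype=np.float32),
    "lm_head.weight": np.zeros((8, 4), dtype=np.float32),
}
np.savez("t5-demo.npz", **weights)

# Reopen the archive and list the stored parameter names and shapes.
loaded = np.load("t5-demo.npz")
for name in sorted(loaded.files):
    print(name, loaded[name].shape)
```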
The `<model>` can be any of the following:
| Model Name | Model Size  |
| ---------- | ----------- |
| t5-small   | 60 million  |
| t5-base    | 220 million |
| t5-large   | 770 million |
| t5-3b      | 3 billion   |
| t5-11b     | 11 billion  |
The FLAN variants can be specified with `google/flan-t5-small`,
`google/flan-t5-base`, etc. See the [Hugging Face
page](https://huggingface.co/docs/transformers/model_doc/flan-t5) for a
complete list of models.
## Generate
Generate text with:
```sh
python t5.py --model t5-small --prompt "translate English to German: A tasty apple"
```
This should give the output: `Ein leckerer Apfel`
To see a list of options, run:
```sh
python t5.py --help
```
[^1]: For more information on T5 see the [original paper](https://arxiv.org/abs/1910.10683)
or the [Hugging Face page](https://huggingface.co/docs/transformers/model_doc/t5).
[^2]: For more information on FLAN-T5 see the [original paper](https://arxiv.org/abs/2210.11416).