Examples in the MLX framework
Latest commit: Add T5 and Flan-T5 example (#113)
Author: Juarez Bochi (10a7b99e83)
Co-authored-by: Awni Hannun <awni@apple.com>
Date: 2023-12-18 20:25:34 -08:00

Squashed commits:
* Add skeleton
* Load all encoder weights
* Pass config to all modules, fix ln
* Load position bias embeddings
* Load decoder weights
* Move position biases to attention module
* translate pytorch to mx
* Fix default prompt
* Fix relative_attention_max_distance config
* No scaling, no encoder mask
* LM head
* Decode (broken after 1st token)
* Use position bias in all layers
* Utils to compare encoder output
* Fix layer norm
* Fix decoder mask
* Use position bias in decoder
* Concatenate tokens
* Remove prints
* Stop on eos
* Measure tokens/s
* with cache
* bug fix with bidirectional only for encoder, add offset to position bias
* format
* Fix T5.__call__
* Stream output
* Add argument to generate float16 npz
* Load config from HF to support any model
* Uncomment bidirectional param
* Add gitignore
* Add readme.md for t5
* Fix relative position scale
* Fix --encode-only
* Run hf_t5 with any model
* Add hf generation for comparison
* Fix type for attention mask
* Increase hf max_length
* Rescale output before projecting on vocab
* readme updates
* nits
* Pass ln2 to cross attention
* Fix example
* Fix attention for 3b model
* fp16, abstract tokenizer a bit, format
* clamp for low precision
* higher clipping, remove non-helpful casts
* default to fp32 for now
* Adds support for flan-t5
* Update t5 docs on variant support
* readme flan
* nit
Name                     Last commit date            Last commit message
bert                     2023-12-13 15:20:29 -08:00  Merge pull request #51 from jbarrow/main
cifar                    2023-12-14 12:14:01 -08:00  typo / nits
gcn                      2023-12-11 23:10:46 +01:00  fix comments before merge
llama                    2023-12-18 13:30:04 -08:00  Pass few shot file name to --few-shot arg (#141)
lora                     2023-12-18 19:33:17 -08:00  fix use for llama 2 from meta (#144)
mistral                  2023-12-12 08:36:40 -08:00  mixtral runs a bit faster
mixtral                  2023-12-14 21:45:25 -08:00  fix RoPE bug + minor updates
mnist                    2023-12-11 20:45:39 -06:00  Adding requirements.txt
phi2                     2023-12-15 19:51:51 -08:00  Rope theta to support Code Llama (#121)
stable_diffusion         2023-12-15 13:01:02 -08:00  Stable diffusion - check model weights shape and support int for "attention_head_dim" (#85)
t5                       2023-12-18 20:25:34 -08:00  Add T5 and Flan-T5 example (#113)
transformer_lm           2023-12-09 14:15:25 -08:00  black format
whisper                  2023-12-14 16:56:50 -08:00  format
.gitignore               2023-12-07 00:07:42 +05:30  Benchmark all models if user allows.
.pre-commit-config.yaml  2023-11-29 08:17:26 -08:00  a few examples
ACKNOWLEDGMENTS.md       2023-12-18 10:12:35 -08:00  Citation + contributor acknowledgments section (#136)
CODE_OF_CONDUCT.md       2023-11-29 12:31:18 -08:00  contribution + code of conduct
CONTRIBUTING.md          2023-12-09 08:02:34 +09:00  Update CONTRIBUTING.md
LICENSE                  2023-11-30 11:11:04 -08:00  consistent copyright
README.md                2023-12-18 10:12:35 -08:00  Citation + contributor acknowledgments section (#136)

MLX Examples

This repo contains a variety of standalone examples using the MLX framework.

The MNIST example is a good starting point to learn how to use MLX.
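To give a flavor of what that example teaches, here is a minimal sketch of the MLX training-loop pattern, assuming mlx is installed; the model, layer sizes, optimizer settings, and random stand-in batch are illustrative, not the example's exact code:

import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

class MLP(nn.Module):
    """A small multi-layer perceptron of the kind the MNIST example trains."""
    def __init__(self, in_dims=784, hidden=128, out_dims=10):
        super().__init__()
        self.l1 = nn.Linear(in_dims, hidden)
        self.l2 = nn.Linear(hidden, out_dims)

    def __call__(self, x):
        return self.l2(nn.relu(self.l1(x)))

def loss_fn(model, x, y):
    return mx.mean(nn.losses.cross_entropy(model(x), y))

model = MLP()
optimizer = optim.SGD(learning_rate=0.1)
loss_and_grad = nn.value_and_grad(model, loss_fn)

# One training step on a random batch (a stand-in for real MNIST data).
x = mx.random.normal((32, 784))
y = mx.random.randint(0, 10, (32,))
loss, grads = loss_and_grad(model, x, y)
optimizer.update(model, grads)
mx.eval(model.parameters(), optimizer.state)  # MLX is lazy; force evaluation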

Some more useful examples include:

- Large-scale text generation with LLaMA, Mistral, and Phi-2 (a sketch of the shared decoding loop follows this list).
- Mixture-of-experts text generation with Mixtral.
- Parameter-efficient fine-tuning with LoRA.
- Text-to-text generation with T5 and Flan-T5.
- Image generation with Stable Diffusion.
- Speech recognition with Whisper.
- Smaller reference models: BERT, a CIFAR image classifier, a graph convolutional network (GCN), and a transformer language model.
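The generation examples share the same basic autoregressive loop. Below is a hypothetical, simplified sketch of that pattern in MLX; model (token ids in, logits out), prompt_tokens, and eos_id are stand-ins, and the real examples add sampling temperature and key-value caching:

import mlx.core as mx

def generate(model, prompt_tokens, max_tokens=100, eos_id=2):
    # Greedy decoding: repeatedly append the most likely next token.
    tokens = mx.array(prompt_tokens)[None]   # add a batch dimension
    for _ in range(max_tokens):
        logits = model(tokens)               # shape: (1, seq_len, vocab_size)
        next_tok = mx.argmax(logits[:, -1, :], axis=-1)
        if next_tok.item() == eos_id:        # stop on end-of-sequence
            break
        tokens = mx.concatenate([tokens, next_tok[None]], axis=1)
    return tokens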

Contributing

We are grateful for all of our contributors. If you contribute to MLX Examples and wish to be acknowledged, please add your name to the list in your pull request.

Citing MLX Examples

The MLX software suite was initially developed with equal contribution by Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. If you find MLX Examples useful in your research and wish to cite it, please use the following BibTeX entry:

@software{mlx2023,
  author = {Awni Hannun and Jagrit Digani and Angelos Katharopoulos and Ronan Collobert},
  title = {{MLX}: Efficient and flexible machine learning on Apple silicon},
  url = {https://github.com/ml-explore},
  version = {0.0},
  year = {2023},
}