mlx-examples/README.md
Noah Kasmanoff a429263905
LlaVA in MLX (#461)
Co-authored-by: anchen <li.anchen.au@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-01 10:28:35 -08:00

# MLX Examples
This repo contains a variety of standalone examples using the [MLX
framework](https://github.com/ml-explore/mlx).
The [MNIST](mnist) example is a good starting point to learn how to use MLX.
Some more useful examples are listed below.
### Text Models
- [Transformer language model](transformer_lm) training.
- Large-scale text generation with [LLaMA](llms/llama),
[Mistral](llms/mistral), [Phi-2](llms/phi2), and more in the [LLMs](llms)
directory.
- A mixture-of-experts (MoE) language model with [Mixtral 8x7B](llms/mixtral).
- Parameter-efficient fine-tuning with [LoRA or QLoRA](lora).
- Text-to-text multi-task Transformers with [T5](t5).
- Bidirectional language understanding with [BERT](bert).
### Image Models
- Image classification using [ResNets on CIFAR-10](cifar).
- Generating images with [Stable Diffusion](stable_diffusion).
- Convolutional variational autoencoder [(CVAE) on MNIST](cvae).
### Audio Models
- Speech recognition with [OpenAI's Whisper](whisper).
### Multimodal Models
- Joint text and image embeddings with [CLIP](clip).
- Text generation from image and text inputs with [LLaVA](llava).
### Other Models
- Semi-supervised learning on graph-structured data with [GCN](gcn).
- Real NVP [normalizing flow](normalizing_flow) for density estimation and
sampling.
### Hugging Face
Note: You can now directly download a few converted checkpoints from the [MLX
Community](https://huggingface.co/mlx-community) organization on Hugging Face.
We encourage you to join the community and [contribute new
models](https://github.com/ml-explore/mlx-examples/issues/155).
## Contributing
We are grateful for all of [our
contributors](ACKNOWLEDGMENTS.md#Individual-Contributors). If you contribute
to MLX Examples and wish to be acknowledged, please add your name to the list in your
pull request.
## Citing MLX Examples
The MLX software suite was initially developed with equal contribution by Awni
Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. If you find
MLX Examples useful in your research and wish to cite it, please use the following
BibTeX entry:
```
@software{mlx2023,
  author = {Awni Hannun and Jagrit Digani and Angelos Katharopoulos and Ronan Collobert},
  title = {{MLX}: Efficient and flexible machine learning on Apple silicon},
  url = {https://github.com/ml-explore},
  version = {0.0},
  year = {2023},
}
```