mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-13 08:28:55 +08:00

Muhtasham Oblokulov 81e2a80026 Add Starcoder 2 (#502)

* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* fix lm head bug, revert transformer req

* Update RoPE initialization in Attention class

---------

Co-authored-by: Awni Hannun <awni@apple.com>

2024-03-02 19:39:23 -08:00

.circleci

Fix import warning (#479)

2024-02-27 08:47:56 -08:00

bert

docs: added missing imports (#375)

2024-01-25 10:44:53 -08:00

cifar

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

clip

chore(clip): update the clip example to make it compatible with HF format (#472)

2024-02-23 06:49:53 -08:00

cvae

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

gcn

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

llava

LlaVA in MLX (#461)

2024-03-01 10:28:35 -08:00

llms

Add Starcoder 2 (#502)

2024-03-02 19:39:23 -08:00

lora

Bug fix in lora.py (#468)

2024-02-20 12:53:30 -08:00

mnist

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

normalizing_flow

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

speechcommands

Update a few examples to use compile (#420)

2024-02-08 13:00:41 -08:00

stable_diffusion

Fix Qwen2 and SD (#441)

2024-02-14 13:43:12 -08:00

add speculative decoding example for llama (#149)

2023-12-28 15:20:43 -08:00

transformer_lm

Typo: SGD->AdamW (#471)

2024-02-20 15:47:17 -08:00

whisper

work with tuple shape (#393)

2024-02-01 13:03:47 -08:00

.gitignore

Align CLI args and some smaller fixes (#167)

2023-12-22 14:34:32 -08:00

.pre-commit-config.yaml

Update black version to 24.2.0 (#445)

2024-02-16 06:02:52 -08:00

ACKNOWLEDGMENTS.md

Prevent llms/mlx_lm from serving the local directory as a webserver (#498)

2024-02-27 19:40:42 -08:00

CODE_OF_CONDUCT.md

contribution + code of conduct

2023-11-29 12:31:18 -08:00

CONTRIBUTING.md

Update CONTRIBUTING.md

2023-12-09 08:02:34 +09:00

LICENSE

consistent copyright

2023-11-30 11:11:04 -08:00

README.md

LlaVA in MLX (#461)

2024-03-01 10:28:35 -08:00

MLX Examples

This repo contains a variety of standalone examples using the MLX framework.

The MNIST example is a good starting point to learn how to use MLX.

Some more useful examples are listed below.

Text Models

Transformer language model training.
Large-scale text generation with LLaMA, Mistral, Phi-2, and more in the LLMs directory.
A mixture-of-experts (MoE) language model with Mixtral 8x7B.
Parameter-efficient fine-tuning with LoRA or QLoRA.
Text-to-text multi-task Transformers with T5.
Bidirectional language understanding with BERT.

Image Models

Image classification using ResNets on CIFAR-10.
Generating images with Stable Diffusion.
Convolutional variational autoencoder (CVAE) on MNIST.

Audio Models

Speech recognition with OpenAI's Whisper.

Multimodal Models

Joint text and image embeddings with CLIP.
Text generation from image and text inputs with LLaVA.

Other Models

Semi-supervised learning on graph-structured data with GCN.
Real NVP normalizing flow for density estimation and sampling.

Hugging Face

Note: You can now directly download a few converted checkpoints from the MLX Community organization on Hugging Face. We encourage you to join the community and contribute new models.
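
As a rough illustration of the note above, the sketch below loads a converted checkpoint through the `mlx_lm` package (the `llms/mlx_lm` directory in this repo) and generates a short completion. It assumes `mlx-lm` is installed on an Apple silicon machine. The helper names `community_repo` and `generate_text` are hypothetical, and the checkpoint name in the usage note is an assumption chosen for illustration; check the MLX Community organization page for the models that are actually available.

```python
"""Sketch: run a converted checkpoint from the MLX Community organization."""

HF_ORG = "mlx-community"  # Hugging Face organization referred to in the note


def community_repo(model_name: str) -> str:
    """Build a Hugging Face repo id of the form 'mlx-community/<model_name>'."""
    name = model_name.strip().strip("/")
    if not name:
        raise ValueError("model_name must not be empty")
    return f"{HF_ORG}/{name}"


def generate_text(model_name: str, prompt: str, max_tokens: int = 64) -> str:
    """Download the checkpoint (on first use) and generate a completion."""
    # Imported lazily so community_repo() works without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load(community_repo(model_name))
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

For example, `generate_text("Mistral-7B-Instruct-v0.2-4bit", "Write a haiku about Apple silicon.")` would fetch that (assumed) checkpoint from Hugging Face and return the generated text.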

Contributing

We are grateful for all of our contributors. If you contribute to MLX Examples and wish to be acknowledged, please add your name to the list in ACKNOWLEDGMENTS.md as part of your pull request.

Citing MLX Examples

The MLX software suite was initially developed with equal contribution by Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. If you find MLX Examples useful in your research and wish to cite it, please use the following BibTeX entry:

@software{mlx2023,
  author = {Awni Hannun and Jagrit Digani and Angelos Katharopoulos and Ronan Collobert},
  title = {{MLX}: Efficient and flexible machine learning on Apple silicon},
  url = {https://github.com/ml-explore},
  version = {0.0},
  year = {2023},
}