mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-08-10 11:16:40 +08:00

Examples in the MLX framework

mlx

Y4hL b8e5eda4fd Refactoring of mlx_lm example (#501)	2024-03-06 06:24:31 -08:00
* Use named tuple from typing for typehints
* Add type hints
* Simplify expression
* Type hint fix
* Improved do_POST logic
  Use a map of endpoints to methods to reduce redundancy in code
* Fix format
* Improve redundancy
  Call method dynamically instead of writing out all arguments twice
* Send response instead of returning
* Fix typo
* Revert change
* Make adapter_file as Optional
* Mark formatter as optional
* format
* Create message generator
  Store response data that stays static for the duration of the response inside of the object: system_fingerprint, request_id, object_type, requested_model
  Created a message generator, that dynamically creates messages from the metadata stored inside of the object, and the data from the model pipeline
* Remove leftover
* Update parameters to reflect new object structure
  No longer pass all arguments between functions, but use the stores values inside of the object
* Parse body before calling request specific methods
* Call super init
* Update server.py
* Fixed outdated documentation parameter name
* Add documentation
* Fix sending headers twice
  During testing I found that when using the streaming option, headers have always been sent twice. This should fix that
* Simplify streaming code by using guard clauses
  Don't wrap wfile writes in try blocks, the server class has its own try block to prevent crashing
* Bug fix
* Use Content-Length header
  Let the completion type specific methods finish sending the headers. This allows us to send the Content-Length header as the model returns a completion.
* Update utils.py
* Add top_p documentation
* Type hint model and tokenizer as required
* Use static system fingerprint
  System fingerprint now stays the same across requests
* Make type hint more specific
* Bug Fix
  Supplying less than 2 models to merge would raise ValueError and calls len on unbound "models". Should be "model_paths" instead.
  Mark upload_repo as optional
* Move more of the shared code into do_POST
  Processing stop_id_sequences is done no matter the request endpoint or type, move it into the shared section. handle_ methods now just return the prompt in mx.array form.
* Store stop_id_sequences as lists instead of np
  During testing I found that letting the tokenizer return values as python lists and converting them to mlx arrays was around 20% faster than having the tokenizer convert them to np, and from np to mlx. This allows makes it so numpy no longer needs to be imported.
* Update stop_id_sequences docs
* Turn if check to non-inclusive
  Only continue if buffer is smaller
* Documentation fix
* Cleared method names
  Instead of handle_stream and generate_competion, we should name it handle_completion. Instead of handle_completions and handle_chat_completions, we should name it handle_text_completions, since both are completions, calling it text completions should make it more descriptive
* Make comment clearer
* fix format
* format

.circleci	Fix import warning (#479)	2024-02-27 08:47:56 -08:00
bert	docs: added missing imports (#375)	2024-01-25 10:44:53 -08:00
cifar	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
clip	chore(clip): update the clip example to make it compatible with HF format (#472)	2024-02-23 06:49:53 -08:00
cvae	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
gcn	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
llava	LlaVA in MLX (#461)	2024-03-01 10:28:35 -08:00
llms	Refactoring of mlx_lm example (#501)	2024-03-06 06:24:31 -08:00
lora	Bug fix in lora.py (#468)	2024-02-20 12:53:30 -08:00
mnist	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
normalizing_flow	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
speechcommands	Update a few examples to use compile (#420)	2024-02-08 13:00:41 -08:00
stable_diffusion	Fix Qwen2 and SD (#441)	2024-02-14 13:43:12 -08:00
t5	add speculative decoding example for llama (#149)	2023-12-28 15:20:43 -08:00
transformer_lm	Typo: SGD->AdamW (#471)	2024-02-20 15:47:17 -08:00
whisper	work with tuple shape (#393)	2024-02-01 13:03:47 -08:00
.gitignore	Align CLI args and some smaller fixes (#167)	2023-12-22 14:34:32 -08:00
.pre-commit-config.yaml	Update black version to 24.2.0 (#445)	2024-02-16 06:02:52 -08:00
ACKNOWLEDGMENTS.md	Refactoring of mlx_lm example (#501)	2024-03-06 06:24:31 -08:00
CODE_OF_CONDUCT.md	contribution + code of conduct	2023-11-29 12:31:18 -08:00
CONTRIBUTING.md	Add tips on porting LLMs from HuggingFace (#523)	2024-03-05 17:43:15 -08:00
LICENSE	consistent copyright	2023-11-30 11:11:04 -08:00
README.md	LlaVA in MLX (#461)	2024-03-01 10:28:35 -08:00
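
The commit message for #501 above describes routing POST requests through a map of endpoints to methods, parsing the body before the request-specific methods run, and sending headers once with a Content-Length header. The following is a minimal sketch of that pattern using only the Python standard library. It is not the code from server.py: the endpoint paths, the CompletionRouter and Handler class names, and the placeholder return values are assumptions made for illustration. Only the method names handle_text_completions and handle_chat_completions are taken from the commit message.

```python
import json
from http.server import BaseHTTPRequestHandler


class CompletionRouter:
    """Maps endpoint paths to handler method names (paths are assumed)."""

    endpoints = {
        "/v1/completions": "handle_text_completions",
        "/v1/chat/completions": "handle_chat_completions",
    }

    def dispatch(self, path, body):
        method_name = self.endpoints.get(path)
        # Guard clause: return early for endpoints that are not in the map.
        if method_name is None:
            return 404, {"error": f"unknown endpoint {path}"}
        # Look the method up by name instead of writing one branch per endpoint.
        return 200, getattr(self, method_name)(body)

    def handle_text_completions(self, body):
        # Placeholder: a real server would run the model here.
        return {"object": "text_completion", "prompt": body.get("prompt", "")}

    def handle_chat_completions(self, body):
        # Placeholder: a real server would apply a chat template and generate.
        return {"object": "chat.completion", "messages": body.get("messages", [])}


class Handler(BaseHTTPRequestHandler):
    """HTTP glue around the router; shared work lives in do_POST."""

    router = CompletionRouter()

    def do_POST(self):
        # Parse the body once, before any endpoint-specific code runs.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        status, payload = self.router.dispatch(self.path, body)
        data = json.dumps(payload).encode("utf-8")
        # Send the status line and headers exactly once, with Content-Length.
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
```

Keeping the routing in a class that does not touch sockets means the dispatch logic can be exercised directly, for example with CompletionRouter().dispatch("/v1/completions", {"prompt": "hi"}), while Handler can be passed to http.server.HTTPServer to serve real requests.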

README.md

MLX Examples

This repo contains a variety of standalone examples using the MLX framework.

The MNIST example is a good starting point to learn how to use MLX.

Some more useful examples are listed below.

Text Models

Transformer language model training.
Large-scale text generation with LLaMA, Mistral, Phi-2, and more in the LLMs directory.
A mixture-of-experts (MoE) language model with Mixtral 8x7B.
Parameter-efficient fine-tuning with LoRA or QLoRA.
Text-to-text multi-task Transformers with T5.
Bidirectional language understanding with BERT.

Image Models

Image classification using ResNets on CIFAR-10.
Generating images with Stable Diffusion.
Convolutional variational autoencoder (CVAE) on MNIST.

Audio Models

Speech recognition with OpenAI's Whisper.

Multimodal Models

Joint text and image embeddings with CLIP.
Text generation from image and text inputs with LLaVA.

Other Models

Semi-supervised learning on graph-structured data with GCN.
Real NVP normalizing flow for density estimation and sampling.

Hugging Face

Note: You can now directly download a few converted checkpoints from the MLX Community organization on Hugging Face. We encourage you to join the community and contribute new models.

Contributing

We are grateful to all of our contributors. If you contribute to MLX Examples and wish to be acknowledged, please add your name to the list in ACKNOWLEDGMENTS.md as part of your pull request.

Citing MLX Examples

The MLX software suite was initially developed with equal contribution by Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. If you find MLX Examples useful in your research and wish to cite it, please use the following BibTeX entry:

@software{mlx2023,
  author = {Awni Hannun and Jagrit Digani and Angelos Katharopoulos and Ronan Collobert},
  title = {{MLX}: Efficient and flexible machine learning on Apple silicon},
  url = {https://github.com/ml-explore},
  version = {0.0},
  year = {2023},
}