Examples in the MLX framework
Latest commit: Refactoring of mlx_lm example (#501), merged 2024-03-06
* Use named tuple from typing for typehints

* Add type hints

* Simplify expression

* Type hint fix

* Improved do_POST logic

Use a map of endpoints to methods to reduce redundancy in code
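
A minimal sketch of that endpoint map (the handler and endpoint names here are illustrative, not the exact server.py code):

```python
from http.server import BaseHTTPRequestHandler

class APIHandler(BaseHTTPRequestHandler):
    def handle_text_completions(self) -> None:
        ...  # build and send a text completion response

    def handle_chat_completions(self) -> None:
        ...  # build and send a chat completion response

    def do_POST(self) -> None:
        # One lookup table replaces a chain of if/elif branches, so the
        # shared request handling only has to be written once.
        endpoints = {
            "/v1/completions": self.handle_text_completions,
            "/v1/chat/completions": self.handle_chat_completions,
        }
        if self.path not in endpoints:
            self.send_error(404, "Not Found")
            return
        endpoints[self.path]()
```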

* Fix format

* Reduce redundancy

Call method dynamically instead of writing out all arguments twice

* Send response instead of returning

* Fix typo

* Revert change

* Make adapter_file Optional

* Mark formatter as optional

* format

* Create message generator

Store response data that stays static for the duration of the response inside the object:

system_fingerprint
request_id
object_type
requested_model

Created a message generator that dynamically builds messages from the metadata stored inside the object and the data from the model pipeline, roughly as sketched below.
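
Roughly, the pattern looks like this; the class and field names are assumptions for illustration, not the exact server.py implementation:

```python
import time
import uuid
from typing import Optional

class ResponseBuilder:
    def __init__(self, requested_model: str, object_type: str):
        # Metadata that stays static for the duration of one response.
        self.request_id = f"chatcmpl-{uuid.uuid4()}"
        self.object_type = object_type
        self.requested_model = requested_model
        self.system_fingerprint = f"fp_{uuid.uuid4()}"

    def generate_message(self, text: str, finish_reason: Optional[str]) -> dict:
        # Only the fields coming from the model pipeline vary per message.
        return {
            "id": self.request_id,
            "object": self.object_type,
            "model": self.requested_model,
            "system_fingerprint": self.system_fingerprint,
            "created": int(time.time()),
            "choices": [
                {"index": 0, "text": text, "finish_reason": finish_reason}
            ],
        }
```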

* Remove leftover

* Update parameters to reflect new object structure

No longer pass all arguments between functions; use the stored values inside the object instead.

* Parse body before calling request-specific methods

* Call super init

* Update server.py

* Fixed outdated documentation parameter name

* Add documentation

* Fix sending headers twice

During testing I found that when using the streaming option, headers were always sent twice. This should fix that.

* Simplify streaming code by using guard clauses

Don't wrap wfile writes in try blocks; the server class has its own try block to prevent crashing.
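
For example, written as a standalone helper (the function name and event format are assumptions, not the actual server.py code):

```python
from typing import IO, Iterable

def stream_events(wfile: IO[bytes], token_texts: Iterable[str]) -> None:
    # Guard clauses keep the happy path flat instead of nesting it in
    # try/except; the HTTP server's handler loop already catches
    # exceptions, so a failed write cannot crash the process.
    for text in token_texts:
        if not text:
            continue  # nothing to send for this chunk
        wfile.write(f"data: {text}\n\n".encode())
        wfile.flush()
    wfile.write(b"data: [DONE]\n\n")
```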

* Bug fix

* Use Content-Length header

Let the completion-type-specific methods finish sending the headers. This allows us to send the Content-Length header once the model has returned a completion.
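
A sketch of the idea with a hypothetical helper name:

```python
import json

def send_json(handler, response: dict) -> None:
    # Assumes the shared code already sent the status line via
    # handler.send_response(200). Encoding the finished completion
    # first means the exact Content-Length is known before the
    # headers are finalized.
    body = json.dumps(response).encode()
    handler.send_header("Content-Type", "application/json")
    handler.send_header("Content-Length", str(len(body)))
    handler.end_headers()
    handler.wfile.write(body)
```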

* Update utils.py

* Add top_p documentation

* Type hint model and tokenizer as required

* Use static system fingerprint

System fingerprint now stays the same across requests
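
For instance (a sketch; the exact fingerprint format is an assumption):

```python
import uuid

# Created once at import time, so every response from this server
# process reports the same system fingerprint.
SYSTEM_FINGERPRINT = f"fp_{uuid.uuid4()}"
```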

* Make type hint more specific

* Bug Fix

Supplying fewer than 2 models to merge would raise a ValueError that calls len on the unbound name "models"; it should be "model_paths" instead.

Mark upload_repo as optional
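
In outline, the corrected check looks like this (a sketch, not the exact mlx_lm code):

```python
from typing import List, Optional

def merge(model_paths: List[str], upload_repo: Optional[str] = None) -> None:
    # Validate using the bound parameter name; the buggy version
    # referenced the undefined name "models" in its error path.
    if len(model_paths) < 2:
        raise ValueError(
            f"Expected at least 2 models to merge, got {len(model_paths)}."
        )
    ...  # merge the models; push to upload_repo only if one was given
```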

* Move more of the shared code into do_POST

Processing stop_id_sequences happens regardless of the request endpoint or type, so move it into the shared section. The handle_ methods now just return the prompt in mx.array form.
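
Sketched as a shared helper (the helper name and the Hugging Face style tokenizer call are assumptions):

```python
from typing import List

def get_stop_id_sequences(body: dict, tokenizer) -> List[List[int]]:
    # Shared by every endpoint: normalize "stop" to a list of strings,
    # then tokenize each stop word into a sequence of token ids.
    stop_words = body.get("stop") or []
    if isinstance(stop_words, str):
        stop_words = [stop_words]
    return [
        tokenizer.encode(word, add_special_tokens=False)
        for word in stop_words
    ]
```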

* Store stop_id_sequences as lists instead of np

During testing I found that letting the tokenizer return values as Python lists and converting them to mlx arrays was around 20% faster than having the tokenizer convert them to np and then going from np to mlx. This also means numpy no longer needs to be imported.
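
Concretely, the faster path looks like this (the model name is only an example):

```python
import mlx.core as mx
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Without return_tensors, encode() yields a plain Python list of ints,
# which mx.array consumes directly -- no list -> numpy -> mlx round trip.
token_ids = tokenizer.encode("Hello, world!")
prompt = mx.array(token_ids)
```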

* Update stop_id_sequences docs

* Make the if check non-inclusive

Only continue if the buffer is strictly smaller

* Documentation fix

* Clearer method names

Instead of handle_stream and generate_completion, we should name it handle_completion.

Instead of handle_completions alongside handle_chat_completions, we should use handle_text_completions; since both endpoints serve completions, calling this one text completions makes it more descriptive.

* Make comment clearer

* fix format

* format
| Path | Last commit | Date |
| --- | --- | --- |
| .circleci | Fix import warning (#479) | 2024-02-27 |
| bert | docs: added missing imports (#375) | 2024-01-25 |
| cifar | Update a few examples to use compile (#420) | 2024-02-08 |
| clip | chore(clip): update the clip example to make it compatible with HF format (#472) | 2024-02-23 |
| cvae | Update a few examples to use compile (#420) | 2024-02-08 |
| gcn | Update a few examples to use compile (#420) | 2024-02-08 |
| llava | LlaVA in MLX (#461) | 2024-03-01 |
| llms | Refactoring of mlx_lm example (#501) | 2024-03-06 |
| lora | Bug fix in lora.py (#468) | 2024-02-20 |
| mnist | Update a few examples to use compile (#420) | 2024-02-08 |
| normalizing_flow | Update a few examples to use compile (#420) | 2024-02-08 |
| speechcommands | Update a few examples to use compile (#420) | 2024-02-08 |
| stable_diffusion | Fix Qwen2 and SD (#441) | 2024-02-14 |
| t5 | add speculative decoding example for llama (#149) | 2023-12-28 |
| transformer_lm | Typo: SGD->AdamW (#471) | 2024-02-20 |
| whisper | work with tuple shape (#393) | 2024-02-01 |
| .gitignore | Align CLI args and some smaller fixes (#167) | 2023-12-22 |
| .pre-commit-config.yaml | Update black version to 24.2.0 (#445) | 2024-02-16 |
| ACKNOWLEDGMENTS.md | Refactoring of mlx_lm example (#501) | 2024-03-06 |
| CODE_OF_CONDUCT.md | contribution + code of conduct | 2023-11-29 |
| CONTRIBUTING.md | Add tips on porting LLMs from HuggingFace (#523) | 2024-03-05 |
| LICENSE | consistent copyright | 2023-11-30 |
| README.md | LlaVA in MLX (#461) | 2024-03-01 |

MLX Examples

This repo contains a variety of standalone examples using the MLX framework.

The MNIST example is a good starting point to learn how to use MLX.
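
For a first taste of the API, here is a tiny self-contained sketch (not taken from the examples themselves) that fits a linear model with mx.grad:

```python
import mlx.core as mx

def loss_fn(w: mx.array, x: mx.array, y: mx.array) -> mx.array:
    return mx.mean((x @ w - y) ** 2)

# mx.grad transforms loss_fn into a function returning the gradient
# with respect to its first argument, w.
grad_fn = mx.grad(loss_fn)

x = mx.random.normal((128, 3))
true_w = mx.array([1.0, -2.0, 0.5])
y = x @ true_w

w = mx.zeros((3,))
for _ in range(200):
    w = w - 0.1 * grad_fn(w, x, y)

print(w)  # should converge toward [1.0, -2.0, 0.5]
```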

Some more useful examples are listed below.

Text Models

  • Transformer language model training with transformer_lm.
  • Large-scale text generation with LLaMA, Mistral, and more in the llms directory.
  • Parameter-efficient fine-tuning with LoRA or QLoRA.
  • Text-to-text multi-task Transformers with T5.
  • Bidirectional language understanding with BERT.

Image Models

  • Image classification using ResNets on CIFAR-10.
  • Generating images with Stable Diffusion.
  • Convolutional variational autoencoder (CVAE) on MNIST.

Audio Models

  • Speech recognition with OpenAI's Whisper.
  • Keyword spotting on the Speech Commands dataset.

Multimodal models

  • Joint text and image embeddings with CLIP.
  • Text generation from image and text inputs with LLaVA.

Other Models

  • Semi-supervised learning on graph-structured data with GCN.
  • Real NVP normalizing flow for density estimation and sampling.

Hugging Face

Note: You can now directly download a few converted checkpoints from the MLX Community organization on Hugging Face. We encourage you to join the community and contribute new models.

Contributing

We are grateful for all of our contributors. If you contribute to MLX Examples and wish to be acknowledged, please add your name to the list in your pull request.

Citing MLX Examples

The MLX software suite was initially developed with equal contribution by Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. If you find MLX Examples useful in your research and wish to cite it, please use the following BibTeX entry:

@software{mlx2023,
  author = {Awni Hannun and Jagrit Digani and Angelos Katharopoulos and Ronan Collobert},
  title = {{MLX}: Efficient and flexible machine learning on Apple silicon},
  url = {https://github.com/ml-explore},
  version = {0.0},
  year = {2023},
}