## Generate Text with LLMs and MLX

The easiest way to get started is to install the `mlx-lm` package:

```shell
pip install mlx-lm
```

### Python API

You can use `mlx-lm` as a module:

```python
from mlx_lm import load, generate

model, tokenizer = load("mistralai/Mistral-7B-v0.1")

response = generate(model, tokenizer, prompt="hello", verbose=True)
```
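
If you want more control over the generation, the same call accepts additional keyword arguments. Here is a minimal sketch, assuming `generate` supports `max_tokens` and `temp` keywords (check `help(generate)` below for the exact signature):

```python
from mlx_lm import load, generate

model, tokenizer = load("mistralai/Mistral-7B-v0.1")

# Longer completion with a higher sampling temperature.
# `max_tokens` and `temp` are assumed keyword names; confirm with help(generate).
response = generate(
    model,
    tokenizer,
    prompt="Write a haiku about the ocean.",
    max_tokens=256,
    temp=0.7,
    verbose=True,
)
```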

To see a description of all the arguments you can do:

```
>>> help(generate)
```

The `mlx-lm` package also comes with functionality to quantize and optionally
upload models to the Hugging Face Hub.

You can convert models in the Python API with:

```python
from mlx_lm import convert

upload_repo = "mlx-community/My-Mistral-7B-v0.1-4bit"

convert("mistralai/Mistral-7B-v0.1", quantize=True, upload_repo=upload_repo)
```

This will generate a 4-bit quantized Mistral-7B and upload it to the repo
`mlx-community/My-Mistral-7B-v0.1-4bit`. It will also save the converted
model in the path `mlx_model` by default.
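
Once the conversion finishes, you can use the local copy directly with the Python API. Here is a minimal sketch, assuming `load` accepts a local directory path as well as a Hub repo name:

```python
from mlx_lm import load, generate

# Load the quantized weights that convert wrote to its default output path.
model, tokenizer = load("mlx_model")

response = generate(model, tokenizer, prompt="hello", verbose=True)
```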

To see a description of all the arguments you can do:

```
>>> help(convert)
```

### Command Line

You can also use `mlx-lm` from the command line with:

```
python -m mlx_lm.generate --model mistralai/Mistral-7B-v0.1 --prompt "hello"
```

This will download a Mistral 7B model from the Hugging Face Hub and generate
text using the given prompt.
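
Generation options can also be passed as flags. The following is a sketch assuming the CLI exposes `--max-tokens` and `--temp` flags (the `--help` output below lists the actual options):

```
python -m mlx_lm.generate \
    --model mistralai/Mistral-7B-v0.1 \
    --prompt "Write a haiku about the ocean." \
    --max-tokens 256 \
    --temp 0.7
```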

For a full list of options run:

```
python -m mlx_lm.generate --help
```

To quantize a model from the command line run:

```
python -m mlx_lm.convert --hf-path mistralai/Mistral-7B-v0.1 -q
```

For more options run:

```
python -m mlx_lm.convert --help
```

You can upload new models to Hugging Face by specifying `--upload-repo` to
`convert`. For example, to upload a quantized Mistral-7B model to the
[MLX Hugging Face community](https://huggingface.co/mlx-community) you can do:

```
python -m mlx_lm.convert \
    --hf-path mistralai/Mistral-7B-v0.1 \
    -q \
    --upload-repo mlx-community/my-4bit-mistral
```
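
Once the upload completes, the quantized model can be used like any other Hub model with the commands shown above, for example:

```
python -m mlx_lm.generate --model mlx-community/my-4bit-mistral --prompt "hello"
```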

### Supported Models

The example supports Hugging Face format Mistral, Llama, and Phi-2 style
models. If the model you want to run is not supported, file an
[issue](https://github.com/ml-explore/mlx-examples/issues/new) or better yet,
submit a pull request.

Here are a few examples of Hugging Face models that work with this example:

- [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
- [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)
- [01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat)
- [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)

Most
[Mistral](https://huggingface.co/models?library=transformers,safetensors&other=mistral&sort=trending),
[Llama](https://huggingface.co/models?library=transformers,safetensors&other=llama&sort=trending),
and
[Phi-2](https://huggingface.co/models?library=transformers,safetensors&other=phi&sort=trending)
style models should work out of the box.
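
Any of the models listed above can be dropped straight into the earlier examples, for instance:

```python
from mlx_lm import load, generate

# Any supported Hub model can be passed to load by name.
model, tokenizer = load("microsoft/phi-2")

response = generate(model, tokenizer, prompt="Write a haiku about the ocean.", verbose=True)
```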