mirror of
https://github.com/ml-explore/mlx-examples.git
synced 2025-06-27 03:05:20 +08:00
42 lines
1012 B
Markdown
42 lines
1012 B
Markdown
![]() |
# Qwen
|
||
|
|
||
|
Qwen (通义千问) are a family of language models developed by Alibaba Cloud.[^1]
|
||
|
The architecture of the Qwen models is similar to Llama except for the bias in
|
||
|
the attention layers.
|
||
|
|
||
|
## Setup
|
||
|
|
||
|
First download and convert the model with:
|
||
|
|
||
|
```sh
|
||
|
python convert.py
|
||
|
```
|
||
|
The script downloads the model from Hugging Face. The default model is
|
||
|
`Qwen/Qwen-1_8B`. Check out the [Hugging Face page](https://huggingface.co/Qwen) to see a list of available models.
|
||
|
|
||
|
The conversion script will make the `weights.npz` and `config.json` files in
|
||
|
the working directory.
|
||
|
|
||
|
## Generate
|
||
|
|
||
|
To generate text with the default prompt:
|
||
|
|
||
|
```sh
|
||
|
python qwen.py
|
||
|
```
|
||
|
|
||
|
If you change the model, make sure to pass the corresponding tokenizer. E.g.,
|
||
|
for Qwen 7B use:
|
||
|
|
||
|
```
|
||
|
python qwen.py --tokenizer Qwen/Qwen-7B
|
||
|
```
|
||
|
|
||
|
To see a list of options, run:
|
||
|
|
||
|
```sh
|
||
|
python qwen.py --help
|
||
|
```
|
||
|
|
||
|
[^1]: For more details on the model see the official repo of [Qwen](https://github.com/QwenLM/Qwen) and the [Hugging Face](https://huggingface.co/Qwen).
|