mlx-examples/llms/qwen
Junyi Mei 62b455f801
Add Qwen example (#134)
* Add qwen model draft

* Add readme and requirements for qwen example

* Add model and tokenizer options

* Fix convert and tokenizer

* some updates / style consistency

* move to llm subdir

* readme nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-19 13:06:19 -08:00

Qwen

Qwen (通义千问) is a family of language models developed by Alibaba Cloud.1 The Qwen architecture is similar to Llama's, except that Qwen adds a bias in its attention layers.
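The attention-bias difference can be sketched with plain NumPy. This is an illustrative toy, not the actual mlx-examples implementation; the function name and shapes are assumptions:

```python
import numpy as np

def qkv_projection(x, w, b=None):
    """Project hidden states to the stacked Q/K/V matrix.

    Qwen adds the bias term `b`; Llama-style attention passes b=None.
    """
    y = x @ w
    if b is not None:
        y = y + b  # the extra bias that distinguishes Qwen from Llama
    return y

hidden = 8
x = np.ones((2, hidden))            # (sequence_length, hidden_size)
w = np.zeros((hidden, 3 * hidden))  # stacked Q/K/V weight (toy values)
b = np.full(3 * hidden, 0.5)        # Qwen-style attention bias

out = qkv_projection(x, w, b)       # shape (2, 24); here every entry is 0.5
```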

Setup

First download and convert the model with:

python convert.py

The script downloads the model from Hugging Face. The default model is Qwen/Qwen-1_8B. Check out the Hugging Face page to see a list of available models.

The conversion script saves the weights.npz and config.json files in the working directory.
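The weights.npz file is a standard NumPy archive mapping parameter names to arrays, so you can round-trip it with np.savez and np.load. A minimal sketch follows; the parameter names here are made up for illustration and are not the real checkpoint keys:

```python
import numpy as np

# Hypothetical parameters standing in for the converted model weights.
params = {
    "transformer.wte.weight": np.arange(6, dtype=np.float32).reshape(2, 3),
    "lm_head.weight": np.ones((3, 2), dtype=np.float32),
}

# Write the archive the same way a conversion script would.
np.savez("weights.npz", **params)

# Load it back as a plain dict of name -> array.
loaded = dict(np.load("weights.npz"))
```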

Generate

To generate text with the default prompt:

python qwen.py

If you change the model, make sure to pass the corresponding tokenizer. E.g., for Qwen 7B use:

python qwen.py --tokenizer Qwen/Qwen-7B

To see a list of options, run:

python qwen.py --help

  1. For more details on the model, see the official Qwen repo and the Hugging Face model page. ↩︎