# DeepSeek Coder

DeepSeek Coder is a family of code-generating language models based on the
Llama architecture.[^1] The models were trained from scratch on a corpus of 2T
tokens, of which 87% is code and 13% is natural language in both English and
Chinese.

### Setup

Install the dependencies:

```sh
pip install -r requirements.txt
```

Next, download and convert the model:

```sh
python convert.py --hf-path <path_to_huggingface_model>
```

To generate a 4-bit quantized model, use `-q`. For a full list of options, run:

```sh
python convert.py --help
```

The converter downloads the model from Hugging Face. The default model is
`deepseek-ai/deepseek-coder-6.7b-instruct`. Check out the [Hugging Face
page](https://huggingface.co/deepseek-ai) to see a list of available models.
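
If you want to script the conversion, for example to try several models from the Hugging Face page, a small wrapper along these lines may help. This is only a sketch: the helper names `build_convert_command` and `convert` are hypothetical, it assumes you run it from the directory that contains `convert.py`, and it uses only the `--hf-path` and `-q` flags described above.

```python
import subprocess
import sys


def build_convert_command(hf_path, quantize=False):
    """Build the argument list for convert.py (hypothetical helper).

    Only flags named in this README are used: --hf-path and -q.
    """
    cmd = [sys.executable, "convert.py", "--hf-path", hf_path]
    if quantize:
        cmd.append("-q")  # request a 4-bit quantized model
    return cmd


def convert(hf_path, quantize=False):
    """Run the conversion; raises CalledProcessError if convert.py fails."""
    subprocess.run(build_convert_command(hf_path, quantize), check=True)
```

Calling `convert("deepseek-ai/deepseek-coder-6.7b-instruct", quantize=True)` would then run the same command you would otherwise type by hand. Since each run uses the same default output location, check `python convert.py --help` for output options before converting more than one model.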

By default, the conversion script will save the converted `weights.npz`,
tokenizer, and `config.json` in the `mlx_model` directory.
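
To confirm that a conversion finished, you can check the output directory for the files listed above. The following is a minimal sketch, with the caveat that `missing_artifacts` is a hypothetical helper and that tokenizer files are not checked because their names are not given here.

```python
from pathlib import Path

# Files the conversion script is described as writing. Tokenizer file names
# are not listed in this README, so they are deliberately left out.
EXPECTED_FILES = ("weights.npz", "config.json")


def missing_artifacts(model_dir="mlx_model"):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_artifacts()` suggests the default `mlx_model` directory holds both the weights and the config.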

> [!TIP]
> Alternatively, you can download a few converted checkpoints from the
> [MLX Community](https://huggingface.co/mlx-community) organization on Hugging
> Face and skip the conversion step.

### Run

Once you've converted the weights, you can interact with the DeepSeek Coder
model:

```sh
python deepseek_coder.py --prompt "write a quick sort algorithm in python."
```

[^1]: For more information, see the [blog post](https://deepseekcoder.github.io/)
  by DeepSeek AI.