mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-08-29 03:55:20 +08:00)

nits

This commit is contained in:
parent 0408925f0d
commit 2229775369
@@ -64,29 +64,6 @@ prompt = tokenizer.apply_chat_template(
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```

To use temperature or other sampler arguments, pass a sampler like this:

```
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Sampler settings
temp = 0.7
top_p = 0.9
top_k = 25
sampler = make_sampler(temp=temp, top_p=top_p, top_k=top_k)

prompt = "Write a story about Ada Lovelace"

messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, sampler=sampler, verbose=True)
```

To see a description of all the arguments, you can do:

```
@@ -146,6 +123,18 @@ for response in stream_generate(model, tokenizer, prompt, max_tokens=512):
print()
```

#### Sampling

The `generate` and `stream_generate` functions accept `sampler` and
`logits_processors` keyword arguments. A sampler is any callable which accepts
a possibly batched logits array and returns an array of sampled tokens. The
`logits_processors` must be a list of callables which take the token history
and current logits as input and return the processed logits. The logits
processors are applied in order.
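For illustration, here is a minimal sketch of such callables; `greedy_sampler` and `sharpen_logits` are made-up names for this sketch, not part of `mlx_lm`:

```
from mlx_lm import load, generate
import mlx.core as mx

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

def greedy_sampler(logits):
    # Sampler: map a (possibly batched) logits array to sampled token ids.
    return mx.argmax(logits, axis=-1)

def sharpen_logits(tokens, logits):
    # Logits processor: receives the token history and the current logits
    # and returns processed logits (here, a simple temperature-like scaling).
    return logits * 1.5

text = generate(
    model,
    tokenizer,
    prompt="Write a story about Ada Lovelace",
    sampler=greedy_sampler,
    logits_processors=[sharpen_logits],
    verbose=True,
)
```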
Some standard sampling functions and logits processors are provided in
`mlx_lm.sample_utils`.
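As a sketch of using those helpers, assuming `make_sampler` and a `make_logits_processors` helper (with a `repetition_penalty` argument) are exported from `mlx_lm.sample_utils`:

```
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# Assumed helpers: a temperature/top-p sampler and a list of logits
# processors that apply a repetition penalty.
sampler = make_sampler(temp=0.7, top_p=0.9)
logits_processors = make_logits_processors(repetition_penalty=1.1)

text = generate(
    model,
    tokenizer,
    prompt="Write a story about Ada Lovelace",
    sampler=sampler,
    logits_processors=logits_processors,
    verbose=True,
)
```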
### Command Line

You can also use `mlx-lm` from the command line with: