From 96bf37008e91de86538bdacf3a12a479a322902b Mon Sep 17 00:00:00 2001
From: Matthias Neumayer <hello@matthiasneumayer.com>
Date: Fri, 14 Feb 2025 04:32:56 +0100
Subject: [PATCH] Update README.md to include how to set temperature (#1280)

* Update README.md to include how to set temperature

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
---
 llms/README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)
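
Note: the Sampling section added below describes a contract rather than a
concrete recipe, so here is a minimal sketch of that contract, including the
temperature case named in the subject line. It is not mlx-lm code. Plain
Python lists stand in for MLX arrays, only unbatched logits are handled, and
the names `make_temperature_sampler`, `ban_repeat_of_last_token` and
`sample_next` are hypothetical. The library's own helpers live in
`mlx_lm.sample_utils` (`make_sampler` there is believed to take a `temp`
argument; verify the signature against the module before relying on it).

```python
import math
import random


def make_temperature_sampler(temp, rng=None):
    """Build a sampler: a callable mapping logits to one sampled token id.

    Hypothetical helper; logits are a plain list of floats (unbatched).
    """
    rng = rng or random.Random()

    def sampler(logits):
        if temp == 0:
            # Temperature 0 is treated as greedy decoding (argmax).
            return max(range(len(logits)), key=logits.__getitem__)
        # Divide by the temperature, then take a numerically stable softmax.
        scaled = [x / temp for x in logits]
        peak = max(scaled)
        weights = [math.exp(x - peak) for x in scaled]
        return rng.choices(range(len(logits)), weights=weights, k=1)[0]

    return sampler


def ban_repeat_of_last_token(tokens, logits):
    """Logits processor: takes (token history, logits), returns new logits."""
    if not tokens:
        return logits
    processed = list(logits)
    processed[tokens[-1]] = float("-inf")  # probability becomes zero
    return processed


def sample_next(tokens, logits, sampler, logits_processors=()):
    """Apply the logits processors in order, then hand the result to the sampler."""
    for processor in logits_processors:
        logits = processor(tokens, logits)
    return sampler(logits)


history = [2]
logits = [0.1, 0.5, 3.0, 0.2]
greedy = make_temperature_sampler(0.0)
print(sample_next(history, logits, greedy))
print(sample_next(history, logits, greedy, [ban_repeat_of_last_token]))
```

With greedy decoding the first call picks token 2 (the largest logit); once
the processor masks the previous token, the second call falls back to token 1.
A temperature above zero samples from the rescaled distribution instead.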

diff --git a/llms/README.md b/llms/README.md
index 4f7451c1..e2d1db59 100644
--- a/llms/README.md
+++ b/llms/README.md
@@ -123,6 +123,18 @@ for response in stream_generate(model, tokenizer, prompt, max_tokens=512):
 print()
 ```
 
+#### Sampling
+
+The `generate` and `stream_generate` functions accept `sampler` and
+`logits_processors` keyword arguments. A sampler is any callable that accepts
+a possibly batched logits array and returns an array of sampled tokens. The
+`logits_processors` argument must be a list of callables, each of which takes
+the token history and the current logits and returns the processed logits.
+The logits processors are applied in order.
+
+Some standard sampling functions and logits processors are provided in
+`mlx_lm.sample_utils`.
+
 ### Command Line
 
 You can also use `mlx-lm` from the command line with: