mirror of
https://github.com/ml-explore/mlx-examples.git
synced 2025-09-01 21:01:32 +08:00
chore(mlx-lm): refactor server.py to utilize generate_step from utils for consistency (#491)
* chore(mlx-lm): refactor server.py to utilize generate_step from utils for consistency * chore(mlx-lm): update server doc * chore: remove unused generate func
This commit is contained in:
@@ -61,3 +61,5 @@ curl localhost:8080/v1/chat/completions \
|
||||
|
||||
- `top_p`: (Optional) A float specifying the nucleus sampling parameter.
|
||||
Defaults to `1.0`.
|
||||
- `repetition_penalty`: (Optional) Applies a penalty to repeated tokens. Defaults to `1.0`.
|
||||
- `repetition_context_size`: (Optional) The size of the context window for applying repetition penalty. Defaults to `20`.
|
Reference in New Issue
Block a user