Anchen
|
88458c4e40
|
feat(mlx-lm): add OpenAI-like API server (#429)
* feat(mlx-lm): add OpenAI-like API server
* chore: fix sse format
* chore: add top_p support
* chore: fix the load import
* chore: add workaround for missing space in stream decoding
* chore: fix typo
* chore: add error handling for streaming
* chore: using slicing instead of replace
* chore: set host and port via args and improve stream token handling logic
* chore: refactor stop sequence function
* chore: rename stopping_criteria
* fix: unable to load kernel contiguous_scan_inclusive_sum_bfloat16_bfloat16
* chore: fix the streaming unicode issue
* Update llms/mlx_lm/server.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* refactor: move stopping_criteria out of generate func
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
|
2024-02-18 14:01:28 -08:00 |
|