Awni Hannun
|
95f82e67a2
|
Fix import warning (#479)
* fix import warning
* fix version import
* remove api, move convert to utils
* also update circle to run external PRs
|
2024-02-27 08:47:56 -08:00 |
|
Anchen
|
82f3f31d93
|
chore(mlx-lm): refactor server.py to utilize generate_step from utils for consistency (#491)
* chore(mlx-lm): refactor server.py to utilize generate_step from utils for consistency
* chore(mlx-lm): update server doc
* chore: remove unused generate func
|
2024-02-27 06:25:24 -08:00 |
|
Anchen
|
19a21bfce4
|
chore: add /v1/completions for server (#489)
|
2024-02-26 20:59:33 -08:00 |
|
Anchen
|
88458c4e40
|
feat(mlx-lm): add openAI like api server (#429)
* feat(mlx-lm): add openAI like api server
* chore: fix sse format
* chore: add top_p support
* chore: fix the load import
* chore: add workaround for missing space in stream decoding
* chore: fix typo
* chore: add error handling for streaming
* chore: using slicing instead of replace
* chore: set host, port via args and improve handle stream token logic
* chore: refactor stop sequence function
* chore: rename stopping_criteria
* fix: unable to load kernel contiguous_scan_inclusive_sum_bfloat16_bfloat16
* chore: fix the streaming unicode issue
* Update llms/mlx_lm/server.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* refactor: move stopping_criteria out of generate func
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
|
2024-02-18 14:01:28 -08:00 |
|