mlx_whisper: add support for audio input from stdin (#1012)

* add support for audio and input name from stdin
* refactored to stdin - arg, and output-name template
* fix bugs, add test coverage
* fix doc to match arg rename
* some nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-12-16 02:08:55 +08:00 · 2024-11-04 14:02:13 -08:00
parent 3b526f0aa1
commit 4394633ce0
4 changed files with 53 additions and 26 deletions
--- a/whisper/README.md
+++ b/whisper/README.md
@@ -25,7 +25,7 @@ pip install mlx-whisper

 At its simplest:

-```
+```sh
 mlx_whisper audio_file.mp3
 ```

@@ -35,6 +35,15 @@ Use `-f` to specify the output format and `--model` to specify the model. There
 are many other supported command line options. To see them all, run
 `mlx_whisper -h`.

+You can also pipe the audio content of other programs via stdin:
+
+```sh
+some-process | mlx_whisper -
+```
+
+The default output file name will be `content.*`. You can specify the name with
+the `--output-name` flag.
+
 #### API

 Transcribe audio with:
@@ -103,7 +112,7 @@ python convert.py --help
 ```

 By default, the conversion script will make the directory `mlx_models`
-and save the converted `weights.npz` and `config.json` there. 
+and save the converted `weights.npz` and `config.json` there.

 Each time it is run, `convert.py` will overwrite any model in the provided
 path. To save different models, make sure to set `--mlx-path` to a unique