Use warnings.warn() for errors instead of print(), show file count messages only with --verbose, and simplify user prompts. Keep error handling consistent with the original code, including traceback on exceptions.
Use warnings.warn() for errors instead of print(), show detailed
messages only with --verbose, and simplify user prompts. Keep
error handling consistent with the rest of the codebase.
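
A minimal sketch of the reporting pattern described above, not the code from this commit. The names `report` and `transcribe_all` and the `transcribe_fn` callback are hypothetical; only `warnings.warn()` and the idea of a verbose switch come from the message.

```python
import sys
import warnings


def report(message, verbose=False, file=None):
    """Print an informational message only when verbose output is requested."""
    if verbose:
        print(message, file=file or sys.stdout)


def transcribe_all(paths, transcribe_fn, verbose=False):
    """Run transcribe_fn (hypothetical callback) on each path.

    Failures are reported with warnings.warn() rather than print(), so one
    bad file does not abort the remaining files.
    """
    results = {}
    report(f"Found {len(paths)} file(s)", verbose)
    for path in paths:
        try:
            results[path] = transcribe_fn(path)
        except Exception as exc:
            # Warnings go to stderr by default and can be filtered or
            # escalated to errors by the caller.
            warnings.warn(f"Skipping {path}: {type(exc).__name__}: {exc}")
    return results
```

Routing failures through `warnings` keeps stdout free for the transcription output itself.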
Add support for transcribing all files in a directory recursively.
The implementation lets ffmpeg handle file validation instead of
filtering by extension. Update README with minimal documentation
for directory support.
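
The recursive collection step could look roughly like the sketch below. `collect_inputs` is a hypothetical helper name, not taken from the commit; the only behaviour assumed from the message is that directories are walked recursively and that files are not filtered by extension.

```python
from pathlib import Path


def collect_inputs(path):
    """Return the files to transcribe for a file or directory argument."""
    p = Path(path)
    if p.is_dir():
        # No extension filter: every regular file is a candidate, and the
        # decoder (ffmpeg, per the commit message) rejects what it cannot read.
        return sorted(f for f in p.rglob("*") if f.is_file())
    return [p]
```

Sorting the result makes the processing order deterministic across runs.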
* add support for audio and input name from stdin
* refactored to a "-" arg for stdin, and an output-name template
* fix bugs, add test coverage
* fix doc to match arg rename
* some nits
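
A rough sketch of how stdin input and an output-name template might fit together. Everything here is an assumption made for illustration: the `"-"` sentinel, the `{basename}` placeholder, the `content` fallback name, and the function names are hypothetical and not confirmed by these commit messages.

```python
import sys
from pathlib import Path


def read_audio_bytes(arg, stdin=None):
    """Return raw audio bytes from a path, or from stdin when arg is "-"."""
    if arg == "-":  # assumed sentinel for "read from stdin"
        stream = stdin if stdin is not None else sys.stdin.buffer
        return stream.read()
    return Path(arg).read_bytes()


def output_name(arg, template="{basename}", stdin_name="content"):
    """Fill the template with the input's stem, or a fallback name for stdin."""
    # stdin has no file name, so a fixed fallback (assumed) stands in for it.
    basename = stdin_name if arg == "-" else Path(arg).stem
    return template.format(basename=basename)
```

Passing the stream as a parameter keeps the helper testable without touching the real `sys.stdin`.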
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* Make sure to import the correct "version" module when installing the
mlx_whisper package from local source code.
* Make sure to import the correct "version" module when installing the mlx_lm package from local source code
* fix
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
* Update README.md
The default location where convert.py saved its files was wrong. It was also inconsistent with how the later script, test.py, tries to load them (it assumes a naming convention).
I don't actually see a quick way to automate this since, as written, the target directory is set directly by an argument. It would probably be best to rewrite it so that the argument acts as an override, while the default behaviour is to construct a file path from the set and unset arguments. This is also complicated because "defaults" are assumed in the naming convention as well.
* Update README.md
Created an actual script that'll run and do this correctly.
* Update README.md
Typo fix: mlx-models should have been mlx_models. This matches the convention used later in the mlx-examples/whisper code.
* Update README.md
Removed the larger script and changed it back to the simpler script as before.
* nits in readme
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* Add missing keyword to the decoding options
* Reverting last commit
* Fixing transcribe keyword in benchmark.py
* Add argument name to load_model
This is intended to avoid confusion
* Add word timestamps and confidence scores
* Create a separate forward_with_cross_qk function
* Move multiple ops from np to mlx, clean comments
* Save alignment_heads
* Cast qk to fp32
* Add test for word-level timestamps and confidence scores
* format + readme
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* Add option to load customized mlx model
* Add quantization
* Apply reviews
* Separate model conversion and loading
* Update test
* Fix benchmark
* Add notes about conversion
* Improve doc
* Large-v3 requires 128 Mel frequency bins
* extract correct model dimensions and use argparse
* format
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>