45 Commits

Author SHA1 Message Date
Awni Hannun
bded1a8fcd
fix looping in whisper (#1273) 2025-02-10 13:04:35 -08:00
Awni Hannun
f58c7de901
Some improvements to speedup alignment computation in MLX Whisper (#1259)
* some improvements to speedup alignment computation in MLX Whisper

* fix alignment
2025-02-08 15:47:00 -08:00
Remixer Dec
adaab81029
Allow converting models from local directories (#1118) 2024-11-24 16:41:06 -08:00
Anthony Wu
4394633ce0
mlx_whisper: add support for audio input from stdin (#1012)
* add support for audio and input name from stdin

* refactored to stdin - arg, and output-name template

* fix bugs, add test coverage

* fix doc to match arg rename

* some nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-11-04 14:02:13 -08:00
Awni Hannun
29c954f4cb
fix (#1082) 2024-11-02 13:51:38 -07:00
Awni Hannun
8160e0c4e5
Whisper improvements (#1080)
* use safetensors in whisper

* speed up decoder

* version
2024-11-01 10:52:28 -07:00
Awni Hannun
9bc53fc210
convert (#1006) 2024-10-02 13:13:33 -07:00
James Zhao
bf921afcbe
Make sure to import the correct "version" module when installing mlx_whisper and mlx_lm from local source code. (#969)
* Make sure to import the correct "version" module when installing the
mlx_whisper package from local source code.

* Make sure to import the correct "version" module when installing the mlx_lm package from local source code

* fix

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-03 13:16:21 -07:00
madroid
e196fa3208
Whisper: Support command line (#746)
* Whisper: Add CLI command

* Whisper: Prevent precision loss when converting to words dictionary

* Whisper: disable json ensure_ascii

* Whisper: add cli setup config

* Whisper: pre-commit

* Whisper: Adjust the _ in the command line arguments to -

* nits

* version + readme

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-08-16 10:35:44 -07:00
Awni Hannun
95840f32e2
Fix whisper conversion for safetensors models (#935)
* fix whisper conversion for safetensor only. error in mlx lm for existing paths

* fix tests
2024-08-14 10:22:04 -07:00
Awni Hannun
33905447f9
Whisper updates to allow HF models (#923)
* simplify conversion and update convert for HF models

* use npz for compat

* fixes

* fixes

* fix gguf

* allow user supplied path
2024-08-09 11:11:58 -07:00
Awni Hannun
c5da302fc4
gpu featurization (#824) 2024-06-07 08:59:44 -07:00
Awni Hannun
e92de216fd
rid warning (#789) 2024-05-20 06:05:33 -07:00
madroid
6775d6cb3f
Whisper: Add pip distribution configuration to support pip installations. (#739)
* Whisper: rename whisper to mlx_whisper

* Whisper: add setup.py config for publish

* Whisper: add assets data to setup config

* Whisper: pre-commit for setup.py

* Whisper: Update README.md

* Whisper: Update README.md

* nits

* fix package data

* nit in readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-05-01 09:00:02 -07:00
Awni Hannun
6abdbe3be8
Fix quant in gguf (#698)
* fix quant in gguf

* fix whisper
2024-04-19 20:07:11 -07:00
Awni Hannun
2146bcd7ee
Quantize embedding / Update quantize API (#680)
* more async eval

* quantize embedding / update quantize api

* more updates for quantize

* update for quantize embeddings

* update sd quant API

* update sdxl quants

* error for datasets < batch_size

* async

* fix config loading

* fix quant

* fix tests

* fix req

* remove lm head if tie weights is true

* fix test
2024-04-18 18:16:10 -07:00
dmdaksh
7d7e236061
Removed unused Python imports (#683)
  - bert/model.py:10: tree_unflatten
  - bert/model.py:2: dataclass
  - bert/model.py:8: numpy
  - cifar/resnet.py:6: Any
  - clip/model.py:15: tree_flatten
  - clip/model.py:9: Union
  - gcn/main.py:8: download_cora
  - gcn/main.py:9: cross_entropy
  - llms/gguf_llm/models.py:12: tree_flatten, tree_unflatten
  - llms/gguf_llm/models.py:9: numpy
  - llms/mixtral/mixtral.py:12: tree_map
  - llms/mlx_lm/models/dbrx.py:2: Dict, Union
  - llms/mlx_lm/tuner/trainer.py:5: partial
  - llms/speculative_decoding/decoder.py:1: dataclass, field
  - llms/speculative_decoding/decoder.py:2: Optional
  - llms/speculative_decoding/decoder.py:5: mlx.nn
  - llms/speculative_decoding/decoder.py:6: numpy
  - llms/speculative_decoding/main.py:2: glob
  - llms/speculative_decoding/main.py:3: json
  - llms/speculative_decoding/main.py:5: Path
  - llms/speculative_decoding/main.py:8: mlx.nn
  - llms/speculative_decoding/model.py:6: tree_unflatten
  - llms/speculative_decoding/model.py:7: AutoTokenizer
  - llms/tests/test_lora.py:13: yaml_loader
  - lora/lora.py:14: tree_unflatten
  - lora/models.py:11: numpy
  - lora/models.py:3: glob
  - speechcommands/kwt.py:1: Any
  - speechcommands/main.py:7: mlx.data
  - stable_diffusion/stable_diffusion/model_io.py:4: partial
  - whisper/benchmark.py:5: sys
  - whisper/test.py:5: subprocess
  - whisper/whisper/audio.py:6: Optional
  - whisper/whisper/decoding.py:8: mlx.nn
2024-04-16 07:50:32 -07:00
Awni Hannun
78c431dc25
cleanup whisper a little (#639) 2024-03-30 13:13:58 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm (#603)
* use nn.RMSNorm, use sdpa, cleanup

* bump mlx versions

* minor update

* use fast layer norm

* version bump

* update requirement for whisper

* update requirement for gguf
2024-03-23 07:13:51 -07:00
amcox886
ef32379bc6
Update README.md (#530)
* Update README.md

The default behaviour of where the convert.py saved files was wrong. It also was inconsistent with how the later script test.py is trying to use them (and assuming naming convention). 

I don't actually see a quick way to automate this since--as written--the target directory is set directly by an argument. It would probably be best to rewrite it so that the argument is used as an override variable, but the default behaviour is to construct a file path based on set and unset arguments. This also is complex because "defaults" are assumed in the naming convention as well.

* Update README.md

Created an actual script that'll run and do this correctly.

* Update README.md

Typo fix: mlx-models should have been mlx_models. This conforms with standard later in the mlx-examples/whisper code.

* Update README.md

Removed the larger script and changed it back to the simpler script as before.

* nits in readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-07 06:23:43 -08:00
Awni Hannun
ec14583c2a
work with tuple shape (#393) 2024-02-01 13:03:47 -08:00
Yousif
7575125d5d
Added lora support for Phi-2 (#302)
* Added lora support for Phi-2

* Added Phi-2 support in fuse and convert

* format + readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-12 13:45:30 -08:00
Alexandre Boucaud
3ac731dd4f
Fix TypeError in whisper benchmark script (#306)
* Add missing keyword to the decoding options

* Reverting last commit

* Fixing transcribe keyword in benchmark.py

* Add argument name to load_model

This is intended to avoid confusion
2024-01-12 13:07:15 -08:00
Awni Hannun
c1342b8e89
Use pip for mlx data with speech commands (#307)
* update to use pypi mlx data

* nit in readme
2024-01-12 11:06:33 -08:00
Awni Hannun
80d18671ad
[Lora] Fix generate (#282)
* fix generate

* update readme, fix test, better default

* nits

* typo
2024-01-10 16:13:06 -08:00
Vaibhav Srivastav
bb35e878cb
[Whisper] Add load from Hub. (#255)
* Add load from Hub.

* Up.
2024-01-08 06:20:00 -08:00
Vaibhav Srivastav
d4c3a9cb54
[Whisper] Add HF Hub upload option. (#254)
* Add HF Hub upload option.

* up.

* Add missing requirements.
2024-01-08 06:18:24 -08:00
bofeng huang
bf9926489e
[Whisper] Add word timestamps and confidence scores (#201)
* Add word timestamps and confidence scores

* Create a separate forward_with_cross_qk function

* Move multiple ops from np to mlx, clean comments

* Save alignment_heads

* Cast qk to fp32

* Add test for word-level timestamps and confidence scores

* format + readme

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-07 10:01:29 -08:00
Awni Hannun
a5d6d0436c
Support Hugging Face models (#215)
* support hf direct models
2024-01-03 15:13:26 -08:00
bofeng huang
581a5733a1
[Whisper] Load customized MLX model & Quantization (#191)
* Add option to load customized mlx model

* Add quantization

* Apply reviews

* Separate model conversion and loading

* Update test

* Fix benchmark

* Add notes about conversion

* Improve doc
2023-12-29 10:22:15 -08:00
Dimo
07c163d9d9
[Whisper] Large-v3 requires 128 Mel frequency bins (#193)
* Large-v3 requires 128 Mel frequency bins

* extract correct model dimensions and use argparse

* format

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 13:50:35 -08:00
bofeng huang
e1e56a625b
Fix benchmark (#200) 2023-12-28 11:29:39 -08:00
Awni Hannun
27c0a8c002
Add llms subdir + update README (#145)
* add llms subdir + update README

* nits

* use same pre-commit as mlx

* update readmes a bit

* format
2023-12-20 10:22:25 -08:00
Awni Hannun
b863e7cca0 format 2023-12-14 16:56:50 -08:00
Stv.X
cbae83e011 Corrected spelling of terms in whisper/README.md 2023-12-14 08:15:26 +08:00
bofenghuang
4b1a06c0cb Fix fp16 2023-12-13 11:07:47 +01:00
Awni Hannun
74c4ed40d2
Merge pull request #76 from bofenghuang/add-whisper-large-v3
Add whisper-large-v3
2023-12-12 20:22:31 -08:00
bofenghuang
94705ed38b Add large v3 2023-12-12 17:26:52 +01:00
Awni Hannun
6e723a015a whisper default in fp16 2023-12-12 07:37:35 -08:00
Awni Hannun
172a60056f update whisper readme and requirements 2023-12-07 13:01:44 -08:00
Awni Hannun
54952a0d80
Merge pull request #12 from chatgpt-1/main
Fix: timestamp extraction bug in transcribe function
2023-12-07 08:53:30 -08:00
adhishthite
9cf82a0d43 Benchmark all models if user allows. 2023-12-07 00:07:42 +05:30
crackerben
6cbc029450 Fix timestamp extraction bug in transcribe function 2023-12-06 20:34:30 +08:00
Awni Hannun
31bc57c4ff add copyright in source 2023-11-30 11:08:53 -08:00
Awni Hannun
b243c1d8f4 a few examples 2023-11-29 08:17:26 -08:00