Jagrit Digani
210400d112
[WIP] Init NAX attention
2025-11-19 11:15:52 -08:00
Jagrit Digani
bcc923acb1
[WIP] Init NAX matmuls
2025-11-19 11:15:52 -08:00
Awni Hannun
f46877bc08
more accurate rope fallback ( #2792 )
2025-11-19 06:07:21 -08:00
Cheng
6f35017d1b
[CUDA] cuDNN backward attention ( #2762 )
Build and Test / check_lint (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04) (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04-arm) (push) Has been cancelled
Build and Test / mac_build_and_test (14.0) (push) Has been cancelled
Build and Test / mac_build_and_test (15.0) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.6) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.9) (push) Has been cancelled
Build and Test / build_documentation (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
2025-11-19 08:13:50 +09:00
Cheng
940f4c7818
Fix building with CUDA < 12.8 ( #2782 )
Build and Test / check_lint (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04) (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04-arm) (push) Has been cancelled
Build and Test / mac_build_and_test (14.0) (push) Has been cancelled
Build and Test / mac_build_and_test (15.0) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.6) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.9) (push) Has been cancelled
Build and Test / build_documentation (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
2025-11-18 12:55:19 +09:00
Cheng
32b18d8b66
Use std::optional for mask_arr arg ( #2763 )
Build and Test / check_lint (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04) (push) Has been cancelled
Build and Test / linux_build_and_test (ubuntu-22.04-arm) (push) Has been cancelled
Build and Test / mac_build_and_test (14.0) (push) Has been cancelled
Build and Test / mac_build_and_test (15.0) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.8) (push) Has been cancelled
Build and Test / cuda_build_and_test (cuda-12.9) (push) Has been cancelled
Build and Test / build_documentation (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Build and Test / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (cuda-12.8) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (cuda-12.9) (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
2025-11-17 10:43:33 +09:00
Awni Hannun
aad49f932f
[CUDA] Tune ops per buffer based on device ( #2761 )
...
* tune ops per buffer based on device
* tune memory limit as well
* add tuning for spark
2025-11-16 06:29:49 -08:00
Cheng
9ac7dbe877
Fix MPI distributed tests with CUDA backend ( #2775 )
2025-11-16 07:12:18 +09:00
Awni Hannun
1bf605d56d
use arch specific targets when possible ( #2771 )
2025-11-14 20:04:18 -08:00
Awni Hannun
27ff069175
Fix exporting with constants ( #2769 )
2025-11-14 12:52:08 -08:00
Cheng
3b2ffcefc3
[CUDA] cuDNN forward attention ( #2743 )
...
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04-arm) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
* Separate sdpa kernels in another file
* Initial support for cuDNN SDPA
* Diable a few corner cases
* Remove scaled_dot_product_attention.h
* Use cuDNN attention for prefilling
* cuDNN SDPA requires Ampere and later
* Address reviews
* Do contiguous copy of inputs
2025-11-14 09:23:56 +09:00
Cheng
b704e9e77a
[CUDA] Check CUDA error in synchronize ( #2757 )
2025-11-14 07:10:23 +09:00
Awni Hannun
66519fb348
fix slice ( #2758 )
2025-11-13 11:30:02 -08:00
Awni Hannun
8973550ff3
export custom kernel ( #2756 )
2025-11-13 11:29:50 -08:00
Awni Hannun
75819d70ea
patch bump ( #2750 )
2025-11-11 08:49:14 -08:00
Pedro Cuenca
eba6a9d163
Compatibility with pip-installed openmpi ( #2741 )
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
2025-11-07 16:58:31 -08:00
CCYeh
be9e2aebd6
Shapeless support for zeros/ones_like ( #2726 )
...
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
* shapeless support for zeros/ones_like
* Improvements
* fix access after moved
2025-11-06 19:12:20 -08:00
Awni Hannun
df58b4133a
[CUDA] Reduce use of managed memory ( #2725 )
...
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled
Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled
* Use async cuda malloc managed with cuda 13
* add pool threshold
* refactor for regular cuda malloc
* load eval gpu for cuda
* remove use of cuda pool, use cuda free async
* fix
* fix
* fix
* fix
* fix + comment
2025-11-05 16:05:23 -08:00
Anastasiia Filippova
27778156dc
Nccl reduce scatter, all gather ( #2727 )
...
* Added reduce scatter and all gather for nccl
* fix unused import, delete unused file
* small fix
* deleted useless condition
* fixed comments
* fix bug in eval_gpu, renamed to sum_scatter, fix docs
* final fix docs
* remove and
* Update mlx/distributed/mpi/mpi.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
* fix broken set input output
* fixes set output
* typo
* fix typo
* no cpu, no gpu for reduce scatter
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
2025-11-05 08:21:11 -08:00
Angelos Katharopoulos
6ece97f69b
Make cpu binary_op easily accessible ( #2733 )
2025-11-05 01:08:41 -08:00
Awni Hannun
26ceb507eb
only build for macos 14 and up ( #2731 )
...
* only build for macos 14 and up
* bump metal cpp
2025-11-04 09:44:15 -08:00
Harsh Sutaria
50fa315d18
Fix addmm with empty matrices and beta != 1.0 ( #2715 )
2025-11-03 14:16:15 -08:00
AN Long
1ff2b713b6
Check isnan in maximum / minimum with CPU backend ( #2652 )
...
* Check isnan in maximum / minimum with CPU backend
* Add tests
* fix
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2025-11-03 08:51:14 -08:00
Awni Hannun
93d76b0f30
Fix compile multi capture ( #2678 )
...
* fix compile when compiling multiple lambdas with the same capture
* add test
2025-11-03 06:33:43 -08:00
David Koski
78678de0cd
add null check -- the bundleIdentifier is optional ( #2709 )
...
* add null check -- the bundleIdentifier is optional
* use variable
2025-11-03 06:33:21 -08:00
Awni Hannun
39b04ce638
use faster dequant for fp4 qmv ( #2720 )
Nightly Build / build_linux_release (3.10) (push) Has been cancelled
Nightly Build / build_linux_release (3.14) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.10) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.11) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.12) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.13) (push) Has been cancelled
Nightly Build / build_linux_with_tests (3.14) (push) Has been cancelled
Nightly Build / build_mac_release (3.10) (push) Has been cancelled
Nightly Build / build_mac_release (3.13) (push) Has been cancelled
Nightly Build / build_cuda_with_tests (push) Has been cancelled
Nightly Build / build_cuda_release (push) Has been cancelled
2025-10-31 11:49:59 -07:00
Awni Hannun
68c5fa1c95
fix memory count bug ( #2717 )
2025-10-30 14:27:15 -07:00
Awni Hannun
ec72b44417
Add quantize/dequantize for mxfp8 and nvfp4 ( #2688 )
...
* Add quantize/dequantize slow path for mxfp8 and nvfp4
* fast cuda kernel for mx/nv quantization
* fallback for cuda < 12.8 (#2697 )
* format (#2700 )
* fix (#2701 )
* metal kernels
* docs
* fix jit
* add default bits and group sizes
* improve quant docs
* fix output type of mxfp4 matmuls
2025-10-28 16:23:12 -07:00
Melissa Kilby
460691a0e8
fix: linux-{fedora}x86_64-build ( #2707 )
...
Signed-off-by: Melissa Kilby <mkilby@apple.com >
2025-10-27 16:36:08 -07:00
Awni Hannun
969924cc69
Fp8 conversion ( #2686 )
...
* add fp8 e4m3 converters
* add cuda
* default saturate to min/max
* fix for older OS
* fix no gpu/cpu
* fix saturate
* fix compile
2025-10-27 16:35:50 -07:00
Awni Hannun
539d8322d1
add median op ( #2705 )
2025-10-27 11:33:42 -07:00
Awni Hannun
c4767d110f
fix addmm cpu ( #2699 )
2025-10-27 11:33:32 -07:00
David Koski
895217f25b
optionally load metallib from framework ( #2702 )
...
* optionally load metallib from framework
* pre-commit
* adjust logic
2025-10-27 07:52:03 -07:00
Manuel Villanueva
0cfeeb60ca
Einsum error msg improvement ( #2690 )
...
* Improved error message for Einsum
* Modifications via pre-commit
* format
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2025-10-27 06:31:47 -07:00
Ronan Collobert
8f8af61a37
fix warnings showing up with -Wall ( #2692 )
2025-10-24 11:43:35 -07:00
Awni Hannun
5bcf3a6794
format
2025-10-22 16:08:47 -07:00
wickedcoder
7707196297
Merge commit from fork
...
* add length validation to the header
* fix accessing out of bound index with .at()
2025-10-22 15:31:25 -07:00
wickedcoder
7e3471c987
Merge commit from fork
...
* add tensor->weights_data validation
* add null pointer check for tensor
2025-10-22 15:31:03 -07:00
Awni Hannun
9f0ba3ddf1
patch bump ( #2680 )
2025-10-17 12:12:07 -07:00
Awni Hannun
4bce5f9b2d
suppress gcc 10.1 warnings ( #2679 )
...
* suppress gcc 10.1 warnings
* suppress gcc 10.1 warnings
2025-10-17 12:09:21 -07:00
Anastasiia Filippova
e9eab527eb
Nccl timeout ( #2673 )
...
* print the error & delete nccl group
* timeout for nccl binding
* typo
* revert error
* fixed a typo
2025-10-14 12:29:54 -07:00
Awni Hannun
36ca62dba8
remove unused unary file ( #2672 )
2025-10-13 19:36:26 -07:00
Manuel Villanueva
9cbb1b0148
Modified sort behavior when running CPU or Metal to match NumPy/JAX ( #2667 )
...
* Modified sort behavior when running CPU or Metal to match NumPy/JAX sorting behavior.
* Modified sort behavior when running CPU or Metal to match NumPy/JAX
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2025-10-13 14:36:45 -07:00
Awni Hannun
25e2356316
speed up scalars ( #2669 )
2025-10-13 12:10:15 -07:00
Awni Hannun
226a1d24e0
Debug cuda conv ( #2662 )
...
* use t4
* use t4
2025-10-10 16:12:47 -07:00
Awni Hannun
630350ad3e
Precise sigmoid ( #2659 )
...
* bump patch
* Sigmoid matches PyTorch and is more precise on tails
2025-10-10 10:05:23 -07:00
Awni Hannun
380aeb58ae
enable admm low-precision cpu ( #2661 )
2025-10-10 09:50:54 -07:00
Awni Hannun
f37389d100
bump patch ( #2658 )
2025-10-10 08:36:41 -07:00
Awni Hannun
e89e8b4272
Export with callback ( #2612 )
...
* export with callback
* export with callback
* Add types, fix kwarg ordering bug + test
* cleanup, test, fix
* typos
2025-10-08 19:24:33 -07:00
AN Long
85a8824a8c
Fix cumulative operations when axis=None ( #2653 )
2025-10-08 15:25:38 -07:00