mlx/python at 6f5874a2f294a6a2fa33ed4a661b453015a97820 - mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Files

Cheng 6f5874a2f2 [CUDA] Initial implementation of Convolution with cuDNN (#2385)

* Link with cuDNN

* Initial implementation

* Remove backend apis

* Fix recording cudnn conv

* More unused backend apis

* Fix C++ conv tests

* include cudnn as python dep

* Install libcudnn9-dev-cuda-12 in CI

* cudnn only accepts contiguous inputs

* Switch to backend apis

* Plan needs to be kept alive

* Turn off tf32

* Add cache

* Test the native cuda graph api

* Set cudnn stream before execution

* Make LRUCache more like a normal container

* Do error check for cublas handle

* Zero-initializing array

* Use tf32 for conv

* Skip TestConv.test_torch_conv_2D test

---------

Co-authored-by: Awni Hannun <awni@apple.com>

2025-07-25 08:12:10 +09:00

mlx      Adding support for the Muon Optimizer (#1914)                    2025-07-18 12:25:28 -07:00
scripts  [CUDA] Initial implementation of Convolution with cuDNN (#2385)  2025-07-25 08:12:10 +09:00
src      Fix uv install and add dev release (#2411)                       2025-07-23 16:54:19 -07:00
tests    [CUDA] Initial implementation of Convolution with cuDNN (#2385)  2025-07-25 08:12:10 +09:00