mlx/mlx
Cheng 6f5874a2f2 [CUDA] Initial implementation of Convolution with cuDNN (#2385)
* Link with cuDNN

* Initial implementation

* Remove backend apis

* Fix recording cudnn conv

* More unused backend apis

* Fix C++ conv tests

* include cudnn as python dep

* Install libcudnn9-dev-cuda-12 in CI

* cudnn only accepts contiguous inputs

* Switch to backend apis

* Plan needs to be kept alive

* Turn off tf32

* Add cache

* Test the native cuda graph api

* Set cudnn stream before execution

* Make LRUCache more like a normal container

* Do error check for cublas handle

* Zero-initialize array

* Use tf32 for conv

* Skip TestConv.test_torch_conv_2D test

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-07-25 08:12:10 +09:00