Cheng | cb349a291c | [CUDA] Use cuda::std::complex in place of cuComplex (#2372) | 2025-07-15 00:36:13 -07:00

Cheng | 6325f60d52 | [CUDA] Bundle CCCL for JIT compilation (#2357) | 2025-07-11 18:45:37 -07:00
* Ship CCCL for JIT compilation
* Remove cexpf

Cheng | 8347575ba1 | [CUDA] Implement Scan kernel (#2347) | 2025-07-10 16:54:12 -07:00
* Contiguous scan
* Strided scan
* Enable tests
* Fix failing logaddexp test
* Use cexpf in Metal

Awni Hannun | dd4f53db63 | use fp32 for testing, add more complex ops (#2322) | 2025-07-01 07:30:00 -07:00

Awni Hannun | c552ff2451 | [CUDA] Fix back-end bugs and enable corresponding tests (#2296) | 2025-06-16 08:45:40 -07:00
* Fix some cuda back-end bugs and enable corresponding tests
* more fixes
* enable more tests
* format

Cheng | a4fc671d3e | CUDA backend: compile (#2276) | 2025-06-12 17:08:39 -07:00
* CUDA backend: compile
* Rename kernels/ to device/