Split cuDNN helpers into a separate header (#2491)

* Add RAII managed CudaGraph class

* Implement forward rms_norm with cuDNN

* Revert back to old rms norm kernel
Cheng
2025-08-20 09:29:28 +09:00
committed by GitHub
parent cea9369610
commit 65d0d40232
8 changed files with 527 additions and 302 deletions

@@ -17,6 +17,7 @@ target_sources(
${CMAKE_CURRENT_SOURCE_DIR}/copy/copy_general_input.cu
${CMAKE_CURRENT_SOURCE_DIR}/conv.cpp
${CMAKE_CURRENT_SOURCE_DIR}/cuda.cpp
+${CMAKE_CURRENT_SOURCE_DIR}/cudnn_utils.cpp
${CMAKE_CURRENT_SOURCE_DIR}/device.cpp
${CMAKE_CURRENT_SOURCE_DIR}/eval.cpp
${CMAKE_CURRENT_SOURCE_DIR}/event.cu