[CUDA] Implement DynamicSlice/DynamicSliceUpdate (#2533)

* Move DynamicSlice to gpu/primitives

* Implement compute_dynamic_offset in CUDA
Cheng
2025-08-26 07:31:39 +09:00
committed by GitHub
parent 2ca75bb529
commit 4822c3dbe9
12 changed files with 226 additions and 134 deletions

@@ -1,7 +1,6 @@
 cuda_skip = {
     "TestLoad.test_load_f8_e4m3",
     "TestLayers.test_quantized_embedding",
-    "TestOps.test_dynamic_slicing",
     # Block masked matmul NYI
     "TestBlas.test_block_masked_matmul",
     # Gather matmul NYI