5 Commits

Author SHA1 Message Date
Awni Hannun
1e496ddb82
[CUDA] Simplify allocator (#2392)
* simplify allocator and fix race with small pool

* Don't use shared event in worker

* use cuda buffer in small pool

* comment

* comment
2025-07-22 08:24:01 -07:00
Awni Hannun
93d70419e7
[CUDA] speedup handling scalars (#2389)
* speed up scalars in cuda

* comment
2025-07-18 21:47:31 -07:00
Cheng
5685ceb3c7
Avoid invoking allocator::malloc when creating CUDA event (#2232)
2025-06-03 16:48:40 -07:00
Cheng
db5a7c6192
Add memory cache to CUDA backend (#2221)
* Move BufferCache out of allocator

* Add memory cache to cuda backend allocator

* Simplify BufferCache assuming buf cannot be null
2025-05-30 12:12:54 -07:00
Cheng
0cae0bdac8
CUDA backend: backbone (#2075)
2025-05-06 21:26:46 -07:00