jit | MoE backward improvements (#2335) | 2025-07-07 17:59:53 -07:00
kernels | [CUDA] Implement Scan kernel (#2347) | 2025-07-10 16:54:12 -07:00
allocator.cpp | Add memory cache to CUDA backend (#2221) | 2025-05-30 12:12:54 -07:00
allocator.h | Add memory cache to CUDA backend (#2221) | 2025-05-30 12:12:54 -07:00
binary.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
binary.h | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
CMakeLists.txt | MoE backward improvements (#2335) | 2025-07-07 17:59:53 -07:00
compiled.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
conv.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
copy.cpp | fix copy dispatch (#2360) | 2025-07-11 10:59:35 -07:00
custom_kernel.cpp | Fix unintuitive metal kernel caching (#2242) | 2025-06-06 20:08:15 -07:00
device.cpp | [CUDA] Bundle CCCL for JIT compilation (#2357) | 2025-07-11 18:45:37 -07:00
device.h | [CUDA] Bundle CCCL for JIT compilation (#2357) | 2025-07-11 18:45:37 -07:00
distributed.cpp | Move common gpu primitives to backend/gpu (#2145) | 2025-05-05 13:45:29 -07:00
eval.cpp | Generalize gpu backend (#2138) | 2025-04-30 09:08:17 -07:00
event.cpp | Generalize gpu backend (#2138) | 2025-04-30 09:08:17 -07:00
fence.cpp | fix input coherent kernel launch (#2153) | 2025-05-05 17:30:50 -07:00
fft.cpp | Fix fft for integer overflow (#2161) | 2025-05-09 14:25:12 -07:00
hadamard.cpp | Move common gpu primitives to backend/gpu (#2145) | 2025-05-05 13:45:29 -07:00
indexing.cpp | MoE backward improvements (#2335) | 2025-07-07 17:59:53 -07:00
jit_kernels.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
kernels.h | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
logsumexp.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
make_compiled_preamble.sh | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00
matmul.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
matmul.h | Collection of refactors (#2274) | 2025-06-13 10:44:56 -07:00
metal.cpp | Generalize gpu backend (#2138) | 2025-04-30 09:08:17 -07:00
metal.h | Generalize gpu backend (#2138) | 2025-04-30 09:08:17 -07:00
no_metal.cpp | start cuda circle config (#2256) | 2025-06-10 21:19:47 -07:00
nojit_kernels.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
normalization.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
primitives.cpp | Make sliceUpdate general (#2282) | 2025-06-12 16:48:54 -07:00
quantized.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
reduce.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
reduce.h | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00
resident.cpp | Generalize gpu backend (#2138) | 2025-04-30 09:08:17 -07:00
resident.h | Wired (#1510) | 2024-10-25 09:35:33 -07:00
rope.cpp | Fast primitives decide when to use the fallback (#2216) | 2025-06-02 13:26:37 -07:00
scaled_dot_product_attention.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
scan.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
slicing.cpp | Move common gpu primitives to backend/gpu (#2145) | 2025-05-05 13:45:29 -07:00
softmax.cpp | Add contiguous_copy_gpu util for copying array (#2379) | 2025-07-18 06:44:25 -07:00
sort.cpp | Move common gpu primitives to backend/gpu (#2145) | 2025-05-05 13:45:29 -07:00
ternary.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
ternary.h | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
unary.cpp | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
unary.h | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00
utils.cpp | Move some dims utils to common (#2223) | 2025-05-29 06:48:30 -07:00
utils.h | Add Primitive::name and remove Primitive::print (#2365) | 2025-07-14 14:06:35 -07:00