mlx/mlx/backend/common
Angelos Katharopoulos 580776559b
RoPE for CUDA (#2293)
* First working CUDA rope

* Fix random
2025-06-15 06:08:07 -07:00
File              Last commit                                                 Last updated
binary.h          fix malloc or wait deadlock (#1976)                         2025-03-20 16:48:43 -07:00
broadcasting.cpp  Gather mm new kernel and small refactoring (#2040)          2025-04-14 16:37:36 -07:00
broadcasting.h    Gather mm new kernel and small refactoring (#2040)          2025-04-14 16:37:36 -07:00
buffer_cache.h    Add memory cache to CUDA backend (#2221)                    2025-05-30 12:12:54 -07:00
CMakeLists.txt    Gather mm new kernel and small refactoring (#2040)          2025-04-14 16:37:36 -07:00
common.cpp        Gather mm new kernel and small refactoring (#2040)          2025-04-14 16:37:36 -07:00
compiled.cpp      Share more common code in Compiled (#2240)                  2025-06-03 16:48:50 -07:00
compiled.h        Share more common code in Compiled (#2240)                  2025-06-03 16:48:50 -07:00
copy.h            CUDA backend: unary ops (#2158)                             2025-06-09 06:45:08 -07:00
hadamard.h        GPU Hadamard for large N (#1879)                            2025-05-01 17:19:17 -07:00
load.cpp          fix malloc or wait deadlock (#1976)                         2025-03-20 16:48:43 -07:00
matmul.h          CUDA backend: matmul (#2241)                                2025-06-06 12:24:04 -07:00
reduce.cpp        Refactor common into cpu specific and truly common (#1817)  2025-02-03 15:58:02 -08:00
reduce.h          Refactor common into cpu specific and truly common (#1817)  2025-02-03 15:58:02 -08:00
slicing.cpp       redesign for faster cpu/gpu synch (#1869)                   2025-03-06 19:23:38 -08:00
slicing.h         Fix a couple of slicing bugs (#1827)                        2025-02-05 19:50:08 -08:00
ternary.h         fix malloc or wait deadlock (#1976)                         2025-03-20 16:48:43 -07:00
unary.h           CUDA backend: unary ops (#2158)                             2025-06-09 06:45:08 -07:00
utils.cpp         RoPE for CUDA (#2293)                                       2025-06-15 06:08:07 -07:00
utils.h           RoPE for CUDA (#2293)                                       2025-06-15 06:08:07 -07:00