mlx/mlx/backend/cuda/ternary.cu at b2273733eaa0c08e4c4b51f93c76ee1710066967

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Cheng 85873cb162 [CUDA] Do vectorized store/load in contiguous elementwise ops (#2342)

* Do vectorized store/load in unary ops

* Do vectorized store/load in binary_two ops

* Do vectorized store/load in copy ops

* Do vectorized store/load in ternary ops

* Use int32_t for IdxT

* binary => binary_two in binary_two.cu

* Fix tests on large arrays

* Use uint as index type

* Contig uses uint as index and non-contig uses int

2025-07-09 18:48:43 -07:00

6.5 KiB
