Faster indexing math in a few kernels (#1589)

* wip: faster compiled kernels

* faster general unary with uint specialization

* index type in compiled, unary, binary, ternary, copy

* fix jit

* jit fix

* specialize gather + scatter

* nit in docs
This commit is contained in:
Awni Hannun
2024-11-18 19:52:00 -08:00
committed by GitHub
parent bf481e8e5d
commit 2419edd5b2
25 changed files with 630 additions and 484 deletions

View File

@@ -279,7 +279,7 @@ void Compiled::eval_cpu(
// Figure out which kernel we are using
auto& shape = outputs[0].shape();
bool contiguous = compiled_check_contiguity(inputs, shape);
auto contiguous = compiled_check_contiguity(inputs, shape);
// Handle all broadcasting and collect function input arguments
std::vector<void*> args;