Gather qmm batched kernel and refactoring of quantized (#2078)

Angelos Katharopoulos
2025-04-17 13:53:11 -07:00
committed by GitHub
parent 99eefd2ec0
commit 5de6d94a90
15 changed files with 1479 additions and 449 deletions

@@ -1352,6 +1352,7 @@ array gather_qmm(
     bool transpose = true,
     int group_size = 64,
     int bits = 4,
+    bool sorted_indices = false,
     StreamOrDevice s = {});
 /** Returns a contraction of a and b over multiple dimensions. */