Default Branch

c5460762e7 · Fix AdamW weight_decay default value in docstring (#2557) · Updated 2025-09-01 12:29:30 +08:00

Branches

5d74921cf2 · add sdpa with sinks · Updated 2025-09-01 01:59:50 +08:00

Behind 1 · Ahead 1

57a4334bbc · rebase · Updated 2025-08-30 01:13:50 +08:00

Behind 1 · Ahead 68

4987e7615a · Improve the cutlass gemm · Updated 2025-08-26 09:18:19 +08:00

Behind 28 · Ahead 8

400f8457ea · Experimenting with a gemm based on the cuda steel utils · Updated 2025-08-15 02:27:50 +08:00

Behind 46 · Ahead 1

a22d0bf273 · Add stricter condition to matrix sdpa · Updated 2025-08-07 10:51:14 +08:00

Behind 56 · Ahead 8

qmm

8269c9d02d · Support unaligned M · Updated 2025-07-23 15:40:27 +08:00

Behind 108 · Ahead 6

a9c720e8cd · Improve the ring backend initialization · Updated 2025-07-12 06:31:28 +08:00

Behind 132 · Ahead 1

870208eff5 · Start sdpa vector · Updated 2025-06-17 08:38:39 +08:00

Behind 167 · Ahead 1

fft

83762691ba · Fix four step fft · Updated 2025-05-09 05:14:59 +08:00    zhangyiss

Behind 245 · Ahead 6

7c99acb799 · split logsumexp · Updated 2025-05-07 08:10:14 +08:00    zhangyiss

Behind 246 · Ahead 1

998404ada4 · Get trellis to run · Updated 2025-04-26 22:02:20 +08:00    zhangyiss

Behind 285 · Ahead 3

11f73d6e89 · Double buffer keys for vector sdpa · Updated 2025-04-22 15:19:11 +08:00    zhangyiss

Behind 274 · Ahead 1

4c46e17a5d · Update benchmark output · Updated 2025-04-16 01:50:06 +08:00    zhangyiss

Behind 284 · Ahead 1

67ec27d515 · synch before reading memory in test · Updated 2025-04-08 05:37:32 +08:00    zhangyiss

Behind 298 · Ahead 4

066336b60e · load q4_k from gguf · Updated 2025-04-04 01:56:12 +08:00    zhangyiss

Behind 308 · Ahead 1

688e421184 · only interrupt during an eval · Updated 2025-03-19 22:56:26 +08:00    zhangyiss

Behind 355 · Ahead 2

127de8821e · Fix the sig_handler check · Updated 2025-03-08 09:31:06 +08:00    zhangyiss

Behind 369 · Ahead 2

c5073fc452 · Ensure we only have one copy of the fence · Updated 2025-03-05 15:37:15 +08:00    zhangyiss

Behind 378 · Ahead 3

4c1dfa58b7 · xor op on arrays (#1875) · Updated 2025-02-17 16:24:53 +08:00    zhangyiss

Behind 405 · Ahead 0 · Included

4515866024 · Change the linux test to ubuntu 24.04 · Updated 2025-01-21 14:58:05 +08:00    zhangyiss

Behind 459 · Ahead 9