张壹 zhangyiss
  • Joined on 2024-09-10
zhangyiss synced commits to refs/pull/2483/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:48 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
zhangyiss synced commits to refs/pull/2401/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:47 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
ac207ce7aa make code blocks copyable (#2480)
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to refs/pull/2472/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:47 +08:00
eef0f23b5e Merge 7ebb9fe6683b3a431edd2e098bce77bbfd49ab5e into dfb5022eab
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
ac207ce7aa make code blocks copyable (#2480)
zhangyiss synced commits to refs/pull/2476/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:47 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
zhangyiss synced commits to refs/pull/2482/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:47 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
ac207ce7aa make code blocks copyable (#2480)
zhangyiss synced commits to refs/pull/2300/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:46 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
zhangyiss synced and deleted reference refs/tags/refs/pull/2484/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:45 +08:00
zhangyiss synced commits to refs/pull/2074/merge at zhangyiss/mlx from mirror 2025-08-13 20:56:45 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
ac207ce7aa make code blocks copyable (#2480)
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to simple-gemm at zhangyiss/mlx from mirror 2025-08-13 12:46:41 +08:00
4149576625 Add a cutlass gemm
6592af4d45 More pipelining for the sm_80 gemm
8997322e47 Improve gemm
9163efb0e0 Remove duplicate register tile
02369ba493 Simple gemm example
zhangyiss synced commits to refs/pull/2300/merge at zhangyiss/mlx from mirror 2025-08-13 12:46:41 +08:00
ac207ce7aa make code blocks copyable (#2480)
zhangyiss synced commits to refs/pull/2476/merge at zhangyiss/mlx from mirror 2025-08-13 12:46:41 +08:00
ac207ce7aa make code blocks copyable (#2480)
zhangyiss synced commits to refs/pull/2484/merge at zhangyiss/mlx from mirror 2025-08-13 12:46:41 +08:00
ac207ce7aa make code blocks copyable (#2480)
zhangyiss synced and deleted reference refs/tags/refs/pull/2488/merge at zhangyiss/mlx from mirror 2025-08-13 12:46:40 +08:00
zhangyiss synced commits to custom-cuda-kernel at zhangyiss/mlx from mirror 2025-08-13 12:46:40 +08:00
2a73000639 Working custom kernels jointly
70fefcf8e5 Add custom kernel for CUDA
ac207ce7aa make code blocks copyable (#2480)
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-08-13 12:46:40 +08:00
dfb5022eab Rename cu::Matmul to CublasGemm (#2488)
zhangyiss synced commits to refs/pull/2483/merge at zhangyiss/mlx from mirror 2025-08-13 04:36:50 +08:00
ac207ce7aa make code blocks copyable (#2480)
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to refs/pull/2486/merge at zhangyiss/mlx from mirror 2025-08-13 04:36:50 +08:00
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to refs/pull/2300/merge at zhangyiss/mlx from mirror 2025-08-13 04:36:49 +08:00
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to refs/pull/2476/merge at zhangyiss/mlx from mirror 2025-08-13 04:36:49 +08:00
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)
zhangyiss synced commits to refs/pull/2482/merge at zhangyiss/mlx from mirror 2025-08-13 04:36:49 +08:00
fce53b61d6 Fix reduce sum/prod overflow (#2477)
8ae4a76308 Use CMake <4.1 to avoid the nvpl error (#2489)