张壹 zhangyiss
  • Joined on 2024-09-10
zhangyiss synced commits to refs/pull/2074/merge at zhangyiss/mlx from mirror 2025-08-21 12:06:45 +08:00
25c1e03205 Fix overflow in large filter small channels (#2520)
512281781c Remove state return from function example in compile documentation (#2518)
ac85ddfdb7 [CUDA] Add GEMM-based fallback convolution kernels (#2511)
zhangyiss synced commits to quantize_mode at zhangyiss/mlx from mirror 2025-08-21 12:06:44 +08:00
8216b3f51e cpu mxfp4
bca243fbaa speedup
f109d85bdd mxfp4 works
607f05d808 mxfp4 quantize/dequantize + start of optional biases
38ff386fa3 add mode parameter for quantization
zhangyiss synced commits to fix_power at zhangyiss/mlx from mirror 2025-08-21 12:06:43 +08:00
zhangyiss synced new reference fix_power to zhangyiss/mlx from mirror 2025-08-21 12:06:43 +08:00
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-08-21 12:06:43 +08:00
0c5fc63a36 Fix docs omission (#2524)
e397177f6e Custom cuda kernel (#2517)
f4c8888cbe [CUDA] Fix stride of singleton dims before passing to cuDNN (#2521)
zhangyiss synced and deleted reference refs/tags/custom-cuda-kernel at zhangyiss/mlx from mirror 2025-08-21 12:06:42 +08:00
zhangyiss synced and deleted reference refs/tags/refs/pull/2517/merge at zhangyiss/mlx from mirror 2025-08-21 12:06:42 +08:00
zhangyiss synced and deleted reference refs/tags/refs/pull/2521/merge at zhangyiss/mlx from mirror 2025-08-21 12:06:42 +08:00
zhangyiss synced commits to refs/pull/2521/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:47 +08:00
25c1e03205 Fix overflow in large filter small channels (#2520)
zhangyiss synced commits to refs/pull/2476/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:46 +08:00
3bb6b1d44a added get_device to do reductions on the cpu if metal
25c1e03205 Fix overflow in large filter small channels (#2520)
4ee0d0bb55 removed nproc-per-node
cd53eb1ae3 dispatch types with dtype_utils
zhangyiss synced commits to refs/pull/2517/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:46 +08:00
25c1e03205 Fix overflow in large filter small channels (#2520)
zhangyiss synced commits to refs/pull/2401/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:45 +08:00
25c1e03205 Fix overflow in large filter small channels (#2520)
512281781c Remove state return from function example in compile documentation (#2518)
ac85ddfdb7 [CUDA] Add GEMM-based fallback convolution kernels (#2511)
65d0d40232 Split cuDNN helpers into a separate header (#2491)
zhangyiss synced commits to refs/pull/2476/head at zhangyiss/mlx from mirror 2025-08-21 02:46:45 +08:00
3bb6b1d44a added get_device to do reductions on the cpu if metal
4ee0d0bb55 removed nproc-per-node
cd53eb1ae3 dispatch types with dtype_utils
f7c11b965e Merge branch 'main' into nccl_backend
512281781c Remove state return from function example in compile documentation (#2518)
zhangyiss synced commits to refs/pull/2290/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:44 +08:00
512281781c Remove state return from function example in compile documentation (#2518)
ac85ddfdb7 [CUDA] Add GEMM-based fallback convolution kernels (#2511)
65d0d40232 Split cuDNN helpers into a separate header (#2491)
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-08-21 02:46:42 +08:00
25c1e03205 Fix overflow in large filter small channels (#2520)
zhangyiss synced and deleted reference refs/tags/refs/pull/2520/merge at zhangyiss/mlx from mirror 2025-08-21 02:46:41 +08:00
zhangyiss synced and deleted reference refs/tags/fix-conv-large-filter at zhangyiss/mlx from mirror 2025-08-21 02:46:40 +08:00
zhangyiss synced commits to refs/pull/2517/merge at zhangyiss/mlx from mirror 2025-08-20 18:39:52 +08:00
512281781c Remove state return from function example in compile documentation (#2518)
d6b204b528 comments
fa56bf2feb Remove completion handler from custom kernel
39dbd92df5 Make threadgroup size less or equal to grid size
zhangyiss synced commits to refs/pull/2499/merge at zhangyiss/mlx from mirror 2025-08-20 18:39:51 +08:00
03217596d2 Merge e6437b7dd8df2a381ad5815f78e1ef5f9b3b1ba9 into 512281781c
512281781c Remove state return from function example in compile documentation (#2518)
ac85ddfdb7 [CUDA] Add GEMM-based fallback convolution kernels (#2511)
zhangyiss synced commits to refs/pull/2517/head at zhangyiss/mlx from mirror 2025-08-20 18:39:51 +08:00
d6b204b528 comments
fa56bf2feb Remove completion handler from custom kernel
39dbd92df5 Make threadgroup size less or equal to grid size
432c02dabc Typo in test
fa555c536a Remove regex