张壹 zhangyiss
  • Joined on 2024-09-10
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-08-02 08:46:44 +08:00
be9bc96da4 [CUDA] Matmul utils initial commit (#2441)
zhangyiss synced and deleted reference refs/tags/steel-init at zhangyiss/mlx from mirror 2025-08-02 08:46:43 +08:00
zhangyiss synced and deleted reference refs/tags/test-ci at zhangyiss/mlx from mirror 2025-08-02 08:46:43 +08:00
zhangyiss synced and deleted reference refs/tags/refs/pull/2441/merge at zhangyiss/mlx from mirror 2025-08-02 08:46:43 +08:00
zhangyiss synced commits to fix-arctan2-grads at zhangyiss/mlx from mirror 2025-08-02 08:46:43 +08:00
zhangyiss synced new reference fix-arctan2-grads to zhangyiss/mlx from mirror 2025-08-02 08:46:43 +08:00
zhangyiss synced commits to refs/pull/2401/merge at zhangyiss/mlx from mirror 2025-08-02 00:36:50 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
Compare 7 commits »
zhangyiss synced commits to refs/pull/1970/merge at zhangyiss/mlx from mirror 2025-08-02 00:36:49 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
Compare 8 commits »
zhangyiss synced commits to refs/pull/2449/head at zhangyiss/mlx from mirror 2025-08-01 16:27:00 +08:00
1712e3c2f8 Remove unneeded check
zhangyiss synced commits to refs/pull/2450/head at zhangyiss/mlx from mirror 2025-08-01 16:27:00 +08:00
5659b12730 chore: change function with a destination dictonary object
a16501fe03 revert: default destination to orignal list type if None
Compare 2 commits »
zhangyiss synced commits to refs/pull/2450/merge at zhangyiss/mlx from mirror 2025-08-01 16:27:00 +08:00
5659b12730 chore: change function with a destination dictonary object
a16501fe03 revert: default destination to orignal list type if None
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
Compare 6 commits »
zhangyiss synced commits to refs/pull/2444/head at zhangyiss/mlx from mirror 2025-08-01 16:26:59 +08:00
25ad6ab443 Instantiate only vectorized contiguous kernels
46b01279f8 Vectorize generated kernels
d32519c8ee fix gemv regression (#2445)
b405591249 fix circular reference (#2443)
3bf81ed1bd [CUDA] Quantized refactoring (#2442)
Compare 7 commits »
zhangyiss synced commits to refs/pull/2448/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:59 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
Compare 4 commits »
zhangyiss synced commits to refs/pull/2441/head at zhangyiss/mlx from mirror 2025-08-01 16:26:58 +08:00
c456d59e9f Add dynamic shared memory
1523b803f3 Early stages of the steel utils
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
da5912e4f2 fix custom metal extension (#2446)
daafee676f Fix wrong graph key when using concurrent context (#2447)
Compare 10 commits »
zhangyiss synced commits to refs/pull/2441/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:58 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
c456d59e9f Add dynamic shared memory
Compare 12 commits »
zhangyiss synced commits to refs/pull/2300/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:57 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
Compare 5 commits »
zhangyiss synced commits to refs/pull/2434/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:57 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
Compare 7 commits »
zhangyiss synced commits to refs/pull/2290/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:56 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
8b25ce62d5 Add tests for export including control flow models and quantized models (#2430)
Compare 19 commits »
zhangyiss synced commits to refs/pull/2219/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:55 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
Compare 4 commits »
zhangyiss synced commits to refs/pull/2074/merge at zhangyiss/mlx from mirror 2025-08-01 16:26:54 +08:00
86258f292f [CUDA] Vectorize generated kernels (#2444)
b26d88591c [CUDA] Save primitive inputs faster (#2449)
86c6a15571 [CUDA] Backward convolution (#2431)
Compare 4 commits »