张壹 zhangyiss
  • Joined on 2024-09-10
zhangyiss synced commits to pre-commit-ci-update-config at zhangyiss/the-littlest-jupyterhub from mirror 2025-11-04 05:28:08 +08:00
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-11-04 02:28:14 +08:00
1ff2b713b6 Check isnan in maximum / minimum with CPU backend (#2652)
50514a6146 Set up publishing to PyPI and Test-PyPI (#2721)
93d76b0f30 Fix compile multi capture (#2678)
78678de0cd add null check -- the bundleIdentifier is optional (#2709)
ed9c6b1117 update: add linux fedora container CI - CPP build test only (#2722)
zhangyiss synced and deleted reference refs/tags/fix_compiel_multi_capture at zhangyiss/mlx from mirror 2025-11-04 02:28:13 +08:00
zhangyiss synced commits to async_cuda_malloc at zhangyiss/mlx from mirror 2025-11-04 02:28:13 +08:00
742033fefe remove use of cuda pool, use cuda free async
zhangyiss synced commits to async_cuda_malloc at zhangyiss/mlx from mirror 2025-11-02 09:18:11 +08:00
c27a0647a3 load eval gpu for cuda
zhangyiss synced commits to async_cuda_malloc at zhangyiss/mlx from mirror 2025-11-02 01:08:11 +08:00
190f4cc19a load eval gpu for cuda
zhangyiss synced new reference sign-warns to zhangyiss/mlx from mirror 2025-11-01 08:38:16 +08:00
zhangyiss synced commits to sign-warns at zhangyiss/mlx from mirror 2025-11-01 08:38:15 +08:00
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-11-01 08:38:14 +08:00
39b04ce638 use faster dequant for fp4 qmv (#2720)
zhangyiss synced commits to async_cuda_malloc at zhangyiss/mlx from mirror 2025-11-01 08:38:13 +08:00
d378567cc6 refactor for regular cuda malloc
b84fc978d3 add pool threshold
764b4b7ce8 Use async cuda malloc managed with cuda 13
74c1ed25bb Migrate CircleCI to GitHub Actions (#2716)
ec72b44417 Add quantize/dequantize for mxfp8 and nvfp4 (#2688)
Compare 12 commits »
zhangyiss synced and deleted reference refs/tags/no_fp4_lut at zhangyiss/mlx from mirror 2025-11-01 08:38:12 +08:00
zhangyiss synced commits to no_fp4_lut at zhangyiss/mlx from mirror 2025-11-01 00:28:11 +08:00
9aa8483adf use faster dequant for fp4 qmv
zhangyiss synced commits to no_fp4_lut at zhangyiss/mlx from mirror 2025-10-31 16:21:28 +08:00
zhangyiss synced new reference no_fp4_lut to zhangyiss/mlx from mirror 2025-10-31 16:21:28 +08:00
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-10-31 16:21:27 +08:00
d9e6349657 fix docs path (#2719)
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-10-31 07:58:14 +08:00
b901a9f311 Fix the order of hosts in the ring (#2718)
68c5fa1c95 fix memory count bug (#2717)
793a31eeb6 Fix missing domain_uuid_key in thunderbolt ring setup (#2682)
74c1ed25bb Migrate CircleCI to GitHub Actions (#2716)
zhangyiss synced and deleted reference refs/tags/mxfp8_and_nvfp4 at zhangyiss/mlx from mirror 2025-10-29 14:58:11 +08:00
zhangyiss synced commits to main at zhangyiss/mlx from mirror 2025-10-29 14:58:11 +08:00
ec72b44417 Add quantize/dequantize for mxfp8 and nvfp4 (#2688)
zhangyiss synced commits to mxfp8_and_nvfp4 at zhangyiss/mlx from mirror 2025-10-29 06:53:18 +08:00
83062b70e4 fix output type of mxfp4 matmuls
5a043fd793 improve quant docs
zhangyiss synced commits to mxfp8_and_nvfp4 at zhangyiss/mlx from mirror 2025-10-28 22:28:11 +08:00
fdacaa3ab9 improve quant docs
94fe5114fa add default bits and group sizes