zhangyiss/mlx - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Author	SHA1	Message	Date
Cheng	940f4c7818	Fix building with CUDA < 12.8 (#2782 ) Some checks failed Build and Test / check_lint (push) Has been cancelled Details Build and Test / linux_build_and_test (ubuntu-22.04) (push) Has been cancelled Details Build and Test / linux_build_and_test (ubuntu-22.04-arm) (push) Has been cancelled Details Build and Test / mac_build_and_test (14.0) (push) Has been cancelled Details Build and Test / mac_build_and_test (15.0) (push) Has been cancelled Details Build and Test / cuda_build_and_test (cuda-12.6) (push) Has been cancelled Details Build and Test / cuda_build_and_test (cuda-12.9) (push) Has been cancelled Details Build and Test / build_documentation (push) Has been cancelled Details Build and Test / Linux Fedora CPP Build (aarch64) (push) Has been cancelled Details Build and Test / Linux Fedora CPP Build (x86_64) (push) Has been cancelled Details Nightly Build / build_linux_release (3.10) (push) Has been cancelled Details Nightly Build / build_linux_release (3.14) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.11, ubuntu-22.04-arm) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.12, ubuntu-22.04-arm) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.13, ubuntu-22.04-arm) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.14, ubuntu-22.04-arm) (push) Has been cancelled Details Nightly Build / build_mac_release (3.10) (push) Has been cancelled Details Nightly Build / build_mac_release (3.13) (push) Has been cancelled Details Nightly Build / build_cuda_release (push) Has been cancelled Details	2025-11-18 12:55:19 +09:00
Awni Hannun	df58b4133a	[CUDA] Reduce use of managed memory (#2725 ) Some checks failed Nightly Build / build_linux_release (3.10) (push) Has been cancelled Details Nightly Build / build_linux_release (3.14) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.10) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.11) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.12) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.13) (push) Has been cancelled Details Nightly Build / build_linux_with_tests (3.14) (push) Has been cancelled Details Nightly Build / build_mac_release (3.10) (push) Has been cancelled Details Nightly Build / build_mac_release (3.13) (push) Has been cancelled Details Nightly Build / build_cuda_with_tests (push) Has been cancelled Details Nightly Build / build_cuda_release (push) Has been cancelled Details Nightly Build / Linux Fedora CPP Build (aarch64) (push) Has been cancelled Details Nightly Build / Linux Fedora CPP Build (x86_64) (push) Has been cancelled Details * Use async cuda malloc managed with cuda 13 * add pool threshold * refactor for regular cuda malloc * load eval gpu for cuda * remove use of cuda pool, use cuda free async * fix * fix * fix * fix * fix + comment	2025-11-05 16:05:23 -08:00
Daniel Yeh	22a5da76c8	Faster complex matmul (#2571 )	2025-10-02 23:33:15 -07:00
Cheng	6a3acf2301	[CUDA] Set bias as input when using bias epilogue (#2584 )	2025-09-11 15:31:09 +09:00
Cheng	44cc5da4bc	[CUDA] Fix alpha not respected when using bias epilogue (#2578 )	2025-09-10 09:08:01 +09:00
Cheng	dde3682b69	[CUDA] Use GEMM with epilogue instead of AddMM (#2569 )	2025-09-09 13:18:49 +09:00
Cheng	ac85ddfdb7	[CUDA] Add GEMM-based fallback convolution kernels (#2511 ) * Add gemm_conv * Add gemm_grouped_conv	2025-08-20 10:06:22 +09:00
Cheng	dfb5022eab	Rename cu::Matmul to CublasGemm (#2488 )	2025-08-13 09:37:40 +09:00
Cheng	aa7b47481a	[CUDA] Optimize set_mm_device_pointers for small ndim (#2473 )	2025-08-08 15:23:30 +09:00
Awni Hannun	7bb96e4249	fix cublas on h100 (#2466 )	2025-08-06 06:18:58 -07:00
Angelos Katharopoulos	be9bc96da4	[CUDA] Matmul utils initial commit (#2441 )	2025-08-01 14:22:25 -07:00
Awni Hannun	d32519c8ee	fix gemv regression (#2445 )	2025-07-30 14:23:01 -07:00
Awni Hannun	ef631d63af	faster rms norm (#2433 )	2025-07-29 13:12:00 -07:00
Awni Hannun	ab0e608862	[CUDA] More sizes for gemv (#2429 ) * route more to gemv * route more sizes to custom gemv	2025-07-28 12:35:01 -07:00
Awni Hannun	9acec364c2	[CUDA] Always use batched matmul (#2404 ) * cuda batched mm * addmm as well * comment	2025-07-24 20:46:02 -07:00

15 Commits