mlx/python at cuda-reduce - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-06-24 01:17:26 +08:00

Angelos Katharopoulos ab7c310914 Adapt the torch benchmark to run in CUDA		2025-06-20 21:49:15 -07:00
blas	Update GEMM (#424)	2024-01-17 12:42:39 -08:00
comparative	Adapt the torch benchmark to run in CUDA	2025-06-20 21:49:15 -07:00
batch_matmul_bench.py	Add isort pre-commit and run (#68)	2023-12-08 11:31:47 -08:00
compile_bench.py	Add softmin, hardshrink, hardtanh (#1180)	2024-06-04 15:48:18 -07:00
conv1d_bench.py	Add groups to Conv1d (#948)	2024-04-27 06:24:57 -07:00
conv2d_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv2d_train_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv2d_transpose_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv3d_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv3d_train_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv3d_transpose_bench_cpu.py	Conv cpu improvements (#1410)	2024-09-15 18:45:10 -07:00
conv_bench.py	Add softmin, hardshrink, hardtanh (#1180)	2024-06-04 15:48:18 -07:00
conv_transpose_bench.py	Transposed Convolution (#1245)	2024-09-06 19:52:38 -07:00
conv_unaligned_bench.py	Add load_safe to the general conv loaders (#2258)	2025-06-10 20:58:16 -07:00
distributed_bench.py	MPI ops in GPU stream for faster comms (#1356)	2024-08-26 15:12:50 -07:00
einsum_bench.py	Einsum (#1269)	2024-07-25 09:36:44 -07:00
fft_bench.py	Feature complete Metal FFT (#1102)	2024-06-06 12:57:25 -07:00
gather_bench.py	Remove unused modules (#1949)	2025-03-10 06:05:26 -07:00
gather_mm_bench.py	Gather qmm batched kernel and refactoring of quantized (#2078)	2025-04-17 13:53:11 -07:00
gather_qmm_bench.py	Gather qmm batched kernel and refactoring of quantized (#2078)	2025-04-17 13:53:11 -07:00
hadamard_bench.py	Fast Hadamard Transform (#1249)	2024-07-09 20:39:01 -07:00
layer_norm_bench.py	Change layernorms to two pass algorithm (#2246)	2025-06-06 13:34:56 -07:00
rms_norm_bench.py	RMS norm without scaling (#1915)	2025-02-28 20:26:57 -08:00
rope_bench.py	Fix copy donation and add partial rope (#881)	2024-03-22 17:28:26 -07:00
scatter_bench.py	improvements to scatter / gather (#1541)	2024-10-30 19:30:54 -07:00
sdpa_bench.py	Support fused masking in Attention (#1924)	2025-03-20 11:01:32 -07:00
sdpa_vector_bench.py	Allow different value dimensions in sdpa_vector (#1811)	2025-01-31 20:58:59 -08:00
single_ops.py	Propagate nans in binary ops (#579)	2024-01-29 11:19:38 -08:00
synchronize_bench.py	Faster synchronization `Fence` primitive (#1773)	2025-01-17 18:42:19 -08:00
time_utils.py	Shapeless compilation for some graphs (#687)	2024-02-19 21:43:54 -08:00