mlx/mlx
Arkar Min Aung cb4dc59a9e feat(benchmarks): add comprehensive SVD performance benchmarks
Add benchmarks for Metal SVD implementation as required by CONTRIBUTING.md:
- Square matrix benchmarks (64x64 to 512x512)
- Rectangular matrix benchmarks
- Batched matrix benchmarks
- CPU vs GPU performance comparison
- Special matrices (identity, diagonal, zero)

Benchmarks validate performance improvements from GPU acceleration
and help identify performance regressions in future changes.

Usage:
  python benchmarks/python/svd_bench.py --gpu
  python benchmarks/python/svd_bench.py --compare
  python benchmarks/python/svd_bench.py --all
2025-06-15 18:09:11 +10:00
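The usage lines in the commit message above name three flags (`--gpu`, `--compare`, `--all`). The following is a minimal sketch of how a benchmark script could parse those flags and time a workload with warmup runs. It is not the contents of `svd_bench.py`: the helper names `time_fn` and `parse_args` are hypothetical, and the timed workload is a plain-Python stand-in rather than an SVD call.

```python
import argparse
import time


def time_fn(fn, *args, warmup=2, iters=5):
    """Return the mean wall-clock seconds per call of fn(*args).

    Hypothetical helper. Warmup calls run first so one-time costs
    (caching, kernel compilation) are excluded from the timed loop.
    """
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters


def parse_args(argv=None):
    """Parse the three flags listed in the commit message (assumed to be booleans)."""
    parser = argparse.ArgumentParser(description="SVD benchmark sketch")
    parser.add_argument("--gpu", action="store_true", help="run GPU benchmarks")
    parser.add_argument("--compare", action="store_true", help="compare CPU and GPU")
    parser.add_argument("--all", action="store_true", help="run every benchmark")
    return parser.parse_args(argv)


# Stand-in workload: sorting a reversed list. A real benchmark would
# time the SVD call here and force evaluation before stopping the clock.
args = parse_args(["--compare"])
mean_s = time_fn(sorted, list(range(10_000, 0, -1)))
print(f"compare={args.compare} mean={mean_s * 1e3:.3f} ms")
```

When timing a lazily evaluated framework, the timed function has to force the computation to finish inside the loop. Otherwise the measurement covers only graph construction.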
3rdparty jagrit's commit files 2023-11-29 10:52:08 -08:00
backend feat(benchmarks): add comprehensive SVD performance benchmarks 2025-06-15 18:09:11 +10:00
distributed Make sliceUpdate general (#2282) 2025-06-12 16:48:54 -07:00
io Remove static initializers (#2059) 2025-04-24 06:14:49 -07:00
types fix pinv (#2110) 2025-04-23 13:08:28 -07:00
allocator.cpp Add stats and limit to common allocator and enable tests (#1988) 2025-03-21 12:28:36 -07:00
allocator.h Add stats and limit to common allocator and enable tests (#1988) 2025-03-21 12:28:36 -07:00
array.cpp reduce binary size (#1952) 2025-03-11 06:30:44 -07:00
array.h Add complex eigh (#2191) 2025-05-18 00:18:43 -07:00
CMakeLists.txt start cuda circle config (#2256) 2025-06-10 21:19:47 -07:00
compile_impl.h Simplify removes no-ops from the tape (#1759) 2025-01-09 11:23:19 -08:00
compile.cpp Share more common code in Compiled (#2240) 2025-06-03 16:48:50 -07:00
compile.h fix function pointer (#1865) 2025-02-13 18:46:11 -08:00
device.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
device.h Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
dtype_utils.cpp Introduce macros for dispatching dynamic dtypes as static types (#2073) 2025-04-19 06:16:30 -07:00
dtype_utils.h Introduce macros for dispatching dynamic dtypes as static types (#2073) 2025-04-19 06:16:30 -07:00
dtype.cpp fix double type promotion (#1901) 2025-02-25 06:00:53 -08:00
dtype.h Fp64 on the CPU (#1843) 2025-02-07 15:52:22 -08:00
einsum.cpp Einsum ellipsis (#1788) 2025-01-25 01:28:03 -08:00
einsum.h Einsum (#1269) 2024-07-25 09:36:44 -07:00
event.h Remove Event::Signal() (#2052) 2025-04-08 06:20:27 -07:00
export_impl.h Export / import functions to / from a file (#1642) 2024-12-24 11:19:13 -08:00
export.cpp fix export to work with gather/scatter axis (#2263) 2025-06-09 20:37:27 -07:00
export.h Use unordered map for kwargs in export/import (#2087) 2025-04-21 07:17:22 -07:00
fast_primitives.h Fast primitives decide when to use the fallback (#2216) 2025-06-02 13:26:37 -07:00
fast.cpp Fix unintuitive metal kernel caching (#2242) 2025-06-06 20:08:15 -07:00
fast.h Add new sdpa function overload (#2035) 2025-04-03 11:58:28 -07:00
fence.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
fft.cpp add fftshift and ifftshift fft helpers (#2135) 2025-04-29 22:13:45 -07:00
fft.h add fftshift and ifftshift fft helpers (#2135) 2025-04-29 22:13:45 -07:00
graph_utils.cpp Optionally specify names for arrays when exporting (#1749) 2025-01-06 13:07:46 -08:00
graph_utils.h Optionally specify names for arrays when exporting (#1749) 2025-01-06 13:07:46 -08:00
io.h Added missing unordered_map includes (#1635) 2024-12-02 07:03:03 -08:00
linalg.cpp feat(metal): implement complete Metal SVD with Jacobi algorithm 2025-06-15 17:44:38 +10:00
linalg.h non-symmetric eig and eigh (#2188) 2025-05-15 13:01:44 -07:00
memory.h move memory APIs into top level mlx.core (#1982) 2025-03-21 07:25:12 -07:00
mlx.h start cuda circle config (#2256) 2025-06-10 21:19:47 -07:00
ops.cpp Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm (#2220) 2025-06-02 15:58:46 -07:00
ops.h Fix typos (#2136) 2025-04-29 07:26:05 -07:00
primitives.cpp reduce vjp for all and any (#2193) 2025-05-16 08:38:49 -07:00
primitives.h fix conv export (#2265) 2025-06-10 09:34:01 -07:00
random.cpp Add random normal distribution for complex numbers (#2182) 2025-05-13 22:43:45 -07:00
random.h Add random normal distribution for complex numbers (#2182) 2025-05-13 22:43:45 -07:00
scheduler.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
scheduler.h Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
stream.h Export / import functions to / from a file (#1642) 2024-12-24 11:19:13 -08:00
threadpool.h Ring distributed backend (#1784) 2025-01-27 22:15:01 -08:00
transforms_impl.h Remove static initializers (#2059) 2025-04-24 06:14:49 -07:00
transforms.cpp Perf regression fix (#2243) 2025-06-03 17:55:12 -07:00
transforms.h Export / import functions to / from a file (#1642) 2024-12-24 11:19:13 -08:00
utils.cpp fix pinv (#2110) 2025-04-23 13:08:28 -07:00
utils.h fix pinv (#2110) 2025-04-23 13:08:28 -07:00
version.cpp Do not define MLX_VERSION globally (#1966) 2025-03-18 07:12:40 -07:00
version.h Perf regression fix (#2243) 2025-06-03 17:55:12 -07:00