mlx/mlx/backend/metal
Arkar Min Aung 6d01528e90 feat: Add benchmarking and documentation updates for Metal SVD
- Add comprehensive SVD benchmark script (benchmarks/python/svd_benchmark.py):
  * Performance comparison between CPU and GPU implementations
  * Batch processing benchmarks
  * Correctness verification tests
  * Detailed timing and speedup analysis

- Update linalg documentation to mention Metal GPU acceleration

- Add implementation summary document for development reference

This addresses CONTRIBUTING.md requirements:
- Benchmarks for efficiency impact measurement (point 3)
- Documentation updates for API changes (point 4)
- Comprehensive testing coverage (point 2)
2025-06-14 17:28:19 +10:00
jit feat: Add Metal SVD infrastructure and parameter structures 2025-06-13 23:28:52 +10:00
kernels feat: Add benchmarking and documentation updates for Metal SVD 2025-06-14 17:28:19 +10:00
allocator.cpp Add memory cache to CUDA backend (#2221) 2025-05-30 12:12:54 -07:00
allocator.h Add memory cache to CUDA backend (#2221) 2025-05-30 12:12:54 -07:00
binary.cpp Improve metal elementwise kernels (#2247) 2025-06-06 11:37:40 -07:00
binary.h Fixes for large arrays with a few ops (#1299) 2024-07-30 17:18:39 -07:00
CMakeLists.txt feat: Add Metal SVD infrastructure and parameter structures 2025-06-13 23:28:52 +10:00
compiled.cpp Improve metal elementwise kernels (#2247) 2025-06-06 11:37:40 -07:00
conv.cpp Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
copy.cpp Improve metal elementwise kernels (#2247) 2025-06-06 11:37:40 -07:00
custom_kernel.cpp Fix unintuitive metal kernel caching (#2242) 2025-06-06 20:08:15 -07:00
device.cpp Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
device.h Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
distributed.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
eval.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
event.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
fence.cpp fix input coherent kernel launch (#2153) 2025-05-05 17:30:50 -07:00
fft.cpp Fix fft for integer overflow (#2161) 2025-05-09 14:25:12 -07:00
hadamard.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
indexing.cpp Add remove_index utility (#2173) 2025-05-13 17:09:56 -07:00
jit_kernels.cpp feat: Implement basic one-sided Jacobi SVD algorithm in Metal 2025-06-14 17:05:10 +10:00
kernels.h feat: Add Metal SVD infrastructure and parameter structures 2025-06-13 23:28:52 +10:00
logsumexp.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
make_compiled_preamble.sh Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
matmul.cpp Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
matmul.h Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
metal.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
metal.h Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
no_metal.cpp start cuda circle config (#2256) 2025-06-10 21:19:47 -07:00
nojit_kernels.cpp Add load_safe to the general conv loaders (#2258) 2025-06-10 20:58:16 -07:00
normalization.cpp Collection of refactors (#2274) 2025-06-13 10:44:56 -07:00
primitives.cpp feat: Add Metal SVD infrastructure and parameter structures 2025-06-13 23:28:52 +10:00
quantized.cpp 5bit quants (#2226) 2025-05-30 12:12:10 -07:00
reduce.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
reduce.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
resident.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
resident.h Wired (#1510) 2024-10-25 09:35:33 -07:00
rope.cpp Fast primitives decide when to use the fallback (#2216) 2025-06-02 13:26:37 -07:00
scaled_dot_product_attention.cpp Fix unintuitive metal kernel caching (#2242) 2025-06-06 20:08:15 -07:00
scan.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
slicing.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
softmax.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
sort.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
svd.cpp feat: Add benchmarking and documentation updates for Metal SVD 2025-06-14 17:28:19 +10:00
ternary.cpp Improve metal elementwise kernels (#2247) 2025-06-06 11:37:40 -07:00
ternary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
unary.cpp CUDA backend: unary ops (#2158) 2025-06-09 06:45:08 -07:00
unary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
utils.cpp Move some dims utils to common (#2223) 2025-05-29 06:48:30 -07:00
utils.h Improve metal elementwise kernels (#2247) 2025-06-06 11:37:40 -07:00