mlx/mlx at 3d8c7583f2fb6cdc2ec0ca959c1ca995a4600cd2 - mlx - Gitea for Geophysics

zhangyiss/mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-09-18 01:50:16 +08:00

Arkar Min Aung 3d8c7583f2 feat: Implement basic one-sided Jacobi SVD algorithm in Metal

- Add complete Metal kernel implementations for SVD computation:
  * svd_preprocess: Computes A^T * A matrix
  * svd_jacobi_iteration: Performs Jacobi rotations to diagonalize
  * svd_extract_singular_values: Extracts singular values from diagonal
  * svd_compute_vectors: Computes singular vectors (basic implementation)

- Update host-side implementation to orchestrate kernel execution:
  * Allocate workspace for A^T * A and rotation storage
  * Execute preprocessing, iteration, and extraction phases
  * Handle both singular values only and full SVD modes

- Add proper template instantiations for float and double precision

This provides a working Metal SVD implementation using the Jacobi method.
Performance optimizations and convergence checking will follow.
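
The three phases named in the commit message (form A^T * A, diagonalize it with Jacobi rotations, read the singular values off the diagonal) can be illustrated on the CPU. What follows is a minimal sketch in plain C++, not the Metal kernels from this commit: the function name `jacobi_singular_values`, its parameters, and the convergence test are all assumptions made for illustration. The commit itself states that convergence checking is not yet implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of MLX): singular values of a row-major
// m x n matrix, obtained by Jacobi rotations on the Gram matrix G = A^T A.
std::vector<double> jacobi_singular_values(
    const std::vector<double>& a, std::size_t m, std::size_t n,
    int max_sweeps = 50, double tol = 1e-12) {
  assert(a.size() == m * n);

  // Phase 1 (cf. svd_preprocess): G = A^T A, an n x n symmetric matrix.
  std::vector<double> g(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < m; ++k)
        g[i * n + j] += a[k * n + i] * a[k * n + j];

  // Phase 2 (cf. svd_jacobi_iteration): cyclic sweeps over all pairs (p, q).
  for (int sweep = 0; sweep < max_sweeps; ++sweep) {
    // Assumed stopping rule: off-diagonal energy small relative to the total.
    double off = 0.0, total = 0.0;
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j) {
        const double v = g[i * n + j] * g[i * n + j];
        total += v;
        if (i != j) off += v;
      }
    if (off <= tol * tol * total) break;

    for (std::size_t p = 0; p + 1 < n; ++p) {
      for (std::size_t q = p + 1; q < n; ++q) {
        const double gpq = g[p * n + q];
        if (gpq == 0.0) continue;  // already diagonal in this 2 x 2 block
        // Rotation angle that zeroes G[p][q]; t is the smaller root tan(phi).
        const double theta = (g[q * n + q] - g[p * n + p]) / (2.0 * gpq);
        const double t = (theta >= 0.0 ? 1.0 : -1.0) /
            (std::fabs(theta) + std::sqrt(theta * theta + 1.0));
        const double c = 1.0 / std::sqrt(t * t + 1.0);
        const double s = t * c;
        // G <- G * J (rotate columns p and q).
        for (std::size_t k = 0; k < n; ++k) {
          const double gkp = g[k * n + p], gkq = g[k * n + q];
          g[k * n + p] = c * gkp - s * gkq;
          g[k * n + q] = s * gkp + c * gkq;
        }
        // G <- J^T * G (rotate rows p and q).
        for (std::size_t k = 0; k < n; ++k) {
          const double gpk = g[p * n + k], gqk = g[q * n + k];
          g[p * n + k] = c * gpk - s * gqk;
          g[q * n + k] = s * gpk + c * gqk;
        }
        // Remove round-off residue in the entry that was just annihilated.
        g[p * n + q] = 0.0;
        g[q * n + p] = 0.0;
      }
    }
  }

  // Phase 3 (cf. svd_extract_singular_values): sigma_i = sqrt(G[i][i]).
  std::vector<double> sigma(n);
  for (std::size_t i = 0; i < n; ++i)
    sigma[i] = std::sqrt(std::max(g[i * n + i], 0.0));
  std::sort(sigma.begin(), sigma.end(), std::greater<double>());
  return sigma;
}
```

Working on A^T * A keeps the iteration on a small n x n symmetric matrix, but it squares the condition number of A, so small singular values lose accuracy compared with a method that rotates the columns of A directly.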

2025-06-13 23:34:36 +10:00

jagrit's commit files

2023-11-29 10:52:08 -08:00

feat: Implement basic one-sided Jacobi SVD algorithm in Metal

2025-06-13 23:34:36 +10:00

Make sliceUpdate general (#2282 )

2025-06-12 16:48:54 -07:00

Remove static initializers (#2059 )

2025-04-24 06:14:49 -07:00

fix pinv (#2110 )

2025-04-23 13:08:28 -07:00

allocator.cpp

Add stats and limit to common allocator and enable tests (#1988 )

2025-03-21 12:28:36 -07:00

allocator.h

Add stats and limit to common allocator and enable tests (#1988 )

2025-03-21 12:28:36 -07:00

array.cpp

reduce binary size (#1952 )

2025-03-11 06:30:44 -07:00

array.h

Add complex eigh (#2191 )

2025-05-18 00:18:43 -07:00

CMakeLists.txt

start cuda circle config (#2256 )

2025-06-10 21:19:47 -07:00

compile_impl.h

Simplify removes no-ops from the tape (#1759 )

2025-01-09 11:23:19 -08:00

compile.cpp

Share more common code in Compiled (#2240 )

2025-06-03 16:48:50 -07:00

compile.h

fix function pointer (#1865 )

2025-02-13 18:46:11 -08:00

device.cpp

Generalize gpu backend (#2138 )

2025-04-30 09:08:17 -07:00

device.h

Generalize gpu backend (#2138 )

2025-04-30 09:08:17 -07:00

dtype_utils.cpp

Introduce macros for dispatching dynamic dtypes as static types (#2073 )

2025-04-19 06:16:30 -07:00

dtype_utils.h

Introduce macros for dispatching dynamic dtypes as static types (#2073 )

2025-04-19 06:16:30 -07:00

dtype.cpp

fix double type promotion (#1901 )

2025-02-25 06:00:53 -08:00

dtype.h

Fp64 on the CPU (#1843 )

2025-02-07 15:52:22 -08:00

einsum.cpp

Einsum ellipsis (#1788 )

2025-01-25 01:28:03 -08:00

einsum.h

Einsum (#1269 )

2024-07-25 09:36:44 -07:00

event.h

Remove Event::Signal() (#2052 )

2025-04-08 06:20:27 -07:00

export_impl.h

Export / import functions to / from a file (#1642 )

2024-12-24 11:19:13 -08:00

export.cpp

fix export to work with gather/scatter axis (#2263 )

2025-06-09 20:37:27 -07:00

export.h

Use unordered map for kwargs in export/import (#2087 )

2025-04-21 07:17:22 -07:00

fast_primitives.h

Fast primitives decide when to use the fallback (#2216 )

2025-06-02 13:26:37 -07:00

fast.cpp

Fix unintuitive metal kernel caching (#2242 )

2025-06-06 20:08:15 -07:00

fast.h

Add new sdpa function overload (#2035 )

2025-04-03 11:58:28 -07:00

fence.h

redesign for faster cpu/gpu synch (#1869 )

2025-03-06 19:23:38 -08:00

fft.cpp

add fftshift and ifftshift fft helpers (#2135 )

2025-04-29 22:13:45 -07:00

fft.h

add fftshift and ifftshift fft helpers (#2135 )

2025-04-29 22:13:45 -07:00

graph_utils.cpp

Optionally specify names for arrays when exporting (#1749 )

2025-01-06 13:07:46 -08:00

graph_utils.h

Optionally specify names for arrays when exporting (#1749 )

2025-01-06 13:07:46 -08:00

io.h

Added missing unordered_map includes (#1635 )

2024-12-02 07:03:03 -08:00

linalg.cpp

Add complex eigh (#2191 )

2025-05-18 00:18:43 -07:00

linalg.h

non-symmetric eig and eigh (#2188 )

2025-05-15 13:01:44 -07:00

memory.h

move memory APIs into top level mlx.core (#1982 )

2025-03-21 07:25:12 -07:00

mlx.h

start cuda circle config (#2256 )

2025-06-10 21:19:47 -07:00

ops.cpp

Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm (#2220 )

2025-06-02 15:58:46 -07:00

ops.h

Fix typos (#2136 )

2025-04-29 07:26:05 -07:00

primitives.cpp

reduce vjp for all and any (#2193 )

2025-05-16 08:38:49 -07:00

primitives.h

fix conv export (#2265 )

2025-06-10 09:34:01 -07:00

random.cpp

Add random normal distribution for complex numbers (#2182 )

2025-05-13 22:43:45 -07:00

random.h

Add random normal distribution for complex numbers (#2182 )

2025-05-13 22:43:45 -07:00

scheduler.cpp

Generalize gpu backend (#2138 )

2025-04-30 09:08:17 -07:00

scheduler.h

Generalize gpu backend (#2138 )

2025-04-30 09:08:17 -07:00

stream.h

Export / import functions to / from a file (#1642 )

2024-12-24 11:19:13 -08:00

threadpool.h

Ring distributed backend (#1784 )

2025-01-27 22:15:01 -08:00

transforms_impl.h

Remove static initializers (#2059 )

2025-04-24 06:14:49 -07:00

transforms.cpp

Perf regression fix (#2243 )

2025-06-03 17:55:12 -07:00

transforms.h

Export / import functions to / from a file (#1642 )

2024-12-24 11:19:13 -08:00

utils.cpp

fix pinv (#2110 )

2025-04-23 13:08:28 -07:00

utils.h

fix pinv (#2110 )

2025-04-23 13:08:28 -07:00

version.cpp

Do not define MLX_VERSION globally (#1966 )

2025-03-18 07:12:40 -07:00

version.h

Perf regression fix (#2243 )

2025-06-03 17:55:12 -07:00