zhangyiss/mlx - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-08-12 20:26:40 +08:00

Author	SHA1	Message	Date
Awni Hannun	c098245664	docs update	2025-06-04 01:01:49 +00:00
Awni Hannun	981bf7ae2b	docs update	2025-06-04 01:01:49 +00:00
Awni Hannun	ce95a29690	docs update	2025-06-04 01:01:49 +00:00
Awni Hannun	9e69a72b8c	docs update	2025-06-04 01:01:48 +00:00
Awni Hannun	17470bf630	remove uneeded files in docs	2025-06-04 01:01:48 +00:00
Awni Hannun	a693e6e1d8	update docs	2025-06-04 01:01:48 +00:00
Awni Hannun	fd34610634	docs update	2025-06-04 01:01:48 +00:00
Awni Hannun	84bebc2161	docs up	2025-06-04 01:01:48 +00:00
Awni Hannun	9882295582	docs up	2025-06-04 01:01:48 +00:00
Awni Hannun	ebd913400a	docs update	2025-06-04 01:01:48 +00:00
Awni Hannun	217cdf3fc9	docs	2025-06-04 01:01:48 +00:00
Awni Hannun	43cd655ba1	docs	2025-06-04 01:01:48 +00:00
Awni Hannun	8c406bcb9b	update docs	2025-06-04 01:01:48 +00:00
Awni Hannun	01489e172d	docs	2025-06-04 01:01:48 +00:00
Awni Hannun	616449e363	docs	2025-06-04 01:01:48 +00:00
Awni Hannun	a66e6d3214	docs	2025-06-04 01:01:48 +00:00
Awni Hannun	d3d0ad9564	docs	2025-06-04 01:01:47 +00:00
Awni Hannun	a60a600c6a	docs	2025-06-04 01:01:47 +00:00
Awni Hannun	e84ebcf0b9	docs	2025-06-04 01:01:47 +00:00
Awni Hannun	372f2ac025	docs	2025-06-04 01:01:47 +00:00
Awni Hannun	80322b562e	docs	2025-06-04 01:01:47 +00:00
Awni Hannun	fbd10a48d4	docs	2025-06-04 01:01:47 +00:00
Angelos Katharopoulos	aede70e81d	Perf regression fix (#2243)	2025-06-03 17:55:12 -07:00
Cheng	85a8beb5e4	Avoid atomic updates across CPU/GPU in CUDA event (#2231)	2025-06-03 16:49:06 -07:00
Cheng	0bb89e9e5f	Share more common code in Compiled (#2240) * Share more common code in Compiled * Remove build_lib_name	2025-06-03 16:48:50 -07:00
Cheng	5685ceb3c7	Avoid invoking allocator::malloc when creating CUDA event (#2232)	2025-06-03 16:48:40 -07:00
Suryash Malviya	0408ba0a76	Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm (#2220) * Implementing Complex Matmul using Karatsuba Algorithm * Implemented Karatsuba's Algorithm for complex matmul and pre-commit them * fix --------- Co-authored-by: Awni Hannun <awni@apple.com>	2025-06-02 15:58:46 -07:00
Awni Hannun	cbad6c3093	version (#2237)	2025-06-02 15:58:33 -07:00
Cheng	1b021f6984	Fast primitives decide when to use the fallback (#2216)	2025-06-02 13:26:37 -07:00
Cheng	95b7551d65	Do not check event.is_signaled() in eval_impl (#2230)	2025-06-02 13:23:34 -07:00
Cheng	db5a7c6192	Add memory cache to CUDA backend (#2221) * Move BufferCache out of allocator * Add memory cache to cuda backend allocator * Simplify BufferCache assuming buf can not be null	2025-05-30 12:12:54 -07:00
Awni Hannun	6ef2f67e7f	5bit quants (#2226) * 5bit quants * 5bit quants	2025-05-30 12:12:10 -07:00
Cheng	f76ee1ffd2	Move some dims utils to common (#2223)	2025-05-29 06:48:30 -07:00
Cheng	54a71f270a	Remove unused defines (#2217)	2025-05-23 06:14:58 -07:00
Awni Hannun	55b4062dd8	copyright in docs (#2214)	2025-05-21 17:13:04 -07:00
Cheng	79071bfba4	Fix out-of-bounds default value in logsumexp/softmax (#2213)	2025-05-21 07:25:16 -07:00
Cheng	7774b87cbd	Remove redundant simd_sum in logsumexp (#2210)	2025-05-21 07:25:03 -07:00
Cheng	35c87741cf	Build for compute capability 70 instead of 75 (#2209)	2025-05-20 19:42:48 -07:00
Jack Wind	4cbe605214	Feat: Allow per-target Metal debug flags (#2201) * feat: allow per-target Metal debug flags * formatting fix	2025-05-20 10:22:26 -07:00
Clement Liaw	ab8883dd55	include mlx::core::version() symbols in the mlx static library (#2207)	2025-05-20 07:39:11 -07:00
Awni Hannun	eebe73001a	fix large arg reduce (#2206)	2025-05-19 13:10:44 -07:00
Angelos Katharopoulos	0359bf02c9	Nearest upsample (#2202)	2025-05-19 11:23:38 -07:00
Cheng	237f9e58a8	Fix BEFORE keyword in target_include_directories (#2204)	2025-05-19 06:10:44 -07:00
Awni Hannun	8576e6fe36	fix conv2d bug + faster conv 1d (#2195) * fix conv2d bug + faster conv 1d * revert sort + flaky test	2025-05-18 06:05:11 -07:00
Angelos Katharopoulos	0654543dcc	Add complex eigh (#2191)	2025-05-18 00:18:43 -07:00
Awni Hannun	48ef3e74e2	reduce vjp for all and any (#2193)	2025-05-16 08:38:49 -07:00
Cheng	7d4b378952	Include cuda_bf16.h for bfloat16 overloads (#2192) * Include cuda_bf16.h for bfloat16 overloads * Add NO_GPU_MULTI(Eig) in cuda backend	2025-05-16 06:44:42 -07:00
Jack Wind	7ff5c41e06	Add set_threadgroup_memory_length to CommandEncoder (#2183)	2025-05-16 00:28:03 -07:00
Awni Hannun	602f43e3d1	fix conv grad (#2187)	2025-05-15 19:20:36 -07:00
Awni Hannun	a2cadb8218	real and imag properties (#2189)	2025-05-15 18:17:50 -07:00

1193 Commits