zhangyiss/mlx - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-08-09 18:56:39 +08:00

Author	SHA1	Message	Date
Awni Hannun	e84ba8056d	only allow openmpi (#1209 )	2024-06-13 12:14:44 -07:00
Awni Hannun	df964132fb	fix scatter + test (#1202 ) * fix scatter + test * fix test warnings * fix metal validation	2024-06-11 14:35:12 -07:00
Alex Barron	27d70c7d9d	Feature complete Metal FFT (#1102 ) * feature complete metal fft * fix contiguity bug * jit fft * simplify rader/bluestein constant computation * remove kernel/utils.h dep * remove bf16.h dep * format --------- Co-authored-by: Alex Barron <abarron22@apple.com>	2024-06-06 12:57:25 -07:00
Awni Hannun	ea9090bbc4	Add view op (#1179 ) * add view primitive * nit * fix view	2024-06-04 08:05:27 -07:00
Rifur13	9401507336	Add groups to 2-D convolutions (#1129 ) * Added groups to 2-D convolutions. Only implemented for some specializations. Also fixed 1D grouped convs with different kernel strides and added more tests. * fix channels condition	2024-05-22 20:01:44 -07:00
Abe Leininger	79ef49b2c2	add mx.trace (#1143 ) (#1147 ) * working c++ trace implementation * updated throw + added overloads * added python binding for trace function * pre-commit reformatting * add trace to docs * resolve comments * remove to_stream call	2024-05-22 15:50:27 -07:00
Luca Arnaboldi	b3ec792380	Implemented Cholesky on CPU (#1119 )	2024-05-17 12:31:59 -07:00
Rifur13	c4a471c99d	Add groups to Conv1d (#948 ) * Add conv1d grouped convs on CPU * Add GPU support * Parallelize inside metal kernel * cleanup * Update mlx/ops.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * New unfold kernel + remove unused code * Remove copy and refactor * Update vjp and reuse steel gemm * Fixed groups on cpu * Fix metal validation --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-04-27 06:24:57 -07:00
Awni Hannun	86f495985b	Add bitwise ops (#1037 ) * bitwise ops * fix tests	2024-04-26 22:03:42 -07:00
Awni Hannun	771575d27b	Expose function to clear memory cache (#1032 ) * expose function to clear memory cache * fix linux build * fix metal tests	2024-04-24 16:48:51 -07:00
Aneesh Shetty	d0dbfe0b97	Adds radians and degrees (#1011 )	2024-04-22 11:17:49 -07:00
Awni Hannun	3d405fb3b1	Add synchronize function (#1006 ) * add synchronize function * fix linux * fix linux * fix and fix docs * fix test * try synchronize in stream destroy * synchronize works for both cpu and gpu	2024-04-22 08:25:46 -07:00
Awni Hannun	8a0677d56d	Shared events for synchronization + async eval (#998 ) * more async eval * fix rebase * try correct async eval * fix async * more tests for async eval * use shared events for synchronization * comment + cleanup * with autorelease pool * fix no metal build * fix compile * fix patch * don't eval if asyn evale'd * don't use is_evaled * comments * more multi stream tests * try and cleanup use of is_evaled * use a status flag	2024-04-17 06:16:02 -07:00
Nripesh Niketan	ffff671273	Update pre-commit hooks (#984 )	2024-04-11 07:27:53 -07:00
Luca Arnaboldi	fffe072028	Implementation of mlx.random.multivariate_normal (#502 ) (#877 ) * Implementation of mlx.random.multivariate_normal (#502) * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Updated typo in docstring * Restricted multivariate_normal to float32 * Generic mean and variance shapes * Review edits * Update mlx/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/random.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Test for ndim of mean and cov * nits * smaller size for test * fix broadcasted sampling --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-04-09 13:50:12 -07:00
Abe Leininger	a1a31eed27	Add mx.meshgrid (#961 )	2024-04-09 11:43:08 -07:00
Awni Hannun	42afe27e12	std and expm1 (#973 ) * std and expm1 * actually add expm1 * fix linux * fix vjp * relax tol for linux test * Add it to the compilable primitives --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-04-08 14:26:01 -07:00
Angelos Katharopoulos	c4fd0e5ede	Fixes #918 bug in compile_tests (#919 )	2024-03-27 22:37:37 -07:00
Cheng	90dfa43ff1	Don't use make_unique to create shared_ptr (#902 ) The code compiled because shared_ptr's constructor actually accepts unique_ptr.	2024-03-27 06:13:29 -07:00
Jack Mousseau	8e686764ac	Ensure shape dimensions are within supported integer range (#566 ) (#704 ) * Ensure shape dimensions are within supported integer range (#566) * fix build * fix rebase bug --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-25 13:29:45 -07:00
Daniel Strobusch	479051ce1c	add numeric type hierarchy and issubdtype as well as a set_dtype meth… (#427 ) * add numeric type hierarchy and issubdtype as well as a set_dtype method to nn.Module with predicate numeric type hierarchy and issubtype is compatible to the [numpy hierarchy](`220f0ab2c5/numpy/_core/numerictypes.py (L42)`). Closes #285. * nits in docs * unify type category checking * nits in docs * nits in docs * more docs nits * fix callable type --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-25 12:32:59 -07:00
nicolov	105d236889	Add vmap for SVD and inverse (#849 )	2024-03-21 13:18:27 -07:00
Jagrit Digani	cec8661113	Add a SliceUpdate op and primitive (#850 ) * Enable copy to work with int64 strides * Fix uniform buffer indices or copy kernel arguments * Update utils.h * Remove manual unrolling of elem to loc loop * GPU copy updated to handle negative strides * Add slice update primitive	2024-03-20 10:39:25 -07:00
nicolov	eaba55c9bf	Add matrix inversion primitive (#822 )	2024-03-15 06:34:36 -07:00
Angelos Katharopoulos	29d0c10ee5	Reshape improvement (#818 )	2024-03-12 17:54:31 -07:00
nicolov	d0c544a868	Add SVD primitive (#809 ) Add SVD op using Accelerate's LAPACK following https://developer.apple.com/documentation/accelerate/compressing_an_image_using_linear_algebra Co-authored-by: Nicolo Valigi <nvaligi@apple.com>	2024-03-12 12:30:11 -07:00
Awni Hannun	8b7532b9ab	fix scatter (#821 )	2024-03-12 11:42:07 -07:00
Awni Hannun	7c441600fe	Compile stride bug (#812 ) * fix compile stride bug * revert sdpa fix * fix cpu * fix bug with simplifying outputs	2024-03-11 06:31:31 -07:00
Awni Hannun	5121f028d9	nice tensordot for mlx c (#782 )	2024-03-04 09:51:02 -08:00
Awni Hannun	bc06cb9ff6	Pickle + dtype fix for numpy conversion (#763 ) * pickle + dtype fix for numpy conversion * fix getattribute on Module base * remove unused function * fix tests * add topk to ops * fix doc	2024-03-02 06:09:29 -08:00
Angelos Katharopoulos	8e281c76c3	Fix the top-k op (#768 )	2024-03-01 22:08:43 -08:00
Awni Hannun	d5964a2710	bindings for memory info (#761 ) * bindings for memory info * update api * keep cache low if requested * fix default * nit in ops error	2024-03-01 19:51:58 -08:00
Hinrik Snær Guðmundsson	08226ab491	added atleast args input support (#710 ) added atleast list(array) input support * function overloading implemented * Refactoring * fixed formatting * removed pos_only	2024-02-26 11:17:59 -08:00
Awni Hannun	e6418781ab	Fix logsumexp edge case (#740 ) * fix logsumexp * fix inf constant * also fix power grad * fix ternary dispatch	2024-02-25 08:39:55 -08:00
Rifur13	126c9869c8	Implement the 'where' primitive for conditional selection (#664 )	2024-02-22 15:10:48 -08:00
Vijay Krish	972d9a3aea	Up to 10x faster scatter. (#709 ) * Faster scatter. Add specialization for 1-d index tensors. * Address review comments. - Check for row contiguity of index, update tensors instead of checking strides. - Add support for 1d specialization with col contiguous update tensor, along with a test. * Nit1 Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Nit2 Co-authored-by: Awni Hannun <awni.hannun@gmail.com> --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-02-21 11:09:30 -08:00
Awni Hannun	5798256fcf	Shapeless compilation for some graphs (#687 ) * shapeless compilation for some graphs * update compile benchmark * default compile a few activations * buffer donation * bugfix * shapeless fix * update tests to work for cpu and gpu fusion * test kwargs * add kwargs to compile * Recompile when python arguments change * no compile for tanh * some constant tests --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-19 21:43:54 -08:00
Hinrik Snær Guðmundsson	f883fcede0	Added support for atleast_1d, atleast_2d, atleast_3d (#694 )	2024-02-19 09:40:52 -08:00
Diogo	e1bdf6a8d9	discover doctests in cmake (#703 )	2024-02-19 07:03:56 -08:00
Awni Hannun	1a4f4c5ea6	Refactor CPU compile preamble (#708 ) * refactor cpu preamble * fix include order * fix some issues * fixes for linux * try to fix includes * add back warning suppression * more linux fixes	2024-02-19 06:12:53 -08:00
Awni Hannun	dc937b8ed3	CPU compile (#691 ) * build and load shared object for cpu compile * nits * cpu compile tests pass * cpu compile tests pass * fix preamble for g++ * donation * fix gpu buffer donation * reuse prebuilt libraries * faster contiguity conditions * fix test * rid compiler warning * fast erf * Fix float16 for compile and add more types to cpu compile * Remove a forgotten comment * use cached libs * nits --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-17 06:54:32 -08:00
Awni Hannun	1eb04aa23f	Fix empty array construction in cpp (#684 )	2024-02-13 23:34:17 -08:00
Diogo	b57bd0488d	Metadata support for safetensors (#639 ) * metadata support for safetensors * aliases making it a little more readable * addressing comments * python binding tests	2024-02-08 19:33:15 -08:00
Angelos Katharopoulos	28eac18571	Kernel generation (#614 ) Generate reusable element-wise kernels given a computation graph.	2024-02-07 13:15:59 -08:00
Awni Hannun	146bd69470	Skip compile when transforming (#635 ) * skip compile when transforming * simplify message	2024-02-05 21:28:37 -08:00
Awni Hannun	d75ae52ecd	Compile primitive (#571 ) * Compiled primitive with basic binary, unary graph-level fusion	2024-02-05 06:51:22 -08:00
Angelos Katharopoulos	0de5988f92	Custom VJP and checkpointing (#541 ) * Implement custom_vjp and checkpointing * Add a dependency management primitive * Change the eval order to deep branches first * Add graph depth tracking to the array	2024-01-30 16:04:45 -08:00
Jacket	3f7aba8498	Implement diagonal operator (#562 ) * Implement diagonal operator This implements mx.diagonal in operator level, inspired by @ManishAradwad. * added `mx.diag` with tests * corrected few things * nits in bindings * updates to diag --------- Co-authored-by: ManishAradwad <manisharadwad@gmail.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-01-30 09:45:48 -08:00
Awni Hannun	3c2f192345	Propagate nans in binary ops (#579 ) * propagate nans in binary ops * handle empty matmul * cpu minimum/maximum propagate nan * benchmark maximum * add min as well * throw on negative indices with full * verbose on linux * fix matmul for zero K	2024-01-29 11:19:38 -08:00
Awni Hannun	8fa6b322b9	Compile front-end (#476 ) * fix tests for linux * make a move on compile * basic compile scaffold works * compile binding * clean * fix * fix grad, more tests * basic python tests * fix segfault on python exit * compile works with python closures * fix test * fix python globals bug, and erase * simplify * more cpp tests * bug fix with move function and compile at exit * simplify inputs also * enable and disable compiler * remove simplify * simplify tests use compile now * fix multi-output with compile * clear output tree from cache when function goes out of scope * ../python/src/transforms.cpp * remove closure capture * comments	2024-01-26 13:45:30 -08:00

98 Commits