zhangyiss/mlx - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Author	SHA1	Message	Date
Awni Hannun	570f2bf29e	pick up previously set attributes (#905)	2024-03-26 11:19:59 -07:00
Angelos Katharopoulos	9948eddf11	Fix nan and improve speed for qvm (#903 )	2024-03-26 10:41:45 -07:00
Luca Arnaboldi	a3ee03da01	Fixing random.normal for half-precision dtype #642 (#904 ) * Fixing random.normal for half-precision dtype #642 * Update python/tests/test_random.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-03-26 09:58:27 -07:00
Cheng	28fcd2b519	Add missing && when forwarding args (#894 ) Without the && args would be copied and perfect forwarding won't work. Also add template utils to make sure the function only forwards array and not vector<array>.	2024-03-25 14:55:54 -07:00
Jack Mousseau	8e686764ac	Ensure shape dimensions are within supported integer range (#566 ) (#704 ) * Ensure shape dimensions are within supported integer range (#566) * fix build * fix rebase bug --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-25 13:29:45 -07:00
Daniel Strobusch	479051ce1c	add numeric type hierarchy and issubdtype as well as a set_dtype meth… (#427 ) * add numeric type hierarchy and issubdtype as well as a set_dtype method to nn.Module with predicate numeric type hierarchy and issubtype is compatible to the [numpy hierarchy](`220f0ab2c5/numpy/_core/numerictypes.py (L42)`). Closes #285. * nits in docs * unify type category checking * nits in docs * nits in docs * more docs nits * fix callable type --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-25 12:32:59 -07:00
Awni Hannun	bfb5bad4f0	patch (#893 ) v0.8.1	2024-03-24 21:03:59 -07:00
Awni Hannun	1e16331d9c	post nanobind docs fixes and some updates (#889 ) * post nanobind docs fixes and some updates * one more doc nit * fix for stubs and latex	2024-03-24 15:03:27 -07:00
Awni Hannun	be98f4ab6b	Reduce a little overhead (#871 ) * some small overhead improvements * use result_type in rms_norm * remove release force * fix + use non-vector version * revert compile change * fix ops * a little more overhead * a little more cleanup and overhead	2024-03-22 17:29:36 -07:00
Angelos Katharopoulos	6ee1112f30	Fix copy donation and add partial rope (#881 )	2024-03-22 17:28:26 -07:00
Jagrit Digani	8e5a5a1ccd	Set item bug fix (#879 ) * set item shaping bug fix * Add extra tests	2024-03-22 12:11:17 -07:00
Angelos Katharopoulos	fcda3a0e66	Increase test tolerance for fast.layer_norm (#880 )	2024-03-22 12:10:27 -07:00
Cheng	9663c22fe9	Do not store iostream in shared_ptr (#872 ) There is no need to store iostream in shared_ptr, doing so adds the cost of a heap allocation.	2024-03-22 06:54:45 -07:00
Cheng	f0ae00da12	Reduce implicit copies in make_array (#874 ) 1. Move shapes into outputs instead of copying them. 2. Pass primitive by const ref as it is always copied into outputs, which removes a copy when calling make_array.	2024-03-22 06:29:16 -07:00
Awni Hannun	44390bd3d0	Bump (#869 ) * bump * fix none in a few ops v0.8.0	2024-03-21 13:56:56 -07:00
Angelos Katharopoulos	2225374060	Adds mx.fast.layer_norm (#870 )	2024-03-21 13:55:51 -07:00
nicolov	105d236889	Add vmap for SVD and inverse (#849 )	2024-03-21 13:18:27 -07:00
Angelos Katharopoulos	53e6a9367c	Use reshape and transpose for non-overlapping pooling windows (#867 )	2024-03-21 10:21:03 -07:00
Chime Ogbuji	f5a1582fe8	Add minimum for cosine decay function (#859 ) * Add minimum for cosine decay function * Update python/mlx/optimizers/schedulers.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-03-21 07:33:29 -07:00
Awni Hannun	a54f06b16f	Fast RMS Norm (#862 ) * fast rmsnorm * no rms gpu * kernel * fix shared mem * looped rms and donation in softmax * Make the squaring in float32 to avoid underflow * Fix the default StreamOrDevice for rope and rms_norm in fast * nits --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-03-21 07:20:54 -07:00
Cheng	4650d94d98	Add missing && in eval (#864 ) Without the && args would be copied and perfect forwarding won't work. To avoid eval calling itself recursively, the vector version of eval is changed to take by value instead, which will save a copy of array when a rvalue is passed.	2024-03-21 06:15:48 -07:00
Jagrit Digani	a5681ebc52	Update set item (#861 ) * Update mlx_set_item to handle regular slices without expanding * Refactor ellipsis handling * Route mlx_set_item to slice_update where possible * Update mlx_scatter_args_slice * Don't route to gather if no array indices	2024-03-21 02:48:13 -07:00
Cheng	e849b3424a	Do not use static constexpr in header (#863 ) Doing so results in each compilation unit (.cpp file) having its own copy of the variable, while inline constexpr makes sure there is only one copy.	2024-03-20 21:28:05 -07:00
Jagrit Digani	b219d12a6b	Check edge case handling in row reduce med kernel (#858 )	2024-03-20 11:37:58 -07:00
Jagrit Digani	cec8661113	Add a SliceUpdate op and primitive (#850 ) * Enable copy to work with int64 strides * Fix uniform buffer indices or copy kernel arguments * Update utils.h * Remove manual unrolling of elem to loc loop * GPU copy updated to handle negative strides * Add slice update primitive	2024-03-20 10:39:25 -07:00
Cheng	73a8c090e0	Pass shape and inputs by value in array's constructor (#853) Since the shape and inputs are always saved as a copy in ArrayDesc, we can unify array's constructors to just take the arguments by value. There are 2 cases: 1. When shape is an lvalue, it will be copied into array's constructor and then moved into ArrayDesc's member. So only 1 copy happens. 2. When shape is an rvalue, it will be moved into array's constructor and then moved into ArrayDesc's member. So no copy happens. So having 1 constructor that takes by value is equivalent to having 2 constructors that take a const reference and an rvalue separately.	2024-03-20 07:54:30 -07:00
Md. Rasel Mandol	db6796ac61	simple typo `fille` (#848 )	2024-03-19 06:15:17 -07:00
Awni Hannun	9a8ee00246	Switch to nanobind (#839 ) * mostly builds * most tests pass * fix circle build * add back buffer protocol * includes * fix for py38 * limit to cpu device * include * fix stubs * move signatures for docs * stubgen + docs fix * doc for compiled function, comments	2024-03-18 20:12:25 -07:00
Cheng	d39ed54f8e	Some C++ code are not needed (#841 ) 1. Anonymous namespace means internal linkage, static keyword is not needed. 2. The default constructor of std::shared_ptr initializes the pointer to nullptr, you don't need to explicitly set it.	2024-03-18 17:04:10 -07:00
Awni Hannun	16546c70d8	No reshape rope (#838 ) * no reshape rope * no reshape rope	2024-03-18 17:03:07 -07:00
nicolov	eaba55c9bf	Add matrix inversion primitive (#822 )	2024-03-15 06:34:36 -07:00
Awni Hannun	19ec023256	vmap matmul and admm (#836 )	2024-03-14 14:38:22 -07:00
Awni Hannun	63ab0ab580	version (#835 ) v0.7.0	2024-03-14 12:20:40 -07:00
Jagrit Digani	8dfc376c00	Strided reduce specialization for small reductions (#826 ) * Add small column / general reduction specialization	2024-03-14 09:16:53 -07:00
Angelos Katharopoulos	1efee9db09	Add types and order in kernel name (#831 )	2024-03-13 20:34:06 -07:00
Awni Hannun	43abc402d8	route to fallback (#828 )	2024-03-13 19:56:04 -07:00
Angelos Katharopoulos	3f8b1668c4	Make reshape faster for row_contiguous cases (#829 )	2024-03-13 16:22:03 -07:00
Angelos Katharopoulos	76c919b4ec	NumberOfElements for shapeless compile and vmap fixes (#802 )	2024-03-13 10:34:14 -07:00
Angelos Katharopoulos	29d0c10ee5	Reshape improvement (#818 )	2024-03-12 17:54:31 -07:00
Jagrit Digani	5ad133f8bb	No copy gems (#801 ) * Enable collapsing batch dims in gemm * Update gemm to only make copies when neither of the last 2 axes are contiguous * Update addmm to support gemv shapes * Update addmm to support irregular batch strides * Update tests	2024-03-12 13:13:41 -07:00
nicolov	d0c544a868	Add SVD primitive (#809) Add SVD op using Accelerate's LAPACK following https://developer.apple.com/documentation/accelerate/compressing_an_image_using_linear_algebra Co-authored-by: Nicolo Valigi <nvaligi@apple.com>	2024-03-12 12:30:11 -07:00
Daniel Falbel	ffb19df3c0	Fix docstring for correctly rendering (#820 )	2024-03-12 11:46:44 -07:00
Awni Hannun	8b7532b9ab	fix scatter (#821 )	2024-03-12 11:42:07 -07:00
Awni Hannun	366478c560	fix modules with dict (#819 )	2024-03-12 08:54:06 -07:00
Justin Deschenaux	8e5600022a	Implement RNN, GRU, LSTM (#268 ) * RNN base implementation * Address comments+format * nits in docs * add tests for prb * fix test * add a couple tests --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-03-11 21:14:44 -07:00
Awni Hannun	0e95b64942	Fix bug in tape order during simplify (#816 ) * fix bug in tape order during simplify * properly fix compile * last bug	2024-03-11 17:29:05 -07:00
nicolov	0ae22b915b	Remove code duplication in reduce ops (#793 ) * Remove code duplication in reduce ops * Remove the unnecessary lambda --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-03-11 10:57:07 -07:00
Awni Hannun	7c441600fe	Compile stride bug (#812 ) * fix compile stride bug * revert sdpa fix * fix cpu * fix bug with simplifying outputs	2024-03-11 06:31:31 -07:00
Awni Hannun	a4d290adb9	Remove depth traversal (#813 ) * no depth traversal * counter outside loop	2024-03-09 20:21:32 -08:00
Awni Hannun	28301807c2	Version bump and os error (#807 ) v0.6.0	2024-03-07 13:57:58 -08:00
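
The reasoning in commit 73a8c090e0 (#853) about by-value constructor parameters can be checked with a small, self-contained sketch. This is not MLX code: `CountedShape`, `Desc`, `g_copies`, and the two helper functions are hypothetical names introduced only for illustration, with `Desc` standing in for a type that, like `ArrayDesc`, always keeps its own copy of the shape.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical global counter of copy-constructions (illustration only).
int g_copies = 0;

// Hypothetical stand-in for a shape vector that records when it is copied.
struct CountedShape {
  std::vector<int> dims;

  explicit CountedShape(std::vector<int> d) : dims(std::move(d)) {}
  CountedShape(const CountedShape& other) : dims(other.dims) { ++g_copies; }
  CountedShape(CountedShape&& other) noexcept : dims(std::move(other.dims)) {}
};

// Hypothetical analogue of a descriptor that always stores the shape.
struct Desc {
  CountedShape shape;

  // One by-value constructor: the call site decides whether the parameter
  // is copy- or move-constructed, and the parameter is then moved into
  // the member, so no further copy is made here.
  explicit Desc(CountedShape s) : shape(std::move(s)) {}
};

// Case 1 from the commit message: lvalue argument.
int copies_from_lvalue() {
  g_copies = 0;
  CountedShape s(std::vector<int>{2, 3});
  Desc d(s);  // s is copied into the parameter, then moved into d.shape
  return g_copies;
}

// Case 2 from the commit message: rvalue argument.
int copies_from_rvalue() {
  g_copies = 0;
  CountedShape s(std::vector<int>{2, 3});
  Desc d(std::move(s));  // s is moved into the parameter, then moved again
  return g_copies;
}
```

Calling `copies_from_lvalue()` and `copies_from_rvalue()` from a `main` function reports one copy and zero copies respectively, which matches the two cases listed in the commit message. The trade-off of this pattern is one extra move compared with a dedicated pair of overloads, which is cheap for vector-like types.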

972 Commits