zhangyiss/mlx - mlx - Gitea for Geophysics

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Author	SHA1	Message	Date
Luca Arnaboldi	cbefd9129e	Implementation of pickle, copy and deepcopy for Python arrays (#300 & #367 ). (#713 ) * Implemented pickling and copy for Python arrays(#300 & #367) * Fixing typos * Pickle with NumPy arrays * Pickle: workaround for bfloat16 * Revert "Pickle: workaround for bfloat16" This reverts commit `25afe6bc09`. * Added an error when pickling bfloat16 * Update python/tests/test_array.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/tests/test_array.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/array.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/array.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * clang-format applied --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-03-06 08:02:41 -08:00
Awni Hannun	cbcf44a4ca	Some fixes in cache / thread safety (#777 ) * some fixes in cache / thread safety * speed up no cache case * fix opt test * optimizer docs * otpimizer docs * fix adafactor * fix adafactor	2024-03-05 13:30:50 -08:00
Awni Hannun	859ae15a54	Fix test (#785 )	2024-03-04 23:02:27 -08:00
Brian Keene	0787724c44	Fast Inference SDPA op (#735 ) * Fast Inference SDPA op Implements metal shaders for: o = mx.fast_inference_sdpa(queries, keys, values, scale, mask) Supports fp16, fp32 dtypes; assumes d_k = 128. Generic op support / prompt encoding supported via mlx primitives. Metal implementation is for the inference use case only. Majority of performance benefits appears to results from GQA & reduced bandwidth requirements; there is approximate performance parity for the MHA use case (from some measurements on M3 Max). * Flush shared memory to zero before unprotected reads for (scores @ values) * Move to fast:: namespace, address reviewer comments ... also attempt to revert formatter auto-change for files not relevant to this change * Shared memory flush to top of kernel * Resolve compiler warnings * Update python/src/fast.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/fast.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/fast.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update python/src/fast.cpp Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * Update docstring per PR feedback * Softmax in higher precision, ... * route to fallback for more use cases - batch size > 1, head_dim other than 128, etc. * Address linux build failure * Address other reviewer comments * Remove extraneous eval_cpu function per review --------- Co-authored-by: Atila Orhon <64497909+atiorh@users.noreply.github.com> Co-authored-by: Awni Hannun <awni.hannun@gmail.com> Co-authored-by: atila <atiorh@icloud.com>	2024-03-04 21:06:11 -08:00
Awni Hannun	5121f028d9	nice tensordot for mlx c (#782 )	2024-03-04 09:51:02 -08:00
Piotr Rybiec	6a665ea6ed	Dilation for convolutional layers (#766 ) * add dilation parameter to Conv1d layer * space here too * add conv1d dilation test * add dilation parameter for Conv2d layer * conv2d dilation test	2024-03-04 06:43:00 -08:00
Awni Hannun	bc06cb9ff6	Pickle + dtype fix for numpy conversion (#763 ) * pickle + dtype fix for numpy conversion * fix getattribute on Module base * remove unused function * fix tests * add topk to ops * fix doc	2024-03-02 06:09:29 -08:00
Angelos Katharopoulos	8e281c76c3	Fix the top-k op (#768 )	2024-03-01 22:08:43 -08:00
Awni Hannun	d5964a2710	bindings for memory info (#761 ) * bindings for memory info * update api * keep cache low if requested * fix default * nit in ops error	2024-03-01 19:51:58 -08:00
Ikko Eltociear Ashimine	cf3eb87e52	Fix typo in transforms.cpp (#764 ) occuring -> occurring	2024-02-29 22:23:46 -08:00
Awni Hannun	4494970f47	avoid nested closures in module (#759 )	2024-02-29 09:39:52 -08:00
Jagrit Digani	776c3d226d	Convolution update (#651 ) * Init steel conv and update Conv primitive * Update slow CPU implementation to support flipping and input dilation winograd conv routing Co-authored-by: Awni Hannun <awni@apple.com>	2024-02-28 20:11:16 -08:00
Awni Hannun	420ff2f331	Add back compiled function signatures and docstrings (#749 ) * try to add back compiled function signatures and docstrings * add indentation to docstring	2024-02-27 13:18:59 -08:00
Noah Kasmanoff	de3d2467a3	Update: Fast GeLU Approximation (#744 ) * add: fast gelu approx * fix docs * Update gelu_fast_approx function documentation * Update python/mlx/nn/layers/activations.py Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * fix: test gelu --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-02-26 21:08:50 -08:00
Awni Hannun	fe1dabf272	Fix compile with non standard types (#745 ) * refactor tree utils * fix compile + tree code refactor * Add an extra test * add a few missing activations to docs * hash structure * Encode the full argument structure --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-26 19:28:53 -08:00
Hinrik Snær Guðmundsson	08226ab491	added atleast args input support (#710 ) added atleast list(array) input support * function overloading implemented * Refactoring * fixed formatting * removed pos_only	2024-02-26 11:17:59 -08:00
Chime Ogbuji	3b661b7394	Add linear warmup and schedule joining for use with existing schedules (#721 ) * Add linear warmup to schedules for use with existing schedules * Changed parameters for simplicity of most common case (0 initial value) * Added ScheduleJoiner and updated documentation * ScheduleJoiner -> join_schedules (ala optax #) * black compliance * Different evaluation of schedules * nits --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-02-26 07:28:48 -08:00
Awni Hannun	e6418781ab	Fix logsumexp edge case (#740 ) * fix logsumexp * fix inf constant * also fix power grad * fix ternary dispatch	2024-02-25 08:39:55 -08:00
Gabrijel Boduljak	22364c40b7	Upsample2d (#414 ) Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com> Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-02-23 09:55:04 -08:00
Noah Farr	d729a1991b	Fix arange with inf step (#686 ) * Fix case for step=inf in arange and add inf check for start/stop * Add test cases for arange * Update ops.cpp to include climits header * Fix arange * Fix formatting * Refactor * Add missing include	2024-02-23 06:18:15 -08:00
Awni Hannun	5798256fcf	Shapeless compilation for some graphs (#687 ) * shapeless compilation for some graphs * update compile benchmark * default compile a few activations * buffer donation * bugfix * shapeless fix * update tests to work for cpu and gpu fusion * test kwargs * add kwargs to compile * Recompile when python arguments change * no compile for tanh * some constant tests --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-19 21:43:54 -08:00
Awni Hannun	d0fda82595	fix tolist for half types (#702 )	2024-02-19 09:44:27 -08:00
Hinrik Snær Guðmundsson	f883fcede0	Added support for atleast_1d, atleast_2d, atleast_3d (#694 )	2024-02-19 09:40:52 -08:00
Srimukh Sripada	818cda16bc	Support LR schedulers (#334 ) * Add a few LR schedulers * Move parents's constructor call to the top * Fix docstring * refactor optimizers into two files * add docs * nit * Fix Callable type annotation for python 3.8 --------- Co-authored-by: Awni Hannun <awni@apple.com> Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-15 11:26:20 -08:00
toji	85143fecdd	improved error msg for invalid axis(`mx.split`) (#685 ) * improved error msg for invalid axis(`mx.split`) * Apply suggestions from code review Co-authored-by: Awni Hannun <awni.hannun@gmail.com> * fixed formatting issue --------- Co-authored-by: Awni Hannun <awni.hannun@gmail.com>	2024-02-15 07:25:38 -08:00
Diogo	35431a4ac8	Adds device context manager (#679 )	2024-02-14 14:14:58 -08:00
Awni Hannun	ccf1645995	Custom primitive + RoPE fat op (#676 ) * extensions start * rope custom op * fix build * docs + rope benchmark * fix test * Add a Metal kernel for RoPE * Fix position of traditional * transform tests * Move rope computation to float and fix tests * Fix the test and a typo * change to fast * fix no metal build --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-14 14:04:25 -08:00
Noah Farr	0c65517e91	Return empty array when repeats is 0 in mx.repeat (#681 ) * Return empty array when repeats is 0 * Add test case for repeats = 0	2024-02-13 17:49:31 -08:00
Gabrijel Boduljak	e54cbb7ba6	Pooling layers (#357 ) Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com> Co-authored-by: Awni Hannun <awni@apple.com>	2024-02-12 22:08:13 -08:00
Angelos Katharopoulos	40c108766b	Quantized matmul fix (#677 ) * Fix qmv for small or unaligned matrices * Fix qmm	2024-02-12 18:54:21 -08:00
Nripesh Niketan	0dbc4c7547	feat: Update pre-commit-config.yaml (#667 )	2024-02-11 06:08:20 -08:00
Awni Hannun	b96be943dc	bug fix (#658 )	2024-02-09 16:50:45 -08:00
Abdussamet Türker	b670485185	Remainder negative numerator bug fixed (#641 ) Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-09 16:49:14 -08:00
Diogo	b57bd0488d	Metadata support for safetensors (#639 ) * metadata support for safetensors * aliases making it alittle more readable * addressing comments * python binding tests	2024-02-08 19:33:15 -08:00
Awni Hannun	5c03efaf29	Compile docs (#653 ) * compile docs * docs nits + comments	2024-02-08 11:21:50 -08:00
LeonEricsson	7dccd42133	updated calls to use loc &scale (#643 )	2024-02-08 09:01:59 -08:00
Awni Hannun	1b97b2958b	Compile with capture (#629 ) * Simple kernel generation * Remove the generate kernel from graph_utils * fix multi-output with compile * fuse with stopgrad * v1 input, output capture in compile * cleanup tree update with visitor update * nit * remove todo * state for model, optional explicit init and more pure optimizer steps * move learning rate to state * add lr to opt state, some fixes in capture * fix optim * update tuple of containers as well * fix stream for compiled output * rng state for compile * nit * updates and comments --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-02-07 17:29:22 -08:00
Awni Hannun	e5e816a5ef	fix sequential with empty modules at end (#647 )	2024-02-07 13:22:27 -08:00
Noah Farr	5fd11c347d	Add loc and scale to random.normal (#638 ) * Add loc and scale to random.normal * Add tests for loc and scale for random.normal * Run pre-commit hooks * Fix code review	2024-02-07 11:49:59 -08:00
Aryan Gupta	ef73393a19	Feat: Add weights argument in BCE Loss and tests (#620 )	2024-02-07 09:39:52 -08:00
Angelos Katharopoulos	ea406d5e33	CI change (#645 ) * CI update * Skip large binary test for now * Upgrade pip * Add proper env variable skipping * Update the CI * Fix workflow name * Set the low memory flag for the tests * Change build process * Add pip upgrade * Use a venv * Add a missing env activate * Add setuptools * Add twine upload back * Re-enable automatic release builds	2024-02-07 06:04:34 -08:00
Awni Hannun	d40a04f8dc	minor fixes (#631 ) * minor fixes * var with ddof >= nelements	2024-02-05 13:27:49 -08:00
Awni Hannun	d75ae52ecd	Compile primitive (#571 ) * Compiled primitive with basic binary, unary graph-level fusion	2024-02-05 06:51:22 -08:00
Awni Hannun	5c3ac52dd7	fix test (#627 )	2024-02-04 16:18:03 -08:00
Avikant Srivastava	11a9fd40f0	fix: handle linspace function when num is 1 (#602 ) * fix: handle linspace function when num is 1 * add comment * fix test case * remove breakpoint	2024-02-04 11:03:49 -08:00
Daniel Strobusch	4fd2fb84a6	make python array SupportsAbs conform (like numpy) (#624 )	2024-02-04 09:31:02 -08:00
Daniel Strobusch	9852af1a19	fix "shape" docstring. (#623 )	2024-02-04 09:21:22 -08:00
AtomicVar	83f63f2184	Add Margin Ranking Loss (#536 )	2024-02-02 10:57:31 -08:00
Awni Hannun	cb6156d35d	Fix eval in trace bugs (#612 ) * Fix eval in trace bugs * comment nit	2024-02-02 09:57:12 -08:00
Awni Hannun	e88e474fd1	Reduce vmap + some fixes (#601 )	2024-02-01 11:30:28 -08:00

279 Commits