Commit Graph

  • 83a0340fa7 allow command (#1836) Awni Hannun 2025-02-06 10:32:24 -08:00
  • a62fc1b39f chore: pre-commit bump (#1837) Nripesh Niketan 2025-02-06 16:55:01 +00:00
  • af1b725fda Fix a couple of slicing bugs (#1827) Awni Hannun 2025-02-05 19:50:08 -08:00
  • 9174606d4c fix sort (#1835) Awni Hannun 2025-02-05 17:16:27 -08:00
  • ca305afdbe loading empty list is ok when strict = false (#1834) Awni Hannun 2025-02-05 16:19:27 -08:00
  • fe5987b81d faster sort (#1831) Awni Hannun 2025-02-05 06:10:22 -08:00
  • a229c8cef0 don't duplicate malloc with custom kernel init (#1830) Awni Hannun 2025-02-04 13:20:57 -08:00
  • f6c0499b8d Resolved ambiguity in mlx::core::take_along_axis (#1822) Jesper Stemann Andersen 2025-02-04 15:06:17 +01:00
  • 1156c84e86 Refactor common into cpu specific and truly common (#1817) Awni Hannun 2025-02-03 15:58:02 -08:00
  • ec7c7def40 no line buffer for mpi jobs (#1825) Awni Hannun 2025-02-03 12:02:15 -08:00
  • 2d8e667400 MinGW support (#1806) Jesper Stemann Andersen 2025-02-01 21:40:06 +01:00
  • 80c863b972 Remove accelerate/ (#1816) Awni Hannun 2025-02-01 07:18:26 -08:00
  • f5cc1eea72 Allow different value dimensions in sdpa_vector (#1811) Angelos Katharopoulos 2025-01-31 20:58:59 -08:00
  • b7c9f1d38f scatter axis + gather axis primitives (#1813) Awni Hannun 2025-01-31 20:48:08 -08:00
  • c6fc07f1f4 Unify CPU matmuls, remove unused accelerate conv (#1814) Awni Hannun 2025-01-31 14:43:37 -08:00
  • ded914f442 Small distributed launch helper (#1810) Angelos Katharopoulos 2025-01-29 17:55:04 -08:00
  • 4758c8baa1 Start to cleanup/unify accelerate and common back-ends (Part 1/N) (#1777) Awni Hannun 2025-01-29 14:34:49 -08:00
  • 7064fed1b1 Minor update on MPI docs (#1805) Awni Hannun 2025-01-28 11:00:08 -08:00
  • 1017ac4a9e add dilation for conv 3d layers + test for 3d conv w/ dilation (#1802) Awni Hannun 2025-01-28 06:17:07 -08:00
  • ccb61d7aae Ring distributed backend (#1784) Angelos Katharopoulos 2025-01-27 22:15:01 -08:00
  • 2235dee906 catch stream errors earlier to avoid aborts (#1801) Awni Hannun 2025-01-27 14:05:43 -08:00
  • 28091aa1ff allow build python lib without specifying path (#1799) Awni Hannun 2025-01-27 11:22:35 -08:00
  • 121d9a0702 Fix rope fallback to not upcast (#1797) Awni Hannun 2025-01-26 19:07:21 -08:00
  • 0cea88bcc5 Use @ matrix multiplication syntax to document matrix-matrix multiplication (#1793) Nick 2025-01-25 16:02:36 -08:00
  • 72146fc4cd Einsum ellipsis (#1788) Angelos Katharopoulos 2025-01-25 01:28:03 -08:00
  • e6a7ab9675 non square qr (#1783) Awni Hannun 2025-01-21 14:07:47 -08:00
  • 1f4c127fb9 Move some kernels to get_template_definition (#1782) Angelos Katharopoulos 2025-01-21 08:59:44 -08:00
  • 4515866024 Change the linux test to ubuntu 24.04 cpp20 Angelos Katharopoulos 2025-01-20 20:55:10 -08:00
  • 90532b1f37 recompile when shapeless is different (#1776) Awni Hannun 2025-01-20 21:07:10 -08:00
  • a8666a757a fix shapeless compile on ubuntu24 (#1775) Awni Hannun 2025-01-18 06:04:36 -08:00
  • a4667da1eb Faster synchronization Fence primitive (#1773) Awni Hannun 2025-01-17 18:42:19 -08:00
  • 6fe2b82926 Add gcc 13 to the linux build Angelos Katharopoulos 2025-01-17 18:00:24 -08:00
  • c75b5e9d19 Do not build wheels every time Angelos Katharopoulos 2025-01-17 17:42:11 -08:00
  • 6f12eda549 Test for 13.5 as well Angelos Katharopoulos 2025-01-17 17:39:20 -08:00
  • a541fe9312 Set deployment target 13.5 Angelos Katharopoulos 2025-01-17 14:22:48 -08:00
  • 2bdd20f257 Test XCode 16 Angelos Katharopoulos 2025-01-17 14:14:45 -08:00
  • aa7b9688ce Move some kernels to get_template_definition Angelos Katharopoulos 2025-01-17 13:05:22 -08:00
  • 0a41393dba Replace fmt::format with std::format Angelos Katharopoulos 2025-01-17 11:24:16 -08:00
  • 0c259961ac matmul jvps (#1772) Awni Hannun 2025-01-17 10:36:26 -08:00
  • e300a01f4a Testing C++ 20 Angelos Katharopoulos 2025-01-16 18:20:50 -08:00
  • f288db8d34 Fix synchronization bug for in stream async works (#1768) Awni Hannun 2025-01-15 06:07:34 -08:00
  • 33421c1dd3 Limit grad recursion depth by not recursing through non-grad inputs (#1764) Awni Hannun 2025-01-14 14:33:18 -08:00
  • 5cc5201914 feat: Add orthogonal initializer and corresponding tests (#1651) Nripesh Niketan 2025-01-13 15:29:20 +00:00
  • 252e423e81 fix and cleanup event signal/wait for metal (#1765) Awni Hannun 2025-01-10 18:37:26 -08:00
  • a4a2764a52 Fix broadcast_arrays python sig (#1763) wrmsr 2025-01-10 15:33:26 -05:00
  • ab8e832c18 0ul is not size_t on MSVC (#1762) Cheng 2025-01-11 05:33:11 +09:00
  • 1ce0c0fcb0 Bump version (#1761) (tag: v0.22.0) Angelos Katharopoulos 2025-01-09 13:48:20 -08:00
  • 657f466402 use sdpa and exportable functions in transformer multi head attention (#1760) Awni Hannun 2025-01-09 13:11:55 -08:00
  • c7b0300af5 Fix batched qmv bug (#1758) Alex Barron 2025-01-09 11:45:57 -08:00
  • da8c885784 Simplify removes no-ops from the tape (#1759) Awni Hannun 2025-01-09 11:23:19 -08:00
  • 1ccaf80575 Dynamic broadcasting for shapeless compile/export (#1722) Awni Hannun 2025-01-09 11:04:24 -08:00
  • ec36bfa317 Include command stdout in error message (#1756) Cheng 2025-01-09 00:17:03 +09:00
  • b8f76f717a Print exceptions in eval_cpu/eval_gpu and abort (#1754) Cheng 2025-01-08 23:31:09 +09:00
  • d1766f2c70 Add boolean mask support in vector SDPA (#1757) Awni Hannun 2025-01-07 20:24:53 -08:00
  • 516ded618b Dynamic slicing (#1741) Awni Hannun 2025-01-07 14:02:16 -08:00
  • c9c81d0584 Added additional missing unordered_map include that fixes build on FreeBSD (#1755) Jesper Stemann Andersen 2025-01-07 17:27:55 +01:00
  • 545f84d905 Refactor distributed backend (#1752) Angelos Katharopoulos 2025-01-06 17:33:15 -08:00
  • d5ec172c95 Allow boolean mask in sdpa (#1753) Awni Hannun 2025-01-06 16:57:07 -08:00
  • 25b3a3e541 Optionally specify names for arrays when exporting (#1749) Angelos Katharopoulos 2025-01-06 13:07:46 -08:00
  • 058d6ce683 mpi send use input as output (#1750) Awni Hannun 2025-01-06 06:08:43 -08:00
  • eab93985b8 Update custom function docs (#1748) Angelos Katharopoulos 2025-01-03 16:35:25 -08:00
  • b51d70a83c export docs (#1747) Awni Hannun 2025-01-03 15:04:17 -08:00
  • 259025100e Fix nd ternary on GPU (#1746) Awni Hannun 2025-01-03 11:52:17 -08:00
  • c9d30aa6ac MLX in C++ example (#1736) Awni Hannun 2025-01-02 19:09:04 -08:00
  • 8544b42007 Add namespace (#1745) Angelos Katharopoulos 2025-01-02 16:49:23 -08:00
  • 6fa0501387 Fix concatenate/slice_update vjp + reduce binary size (#1735) Awni Hannun 2025-01-02 16:36:33 -08:00
  • ae69cb15e9 shapeless compile in docs and partially shapeless reshape (#1742) Awni Hannun 2025-01-02 16:24:42 -08:00
  • a64a8dfe45 fix extension (#1740) Awni Hannun 2025-01-02 16:16:16 -08:00
  • 491fa95b1f Added Kronecker Product (#1728) Venkata Naga Aditya Datta Chivukula 2025-01-02 17:00:34 -07:00
  • 92ec632ad5 Fix Distributed Communication documentation (#1731) Danilo Peixoto 2025-01-02 19:08:38 -03:00
  • 8ecdfb718b Fix export.cpp compilation with MSVC (#1737) Cheng 2024-12-29 23:56:30 +09:00
  • 4ba0c24a8f Export / import functions to / from a file (#1642) Awni Hannun 2024-12-24 11:19:13 -08:00
  • 935c8c4bb1 Make mx.compile work on Windows (#1697) Cheng 2024-12-25 00:02:33 +09:00
  • 88f993da38 Explicit parentheses around some logical operators (#1732) Valentin Roussellet 2024-12-24 07:02:20 -08:00
  • ebfe64b92d shapeless slice update and broadcast when possible (#1727) Awni Hannun 2024-12-23 11:25:15 -08:00
  • 0308e9af71 Allow offset to be an mx.array for mx.fast.rope (#1724) Awni Hannun 2024-12-19 15:51:44 -08:00
  • c3628eea49 Add mx.finfo and use it when making causal mask (#1726) Awni Hannun 2024-12-19 14:52:41 -08:00
  • e03f0372b1 More shape type (#1705) Awni Hannun 2024-12-19 08:08:20 -08:00
  • f17536af9c More lenient mask type check in SDPA (#1723) Alex Barron 2024-12-18 19:41:38 -08:00
  • ed4ec81bca Link python extension with mlx statically on Windows (#1716) Cheng 2024-12-19 12:26:04 +09:00
  • 7480059306 track resource limit and throw if exceeded (#1718) Awni Hannun 2024-12-18 18:45:58 -08:00
  • 8bae22b0fa fix deletion of non-evaled arrays with siblings (#1714) Awni Hannun 2024-12-18 18:45:36 -08:00
  • 49c34c4161 check mask type (#1721) Alex Barron 2024-12-18 14:25:18 -08:00
  • 5548fcc96d fix synch race (#1719) Awni Hannun 2024-12-18 12:25:16 -08:00
  • 070bd433ab Shorter kernel name for Windows (#1701) Cheng 2024-12-18 11:51:38 +09:00
  • c8fb54951a Define NOMINMAX before windows.h (#1715) Cheng 2024-12-18 11:51:24 +09:00
  • f110357aaa Bump nanobind to 2.4 + fix (#1710) Awni Hannun 2024-12-17 10:57:54 -08:00
  • a6b426422e add cubic to type hinting for upsample (#1709) Tomohiro Oga 2024-12-17 15:30:23 +00:00
  • d03c01dfbc fix unflatten vjp (#1708) Awni Hannun 2024-12-16 18:37:57 -08:00
  • a82996e9fb io/load: Enabled pread implementation for mingw32 (#1706) Jesper Stemann Andersen 2024-12-16 16:20:45 +01:00
  • af5a614aad Eval before cleanup so model file is unlocked (#1702) Cheng 2024-12-15 14:41:49 +09:00
  • f9640e049d Install mlx.dll into the same dir with python bindings on Windows (#1690) Cheng 2024-12-14 12:50:39 +09:00
  • 4768c61b57 Make sure gguf_ctx is closed when error happens (#1699) Cheng 2024-12-14 12:50:19 +09:00
  • dfccd17ab9 Use psutil to get memory info on Windows (#1700) Cheng 2024-12-14 12:50:13 +09:00
  • 635117c5d4 Read/write files in binary mode (#1698) Cheng 2024-12-14 10:37:05 +09:00
  • 50f3535693 Use expand_dims / unflatten / etc in more places (#1696) Awni Hannun 2024-12-12 17:00:44 -08:00
  • 9111999af3 Fix small sort with metal validation (#1695) Awni Hannun 2024-12-12 09:21:45 -08:00
  • 6bd28d246e Allow no copy negative strides in as_strided and slice (#1688) Awni Hannun 2024-12-12 08:59:45 -08:00
  • 4d595a2a39 Make compiled preamble work in MSVC (#1675) Cheng 2024-12-13 01:55:49 +09:00
  • 3a21f61772 Fix build (#1693) Awni Hannun 2024-12-11 23:56:25 -08:00