Commit Graph

  • 8f3d208dce Close a couple edge case bugs: hadamard and addmm on empty inputs (#2177) Awni Hannun 2025-05-12 10:48:57 -07:00
  • caaa3f1f8c Small typos in mx.metal deprecations (#2176) Ivan Fioravanti 2025-05-11 15:03:47 +02:00
  • 659a51919f patch bump (#2162) v0.25.2 Awni Hannun 2025-05-09 14:35:14 -07:00
  • 6661387066 Fix fft for integer overflow (#2161) Awni Hannun 2025-05-09 14:25:12 -07:00
  • a7fae8a176 fix: conv_general differences between gpu, cpu (#2070) ATurker 2025-05-09 20:26:52 +03:00
  • 83762691ba Fix four step fft fft Angelos Katharopoulos 2025-05-08 14:14:59 -07:00
  • 2a41caa00e Add single kernel bluestein Angelos Katharopoulos 2025-05-08 13:15:20 -07:00
  • 6593281d25 Refactored four-step Angelos Katharopoulos 2025-05-08 00:25:38 -07:00
  • da98e8bce8 Refactored stockham Angelos Katharopoulos 2025-05-06 21:46:21 -07:00
  • be57a16a80 More tmp fft changes Angelos Katharopoulos 2025-04-30 22:29:22 -07:00
  • 1704809f29 Tmp FFT commit Angelos Katharopoulos 2025-04-30 15:12:39 -07:00
  • 0cae0bdac8 CUDA backend: backbone (#2075) Cheng 2025-05-07 13:26:46 +09:00
  • 7c99acb799 split logsumexp split_logsumexp Awni Hannun 2025-05-06 17:10:14 -07:00
  • 5a1a5d5ed1 fix input coherent kernel launch (#2153) Awni Hannun 2025-05-05 17:30:50 -07:00
  • 1683975acf Move common gpu primitives to backend/gpu (#2145) Cheng 2025-05-06 05:45:29 +09:00
  • af705590ac fix batched vector sdpa (#2152) Awni Hannun 2025-05-05 13:13:03 -07:00
  • 825124af8f fix bw for elementwise ops (#2151) Awni Hannun 2025-05-05 06:15:04 -07:00
  • 9c5e7da507 fix compile merging (#2150) Awni Hannun 2025-05-02 15:08:50 -07:00
  • 481349495b GPU Hadamard for large N (#1879) Angelos Katharopoulos 2025-02-18 13:43:09 -08:00
  • 9daa6b003f fix shapeless export (#2148) Awni Hannun 2025-05-01 15:02:02 -07:00
  • a3a632d567 Fix the launcher when ran locally (#2147) Angelos Katharopoulos 2025-05-01 12:56:09 -07:00
  • e496c5a4b4 fix integer overflow in qmm (#2143) Awni Hannun 2025-04-30 09:28:56 -07:00
  • ea890d8710 Remove metal-only tests (#2139) Cheng 2025-05-01 01:08:39 +09:00
  • aa5d84f102 Allow quant layer to be unfrozen (#2142) Awni Hannun 2025-04-30 09:08:29 -07:00
  • f1606486d2 Generalize gpu backend (#2138) Awni Hannun 2025-04-30 09:08:17 -07:00
  • 87720a8908 Fix building with uv (#2141) Cheng 2025-04-30 22:04:07 +09:00
  • bb6565ef14 add fftshift and ifftshift fft helpers (#2135) Aashiq Dheeraj 2025-04-30 01:13:45 -04:00
  • 7bb063bcb3 Enable vjp for quantized scale and bias (#2129) Awni Hannun 2025-04-29 13:03:09 -07:00
  • b36dd472bb return library if it is successfully loaded (#2131) Alex Chi Z. 2025-04-29 10:30:36 -04:00
  • 167b759a38 Fix typos (#2136) hdeng-apple 2025-04-29 22:26:05 +08:00
  • 99b9868859 Clarify dimension notation in conv1d, conv2d, and conv3d docstrings (#2123) charan-003 2025-04-25 13:18:30 -06:00
  • 6b2d5448f2 Fix the error message in mx.right_shift and mx.left_shift (#2121) 1ndig0 2025-04-26 00:14:28 +08:00
  • eaf709b83e patch (#2119) v0.25.1 Awni Hannun 2025-04-24 16:11:07 -07:00
  • f0e70afff0 Fix swift pm load (#2117) Angelos Katharopoulos 2025-04-24 10:58:29 -07:00
  • 86984cad68 Remove static initializers (#2059) hdeng-apple 2025-04-24 21:14:49 +08:00
  • fbc89e3ced fix pinv (#2110) Awni Hannun 2025-04-23 13:08:28 -07:00
  • 38c1e720c2 Search mlx.metallib in macOS framework "Resources" dir (#2061) hdeng-apple 2025-04-24 00:53:13 +08:00
  • 600e87e03c Added output_padding parameters in conv_transpose (#2092) Param Thakkar 2025-04-23 21:56:33 +05:30
  • 3836445241 Add broadcast_shapes in python API (#2091) Hyunsung Lee 2025-04-23 10:57:39 +09:00
  • 1d2c9d6a07 Complex scan (#2094) Yury Popov 2025-04-23 04:56:28 +03:00
  • e8ac6bd2f5 irfft throws instead of segfaults on scalars (#2109) Awni Hannun 2025-04-22 10:25:55 -07:00
  • 11f73d6e89 Double buffer keys for vector sdpa sdpa-test Angelos Katharopoulos 2025-04-22 00:19:11 -07:00
  • fdadc4f22c Add more complex unary ops (#2101) Awni Hannun 2025-04-21 13:04:54 -07:00
  • 79b527f45f conv vmap (#2102) Awni Hannun 2025-04-21 13:04:39 -07:00
  • dc4eada7f0 Use unordered map for kwargs in export/import (#2087) Awni Hannun 2025-04-21 07:17:22 -07:00
  • 70ebc3b598 Return const ref in array::data_shared_ptr (#2100) Cheng 2025-04-21 22:17:09 +08:00
  • b13f2aed16 Introduce macros for dispatching dynamic dtypes as static types (#2073) Cheng 2025-04-19 21:16:30 +08:00
  • 5f04c0f818 Fixed shift operations issue (#2080) Param Thakkar 2025-04-19 02:58:33 +05:30
  • 55935ccae7 fix py gc edge case (#2079) Awni Hannun 2025-04-18 12:46:53 -07:00
  • b529515eb1 minor bump (#2081) v0.25.0 Awni Hannun 2025-04-17 14:57:11 -07:00
  • 3cde719eb7 Route to gather qmm only for many tokens per expert (#2082) Angelos Katharopoulos 2025-04-17 14:53:08 -07:00
  • 5de6d94a90 Gather qmm batched kernel and refactoring of quantized (#2078) Angelos Katharopoulos 2025-04-17 13:53:11 -07:00
  • 4c46e17a5d Update benchmark output steel-refactor Jagrit Digani 2025-04-15 10:50:06 -07:00
  • 99eefd2ec0 Gather mm new kernel and small refactoring (#2040) Angelos Katharopoulos 2025-04-14 16:37:36 -07:00
  • e9e268336b LogCumSumExp (#2069) Yury Popov 2025-04-13 11:27:29 +03:00
  • 7275ac7523 Fix release build (#2072) Awni Hannun 2025-04-12 20:41:58 -07:00
  • c4189a38e4 Add float mask to sdpa vector (#2068) Angelos Katharopoulos 2025-04-11 17:29:40 -07:00
  • 68d1b3256b nit: fix exception handling (#2066) Awni Hannun 2025-04-11 14:12:08 -07:00
  • 9c6953bda7 Fix stubgen (#2065) Awni Hannun 2025-04-11 12:02:54 -07:00
  • ef7ece9851 fix fft bug (#2062) Awni Hannun 2025-04-10 19:41:27 -07:00
  • ddaa4b7dcb Fix the test and add custom min/max reductions for uncommon MPI types (#2060) Angelos Katharopoulos 2025-04-10 17:01:17 -07:00
  • dfae2c6989 Fix MSVC build due to use of M_LN2 (#2058) Cheng 2025-04-10 23:41:41 +09:00
  • 515f104926 Min / max reductions (#2041) Anastasiia Filippova 2025-04-10 08:22:20 +02:00
  • 9ecefd56db Do not load the default lib if another is requested (#2055) Angelos Katharopoulos 2025-04-09 13:31:38 -07:00
  • e5d35aa187 no sdpa in grad (#2054) Awni Hannun 2025-04-08 19:13:54 -07:00
  • 00794c42bc Fix causal mask sdpa vec (#2053) Awni Hannun 2025-04-08 09:11:23 -07:00
  • 08a1bf3f10 Remove Event::Signal() (#2052) Cheng 2025-04-08 22:20:27 +09:00
  • 60c4154346 Only request residency once (#2051) Awni Hannun 2025-04-07 10:47:51 -07:00
  • f2c85308c1 add a half simd gemm fallback (#2046) Awni Hannun 2025-04-07 09:31:29 -07:00
  • 1a28b69ee2 only add to residency set once (#2049) Awni Hannun 2025-04-06 17:38:25 -07:00
  • ba09f01ce8 Remove test of converting negative float to uint (#2048) Cheng 2025-04-06 22:21:46 +09:00
  • 6cf48872b7 wait_for_one should wait for task to finish (#2047) Cheng 2025-04-06 12:05:16 +09:00
  • 7b3b8fa000 Fix ci release (#2045) Angelos Katharopoulos 2025-04-04 20:25:01 -07:00
  • ec5e2aae61 nit in doc (#2044) Awni Hannun 2025-04-04 12:04:17 -07:00
  • 86389bf970 patch bump (#2043) v0.24.2 Awni Hannun 2025-04-03 13:15:18 -07:00
  • 3290bfa690 Add new sdpa function overload (#2035) Jagrit Digani 2025-04-03 11:58:28 -07:00
  • 066336b60e load q4_k from gguf gguf_q4_k Awni Hannun 2025-04-03 10:56:12 -07:00
  • 8777fd104f Depthwise Conv2D optimization (#2036) Jagrit Digani 2025-04-03 09:42:04 -07:00
  • c41f7565ed fix softmax / logsumexp (#2042) Awni Hannun 2025-04-03 08:32:59 -07:00
  • 9ba81e3da4 tune quant dispatch (#2031) Awni Hannun 2025-04-02 20:05:54 -07:00
  • c23888acd7 Fix build warning (#2033) Awni Hannun 2025-04-01 14:42:27 -07:00
  • f98ce25ab9 fix residency set for real (#2032) Awni Hannun 2025-04-01 12:59:48 -07:00
  • de5f38fd48 Custom logsumexp (#2028) Awni Hannun 2025-03-31 07:36:55 -07:00
  • ec2854b13a Swap -inf for finite_minimum value (#2029) Angelos Katharopoulos 2025-03-30 21:55:04 -07:00
  • 90823d2938 Add missing funcs to docs (#2021) Stephen Panaro 2025-03-30 21:29:33 -04:00
  • 5f5770e3a2 Fix CPU sign for unsigned ints (#2024) Jesper Stemann Andersen 2025-03-31 02:56:59 +02:00
  • 28f39e9038 Log for complex numbers in Metal (#2025) Awni Hannun 2025-03-30 17:04:38 -07:00
  • b2d2b37888 fix residency set clearing (#2027) Awni Hannun 2025-03-30 16:27:26 -07:00
  • fe597e141c add pinv to doc (#2020) Awni Hannun 2025-03-30 15:54:18 -07:00
  • 72ca1539e0 Remove unused variable in /setup.py (#2026) Yi Wang 2025-03-30 12:52:33 -07:00
  • 13b26775f1 use minimum deployment target (#2016) Awni Hannun 2025-03-28 14:31:53 -07:00
  • 05d7118561 causal vector sdpa (#2018) Awni Hannun 2025-03-28 12:36:13 -07:00
  • 98b901ad66 enable complex gemm (#2017) Awni Hannun 2025-03-28 10:45:13 -07:00
  • 5580b47291 iinfo and scalar overflow detection (#2009) Awni Hannun 2025-03-27 19:54:56 -07:00
  • bc62932984 sdpa specialization for head dim 256 (#2007) Awni Hannun 2025-03-27 19:31:25 -07:00
  • a6b5d6e759 revise cmake minimum for doctest (#2014) Awni Hannun 2025-03-27 19:30:58 -07:00
  • a8931306e1 Remove unused variable in CMakeBuild (#2011) Yi Wang 2025-03-27 16:00:51 -07:00
  • fecdb8717e Polish CONTRIBUTING.md (#2005) Yi Wang 2025-03-25 19:06:34 -07:00
  • 916fd273ea wire cache (#2006) Awni Hannun 2025-03-25 18:54:01 -07:00
  • 0da8506552 Update docs for extensions (#2004) Yi Wang 2025-03-25 18:35:03 -07:00