Angelos Katharopoulos
|
11f73d6e89
|
Double buffer keys for vector sdpa
|
2025-04-22 00:19:11 -07:00 |
|
Awni Hannun
|
fdadc4f22c
|
Add more complex unary ops (#2101)
|
2025-04-21 13:04:54 -07:00 |
|
Awni Hannun
|
79b527f45f
|
conv vmap (#2102)
|
2025-04-21 13:04:39 -07:00 |
|
Awni Hannun
|
dc4eada7f0
|
Use unordered map for kwargs in export/import (#2087)
* use unordered map for kwargs in export/import
* comment
|
2025-04-21 07:17:22 -07:00 |
|
Cheng
|
70ebc3b598
|
Return const ref in array::data_shared_ptr (#2100)
|
2025-04-21 07:17:09 -07:00 |
|
Cheng
|
b13f2aed16
|
Introduce macros for dispatching dynamic dtypes as static types (#2073)
|
2025-04-19 06:16:30 -07:00 |
|
Param Thakkar
|
5f04c0f818
|
Fixed shift operations issue (#2080)
* Fixed shift operations issue
* Added tests and fixes
* Fixed loop syntax error
* Added tests for bool
* Fixed typo
|
2025-04-18 14:28:33 -07:00 |
|
Awni Hannun
|
55935ccae7
|
fix py gc edge case (#2079)
|
2025-04-18 12:46:53 -07:00 |
|
Awni Hannun
|
b529515eb1
|
minor bump (#2081)
|
2025-04-17 14:57:11 -07:00 |
|
Angelos Katharopoulos
|
3cde719eb7
|
Route to gather qmm only for many tokens per expert (#2082)
|
2025-04-17 14:53:08 -07:00 |
|
Angelos Katharopoulos
|
5de6d94a90
|
Gather qmm batched kernel and refactoring of quantized (#2078)
|
2025-04-17 13:53:11 -07:00 |
|
Angelos Katharopoulos
|
99eefd2ec0
|
Gather mm new kernel and small refactoring (#2040)
|
2025-04-14 16:37:36 -07:00 |
|
Yury Popov
|
e9e268336b
|
LogCumSumExp (#2069)
|
2025-04-13 01:27:29 -07:00 |
|
Awni Hannun
|
7275ac7523
|
Fix release build (#2072)
|
2025-04-12 20:41:58 -07:00 |
|
Angelos Katharopoulos
|
c4189a38e4
|
Add float mask to sdpa vector (#2068)
|
2025-04-11 17:29:40 -07:00 |
|
Awni Hannun
|
68d1b3256b
|
nit: fix exception handling (#2066)
|
2025-04-11 14:12:08 -07:00 |
|
Awni Hannun
|
9c6953bda7
|
Fix stubgen (#2065)
* Fix stubgen
* add multi optim to docs
|
2025-04-11 12:02:54 -07:00 |
|
Awni Hannun
|
ef7ece9851
|
fix fft bug (#2062)
|
2025-04-10 19:41:27 -07:00 |
|
Angelos Katharopoulos
|
ddaa4b7dcb
|
Fix the test and add custom min/max reductions for uncommon MPI types (#2060)
|
2025-04-10 17:01:17 -07:00 |
|
Cheng
|
dfae2c6989
|
Fix MSVC build due to use of M_LN2 (#2058)
|
2025-04-10 07:41:41 -07:00 |
|
Anastasiia Filippova
|
515f104926
|
Min / max reductions (#2041)
|
2025-04-09 23:22:20 -07:00 |
|
Angelos Katharopoulos
|
9ecefd56db
|
Do not load the default lib if another is requested (#2055)
|
2025-04-09 13:31:38 -07:00 |
|
Awni Hannun
|
e5d35aa187
|
no sdpa in grad (#2054)
|
2025-04-08 19:13:54 -07:00 |
|
Awni Hannun
|
00794c42bc
|
Fix causal mask sdpa vec (#2053)
* fix sdpa vector causal mask
* test
|
2025-04-08 09:11:23 -07:00 |
|
Cheng
|
08a1bf3f10
|
Remove Event::Signal() (#2052)
|
2025-04-08 06:20:27 -07:00 |
|
Awni Hannun
|
60c4154346
|
Only request residency once (#2051)
|
2025-04-07 10:47:51 -07:00 |
|
Awni Hannun
|
f2c85308c1
|
add a half simd gemm fallback (#2046)
* add a half simd gemm fallback
* nit
|
2025-04-07 09:31:29 -07:00 |
|
Awni Hannun
|
1a28b69ee2
|
only add to residency set once (#2049)
|
2025-04-06 17:38:25 -07:00 |
|
Cheng
|
ba09f01ce8
|
Remove test of converting negative float to uint (#2048)
|
2025-04-06 06:21:46 -07:00 |
|
Cheng
|
6cf48872b7
|
wait_for_one should wait for task to finish (#2047)
|
2025-04-05 20:05:16 -07:00 |
|
Angelos Katharopoulos
|
7b3b8fa000
|
Fix ci release (#2045)
|
2025-04-04 20:25:01 -07:00 |
|
Awni Hannun
|
ec5e2aae61
|
nit in doc (#2044)
|
2025-04-04 12:04:17 -07:00 |
|
Awni Hannun
|
86389bf970
|
patch bump (#2043)
|
2025-04-03 13:15:18 -07:00 |
|
Jagrit Digani
|
3290bfa690
|
Add new sdpa function overload (#2035)
* Add new sdpa function overload
* Address comments
* Remove std::variant from cpp sdpa function
|
2025-04-03 11:58:28 -07:00 |
|
Jagrit Digani
|
8777fd104f
|
Depthwise Conv2D optimization (#2036)
- Add new specialized kernel for depthwise 2D convolutions with small kernels (kernel size <= 7) and small strides (strides <= 2)
- Add related tests
|
2025-04-03 09:42:04 -07:00 |
|
Awni Hannun
|
c41f7565ed
|
fix softmax / logsumexp (#2042)
|
2025-04-03 08:32:59 -07:00 |
|
Awni Hannun
|
9ba81e3da4
|
tune quant dispatch (#2031)
|
2025-04-02 20:05:54 -07:00 |
|
Awni Hannun
|
c23888acd7
|
Fix build warning (#2033)
|
2025-04-01 14:42:27 -07:00 |
|
Awni Hannun
|
f98ce25ab9
|
fix residency set for real (#2032)
|
2025-04-01 12:59:48 -07:00 |
|
Awni Hannun
|
de5f38fd48
|
Custom logsumexp (#2028)
* initial custom logsumexp
* more tests
* comments + fix
|
2025-03-31 07:36:55 -07:00 |
|
Angelos Katharopoulos
|
ec2854b13a
|
Swap -inf for finite_minimum value (#2029)
|
2025-03-30 21:55:04 -07:00 |
|
Stephen Panaro
|
90823d2938
|
Add missing funcs to docs (#2021)
|
2025-03-30 18:29:33 -07:00 |
|
Jesper Stemann Andersen
|
5f5770e3a2
|
Fix CPU sign for unsigned ints (#2024)
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
|
2025-03-30 17:56:59 -07:00 |
|
Awni Hannun
|
28f39e9038
|
Log for complex numbers in Metal (#2025)
* Log for complex numbers in Metal
* fix log2
|
2025-03-30 17:04:38 -07:00 |
|
Awni Hannun
|
b2d2b37888
|
fix residency set clearing (#2027)
|
2025-03-30 16:27:26 -07:00 |
|
Awni Hannun
|
fe597e141c
|
add pinv to doc (#2020)
|
2025-03-30 15:54:18 -07:00 |
|
Yi Wang
|
72ca1539e0
|
Remove unused variable in /setup.py (#2026)
This is a follow-up to https://github.com/ml-explore/mlx/pull/2011
|
2025-03-30 12:52:33 -07:00 |
|
Awni Hannun
|
13b26775f1
|
use minimum deployment target (#2016)
|
2025-03-28 14:31:53 -07:00 |
|
Awni Hannun
|
05d7118561
|
causal vector sdpa (#2018)
* causal vector sdpa
* get rid of memory threshold
|
2025-03-28 12:36:13 -07:00 |
|
Awni Hannun
|
98b901ad66
|
enable complex gemm (#2017)
|
2025-03-28 10:45:13 -07:00 |
|