Awni Hannun
|
61d787726a
|
Fix view scalar bug segfault (#1603)
* fix view scalar bug
* fix view scalar bug
* one more fix
|
2024-11-19 10:54:05 -08:00 |
|
Angelos Katharopoulos
|
5e89aace9b
|
Fix concatenate vmap (#1600)
|
2024-11-19 10:44:04 -08:00 |
|
Awni Hannun
|
2af7e8a9a6
|
fix cmake version (#1601)
|
2024-11-19 08:45:05 -08:00 |
|
Awni Hannun
|
2419edd5b2
|
Faster indexing math in a few kernels (#1589)
* wip: faster compiled kernels
* faster general unary with uint specialization
* index type in compiled, unary, binary, ternary, copy
* fix jit
* jit fix
* specialize gather + scatter
* nit in docs
|
2024-11-18 19:52:00 -08:00 |
|
Awni Hannun
|
bf481e8e5d
|
Fix sibling leak (#1590)
* add test
* fix + test
* fix fix
|
2024-11-18 19:17:01 -08:00 |
|
Awni Hannun
|
9d7fa6b8e6
|
Use osx deployment target to pick Metal version (#1595)
* choose metal based on deployment target rather than system version
* nit
* unused compile def
|
2024-11-18 19:16:49 -08:00 |
|
Angelos Katharopoulos
|
073076ac7d
|
2-Pass Sdpa Inference Kernel (#1597)
|
2024-11-18 17:31:53 -08:00 |
|
Awni Hannun
|
9bd03dd9b4
|
More buffer donation with no-ops (#1591)
* more donation
* fix test
* fix build
|
2024-11-18 08:35:41 -08:00 |
|
Awni Hannun
|
6931f84412
|
fix dispatch threads for a few kernels (#1594)
|
2024-11-18 08:35:25 -08:00 |
|
xnorai
|
16ec0556a0
|
Allocate raw JSON metadata buffer on the heap, and limit its size (#1596)
* Allocate raw JSON metadata buffer on the heap, and limit its size to 1GiB
* Set the upper size limit for the header to 100K as in Rust safetensors
|
2024-11-18 07:22:51 -08:00 |
|
Awni Hannun
|
610af352d4
|
Dispatch bf16 at run time when using the JIT (#1584)
* Dispatch bf16 at run time when using the JIT
* fix extension
* fix extension build
* fix extension build
* Update utils.h
|
2024-11-15 16:54:36 -08:00 |
|
Awni Hannun
|
b35f1e3c9c
|
fix donation in sdpa (#1587)
|
2024-11-13 17:21:13 -08:00 |
|
Awni Hannun
|
dfa0b9aab4
|
Cpu fast quantize (#1578)
* cpu quantize
* fix
|
2024-11-08 20:10:39 -08:00 |
|
Alex Barron
|
a4c47b0276
|
OOB QMV fix (#1579)
* fix oob access in qmv
* skip more
* fix small case
|
2024-11-08 17:59:45 -08:00 |
|
Alex Barron
|
111fefd5e9
|
Fix OOB access in qmv (#1577)
* fix oob access in qmv
* skip more
|
2024-11-08 15:41:30 -08:00 |
|
Awni Hannun
|
c1fe1ef081
|
Bfs width limit (#1568)
* width limit
* fix
* large limit
* put env vars in env namespace
|
2024-11-08 15:00:46 -08:00 |
|
Awni Hannun
|
8c34c9dac4
|
throw for invalid case and remove test (#1575)
|
2024-11-08 12:04:03 -08:00 |
|
Awni Hannun
|
91c0277356
|
fix per-example mask + docs in sdpa (#1574)
|
2024-11-08 11:51:15 -08:00 |
|
Awni Hannun
|
9f0d5c12fc
|
Fully wrap the command encoder (#1572)
* fully wrap the command encoder
* use consistent style + fix extensions
|
2024-11-08 11:50:21 -08:00 |
|
Awni Hannun
|
59247c2b62
|
add groups in conv2d (#1569)
|
2024-11-07 13:57:53 -08:00 |
|
Awni Hannun
|
9a3842a2d9
|
fix (#1566)
|
2024-11-06 17:10:33 -08:00 |
|
Alex Barron
|
726dbd9267
|
v0.20.0 (#1565)
v0.20.0
|
2024-11-05 12:37:57 -08:00 |
|
Awni Hannun
|
54f05e7195
|
Fix gather vmap (#1563)
* fix gather
* fix
|
2024-11-05 11:29:20 -08:00 |
|
Alex Barron
|
26be608470
|
Add split_k qvm for long context (#1564)
* Add splitk qvm
* configurable splitk
* tuning
* remove extra instantiation
* remove refactor
* separate test
* cpu tolerance
|
2024-11-05 11:25:19 -08:00 |
|
Angelos Katharopoulos
|
248431eb3c
|
Reductions update (#1351)
|
2024-11-04 22:25:16 -08:00 |
|
Awni Hannun
|
76f275b4df
|
error in rms for wrong size (#1562)
|
2024-11-04 13:24:02 -08:00 |
|
Awni Hannun
|
f1951d6cce
|
Use fewer barriers (#1561)
* use fewer barriers
* comment
|
2024-11-04 10:26:49 -08:00 |
|
Angelos Katharopoulos
|
62f297b51d
|
Sdpa fix (#1558)
|
2024-11-02 21:25:46 -07:00 |
|
Awni Hannun
|
09bc32f62f
|
No extra reshape (#1557)
* no extra reshape
* lint
|
2024-11-02 19:07:20 -07:00 |
|
Chris Offner
|
46d8b16ab4
|
Fix vmap example in docs (#1556)
|
2024-11-02 17:44:14 -07:00 |
|
Chris Offner
|
42533931fa
|
Fix typo "it's" -> "its" (#1555)
|
2024-11-02 06:06:34 -07:00 |
|
Awni Hannun
|
9bd3a7102f
|
add python 3.13 to circle (#1553)
|
2024-11-01 20:55:35 -07:00 |
|
Alex Barron
|
9e516b71ea
|
Add dispatchThreads to custom kernel doc (#1551)
* add dispatchThreads info
* update
* add link
|
2024-11-01 13:07:48 -07:00 |
|
Awni Hannun
|
eac961ddb1
|
patch (#1550)
v0.19.3
|
2024-10-31 16:10:14 -07:00 |
|
Awni Hannun
|
57c6aa7188
|
fix multi output leak (#1548)
|
2024-10-31 09:32:01 -07:00 |
|
Awni Hannun
|
cde5b4ad80
|
patch (#1546)
v0.19.2
|
2024-10-30 19:31:22 -07:00 |
|
Awni Hannun
|
4f72c66911
|
improvements to scatter / gather (#1541)
|
2024-10-30 19:30:54 -07:00 |
|
Jagrit Digani
|
960e3f0f05
|
Gemm update (#1518)
|
2024-10-30 19:30:28 -07:00 |
|
Awni Hannun
|
884af42da2
|
Fix thread group for large arrays (#1543)
* fix thread group for large arrays
* comment
* one more
|
2024-10-30 16:25:12 -07:00 |
|
Alex Barron
|
048fabdabd
|
Fix vmap constant output size (#1524)
* use inputs to determine output size
* remove noop vmap tests
|
2024-10-30 16:16:53 -07:00 |
|
Léo
|
917252a5a1
|
Add favicon to docs (#1545)
* add sphinx's html_favicon config
* removed unneeded newline
* ran pre-commit hooks
|
2024-10-30 13:54:13 -07:00 |
|
Carlo Cabrera
|
1a992e31e8
|
Skip using Residency sets in VMs (#1537)
* Skip using Residency sets in VMs
Attempting to use residency sets in a VM throws[^1]:
libc++abi: terminating due to uncaught exception of type std::runtime_error: [metal::Device] Unable to construct residency set.
Not quite sure if this is the best fix, but it does make the error go away.
Note that it was previously possible to run simple programs that used mlx in a VM prior to 0eb56d5be0. See related discussion at Homebrew/homebrew-core#195627.
[^1]: https://github.com/Homebrew/homebrew-core/actions/runs/11525831492/job/32105148462#step:3:56
* change residency check
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
|
2024-10-29 19:37:23 -07:00 |
|
Awni Hannun
|
d2ff04a4f2
|
fix format (#1539)
|
2024-10-28 18:29:14 -07:00 |
|
Awni Hannun
|
015c247393
|
change wino dispatch condition (#1534)
|
2024-10-28 11:13:44 -07:00 |
|
Awni Hannun
|
d3cd26820e
|
Faster bits and bernoulli (#1535)
* faster bits and bernoulli
* fix bernoulli
|
2024-10-28 11:11:00 -07:00 |
|
Awni Hannun
|
91f6c499d7
|
fix (#1529)
|
2024-10-25 19:25:35 -07:00 |
|
Awni Hannun
|
35e9c87ab9
|
patch bump (#1528)
v0.19.1
|
2024-10-25 13:13:23 -07:00 |
|
Awni Hannun
|
8e88e30d95
|
BFS graph evaluation order (#1525)
* bfs order
* try fix event issue
|
2024-10-25 10:27:19 -07:00 |
|
Awni Hannun
|
0eb56d5be0
|
Wired (#1510)
* expose residency sets as wire/unwire
* returns wired size
* fix
* runtime support check
* fix os check
* fix test
* fix no metal build
* docs
* nit
* nits in docs
* nits
|
2024-10-25 09:35:33 -07:00 |
|
Paul Hansel
|
f70764a162
|
Fix typo in build docs (#1522)
|
2024-10-24 20:55:06 -07:00 |
|