Ronan Collobert
87b680766e
Gloo backend support
2024-11-13 13:52:37 -08:00
Ronan Collobert
70ffaa50d2
be more relaxed on OpenMPI version
2024-11-13 13:51:37 -08:00
Angelos Katharopoulos
d82699f0f1
Merge branch 'distributed-layers' into socket-distributed-layers
2024-11-05 11:36:16 -08:00
Angelos Katharopoulos
6fc00d2c10
Add rudimentary barrier
2024-11-05 11:34:55 -08:00
Angelos Katharopoulos
44f0de2854
Fix run without distributed
2024-11-05 11:27:41 -08:00
Angelos Katharopoulos
29ec3539ed
TCP socket distributed
2024-11-05 11:27:41 -08:00
Angelos Katharopoulos
e94f0028c3
Change the send message size
2024-11-05 11:27:41 -08:00
Angelos Katharopoulos
e5354fcddb
Make it work even for donated inputs
2024-11-05 11:27:41 -08:00
Angelos Katharopoulos
34dd079a64
Start a sockets based distributed backend
2024-11-05 11:27:41 -08:00
Angelos Katharopoulos
16975815e9
Fixes in distributed layers
2024-11-05 11:27:26 -08:00
Angelos Katharopoulos
a8b3da7946
Add distributed layers to nn top-level
2024-11-05 11:27:26 -08:00
Angelos Katharopoulos
060e1c9f92
Add quantized distributed layers
2024-11-05 11:27:26 -08:00
Angelos Katharopoulos
0b04742985
Add the distributed linear layers
2024-11-05 11:27:26 -08:00
Angelos Katharopoulos
c3ccd4919f
Add MPI barrier
2024-11-05 11:26:53 -08:00
Alex Barron
26be608470
Add split_k qvm for long context ( #1564 )
* Add splitk qvm
* configurable splitk
* tuning
* remove extra instantiation
* remove refactor
* separate test
* cpu tolerance
2024-11-05 11:25:19 -08:00
Angelos Katharopoulos
248431eb3c
Reductions update ( #1351 )
2024-11-04 22:25:16 -08:00
Awni Hannun
76f275b4df
error in rms for wrong size ( #1562 )
2024-11-04 13:24:02 -08:00
Awni Hannun
f1951d6cce
Use fewer barriers ( #1561 )
* use fewer barriers
* comment
2024-11-04 10:26:49 -08:00
Angelos Katharopoulos
62f297b51d
Sdpa fix ( #1558 )
2024-11-02 21:25:46 -07:00
Awni Hannun
09bc32f62f
No extra reshape ( #1557 )
* no extra reshape
* lint
2024-11-02 19:07:20 -07:00
Chris Offner
46d8b16ab4
Fix vmap example in docs ( #1556 )
2024-11-02 17:44:14 -07:00
Chris Offner
42533931fa
Fix typo "it's" -> "its" ( #1555 )
2024-11-02 06:06:34 -07:00
Awni Hannun
9bd3a7102f
add python 3.13 to circle ( #1553 )
2024-11-01 20:55:35 -07:00
Alex Barron
9e516b71ea
Add dispatchThreads to custom kernel doc ( #1551 )
* add dispatchThreads info
* update
* add link
2024-11-01 13:07:48 -07:00
Awni Hannun
eac961ddb1
patch ( #1550 )
2024-10-31 16:10:14 -07:00
Awni Hannun
57c6aa7188
fix multi output leak ( #1548 )
2024-10-31 09:32:01 -07:00
Awni Hannun
cde5b4ad80
patch ( #1546 )
2024-10-30 19:31:22 -07:00
Awni Hannun
4f72c66911
improvements to scatter / gather ( #1541 )
2024-10-30 19:30:54 -07:00
Jagrit Digani
960e3f0f05
Gemm update ( #1518 )
2024-10-30 19:30:28 -07:00
Awni Hannun
884af42da2
Fix thread group for large arrays ( #1543 )
* fix thread group for large arrays
* comment
* one more
2024-10-30 16:25:12 -07:00
Alex Barron
048fabdabd
Fix vmap constant output size ( #1524 )
* use inputs to determine output size
* remove noop vmap tests
2024-10-30 16:16:53 -07:00
Léo
917252a5a1
Add favicon to docs ( #1545 )
* add sphinx's html_favicon config
* removed unneeded newline
* ran pre-commit hooks
2024-10-30 13:54:13 -07:00
Carlo Cabrera
1a992e31e8
Skip using Residency sets in VMs ( #1537 )
* Skip using Residency sets in VMs
Attempting to use residency sets in a VM throws[^1]
libc++abi: terminating due to uncaught exception of type std::runtime_error: [metal::Device] Unable to construct residency set.
Not quite sure if this is the best fix, but it does make the error go
away.
Note that it was previously possible to run simple programs that used
mlx in a VM prior to 0eb56d5be0. See
related discussion at Homebrew/homebrew-core#195627.
[^1]: https://github.com/Homebrew/homebrew-core/actions/runs/11525831492/job/32105148462#step:3:56
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* change residency check
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-10-29 19:37:23 -07:00
Awni Hannun
d2ff04a4f2
fix format ( #1539 )
2024-10-28 18:29:14 -07:00
Awni Hannun
015c247393
change wino dispatch condition ( #1534 )
2024-10-28 11:13:44 -07:00
Awni Hannun
d3cd26820e
Faster bits and bernoulli ( #1535 )
* faster bits and bernoulli
* fix bernoulli
2024-10-28 11:11:00 -07:00
Awni Hannun
91f6c499d7
fix ( #1529 )
2024-10-25 19:25:35 -07:00
Awni Hannun
35e9c87ab9
patch bump ( #1528 )
2024-10-25 13:13:23 -07:00
Awni Hannun
8e88e30d95
BFS graph evaluation order ( #1525 )
* bfs order
* try fix event issue
2024-10-25 10:27:19 -07:00
Awni Hannun
0eb56d5be0
Wired ( #1510 )
* expose residency sets as wire/unwire
* returns wired size
* fix
* runtime support check
* fix os check
* fix test
* fix no metal build
* docs
* nit
* nits in docs
* nits
2024-10-25 09:35:33 -07:00
Paul Hansel
f70764a162
Fix typo in build docs ( #1522 )
2024-10-24 20:55:06 -07:00
Awni Hannun
dad1b00b13
fix ( #1523 )
2024-10-24 19:17:46 -07:00
Venkata Naga Aditya Datta Chivukula
430ffef58a
[Feature] Added Sparse Initialization ( #1498 )
Co-authored-by: Saanidhyavats <saanidhyavats@gmail.com>
2024-10-24 12:31:24 -07:00
Alex Barron
3d17077187
Add mx.array.__format__ ( #1521 )
* add __format__
* actually test something
* fix
2024-10-24 11:11:39 -07:00
Angelos Katharopoulos
c9b41d460f
Working 64-bit scans ( #1506 )
2024-10-24 11:05:46 -07:00
xnorai
32972a5924
C++20 compatibility for fmt ( #1519 )
* C++20 compatibility for fmt
* Address review feedback
* Remove stray string
* Add newlines back
2024-10-24 08:54:51 -07:00
Dhruv Govil
f6afb9c09b
Remove use of vector<const T> ( #1514 )
2024-10-22 16:31:52 -07:00
Kashif Rasul
3ddc07e936
Eigenvalues and eigenvectors ( #1334 )
* initial eigvalsh
* add compute_vectors
* add compute_vectors_
* return a pair
* add eigh to return only eigenvectors
* fixed typo
* merge Eighvalsh and Eigh into a single primitive
* use the same primitive with the flag
* fix primitives
* use MULTI
* fix eval_gpu
* fix declaration
* rename EighPrimitive to Eigh
* tests
* tests
* fix rebase and format
* cleanup lapack
* format
* add cblas.h
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-10-22 12:18:48 -07:00
Awni Hannun
c26208f67d
Remove Hazard tracking with Fences ( #1509 )
* remove hazard tracking
* with fence map
* no hazard tracking with fences
* nits
* fix fence retain
* cleanup
* fix quantized rebase
2024-10-21 19:33:32 -07:00
Alex Barron
d15fa13daf
Batched Quantized Matmul + Fast Small QMV ( #1503 )
* add fast qmv for small dims
* fix test
* batched cpu
* add batched template param
* refactor metal quantized.cpp
2024-10-21 16:23:17 -07:00