Angelos Katharopoulos
3fe98bacc7
Raw sockets
2024-09-09 00:38:15 -07:00
Angelos Katharopoulos
2e267bd6a8
Fix run without distributed
2024-09-06 11:03:01 -07:00
Angelos Katharopoulos
a9746587f1
TCP socket distributed
2024-09-06 11:03:01 -07:00
Angelos Katharopoulos
97a9561e34
Change the send message size
2024-09-06 11:03:01 -07:00
Angelos Katharopoulos
4aefacf0bc
Make it work even for donated inputs
2024-09-06 11:03:01 -07:00
Angelos Katharopoulos
794a48a9f6
Start a sockets based distributed backend
2024-09-06 11:03:01 -07:00
Awni Hannun
7cca1727af
Fix slice data size ( #1394 )
...
* fix slice data size and add tests
* fix contiguous flag
* simplify stride and perform copy for non-contiguous arrays
* fix cpu
* comment
2024-09-04 19:10:43 -07:00
Bhargav Yagnik
11371fe251
Test to prevent bugs like #1386 ( #1391 )
...
* updated test_array for missing ops
* formatting changes
2024-09-04 17:24:30 -07:00
Awni Hannun
41c603d48a
fix jit reduce ( #1395 )
2024-09-04 14:03:10 -07:00
Angelos Katharopoulos
969337345f
Fix reduce edge case ( #1389 )
2024-09-01 21:37:51 -07:00
Awni Hannun
9592766939
add std as method ( #1387 )
...
* add std as method
* add std as method
2024-09-01 19:49:16 -07:00
Angelos Katharopoulos
58dca7d846
Fix copy in the sort primitive ( #1383 )
2024-08-31 08:32:14 -07:00
Awni Hannun
0d302cd25b
Fix compiel with byte sized constants ( #1381 )
2024-08-30 17:24:35 -07:00
Alex Barron
da691257ec
Fix overflow in quantize/dequantize ( #1379 )
...
* add 2d indices to prevent overflow
* use nthreads not out size
2024-08-30 13:32:41 -07:00
Angelos Katharopoulos
1600092e92
Patch bump ( #1376 )
2024-08-29 16:54:30 -07:00
Awni Hannun
dba2bd1105
Even Even Faster IO ( #1374 )
...
* even more faster io
* make reader pool static
* make python reader thread safe
* one more optimization
2024-08-29 16:05:40 -07:00
Alex Barron
28be4de7c2
Fix JIT reductions ( #1373 )
2024-08-28 16:39:11 -07:00
Awni Hannun
a6c3b38fba
Async load ( #1372 )
...
* async load
* async load
2024-08-28 14:21:55 -07:00
Awni Hannun
fcb65a3897
Even Faster I/O ( #1369 )
...
* try multithreading for faster IO
* smaller batch size
* Account for pread returning less than size
* nit
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-08-28 11:49:07 -07:00
Saanidhya
4e22a1dffe
In continuation to PR1243 to solve issue #1240 ( #1365 )
...
* Solves issue #1240
* Correction
* Update python/mlx/utils.py
* Update python/mlx/utils.py
---------
Co-authored-by: Awni Hannun <awni@apple.com>
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-08-28 11:40:41 -07:00
Awni Hannun
291cf40aca
Some fixes to typing ( #1371 )
...
* some fixes to typing
* fix module reference
* comment
2024-08-28 11:16:19 -07:00
Jeethu Rao
bd47e1f066
Fix neon_fast_exp and add more softmax tests ( #1367 )
2024-08-27 23:42:42 -07:00
Aditya Dhulipala
e6b223df5f
Pinv ( #875 )
2024-08-27 23:06:12 -07:00
Angelos Katharopoulos
e64349bbdd
Make eval just wait if all arrays are scheduled ( #1368 )
2024-08-27 17:01:22 -07:00
Angelos Katharopoulos
cdb59faea6
Adds send/recv ops in distributed ( #1366 )
2024-08-26 23:01:37 -07:00
Alex Barron
1d94ac3f90
Add optional headers to `mx.fast.metal_kernel
` ( #1358 )
2024-08-26 21:45:45 -07:00
Awni Hannun
5f7d19d1f5
MPI ops in GPU stream for faster comms ( #1356 )
2024-08-26 15:12:50 -07:00
Awni Hannun
2fdf9eb535
Fix ternary for large arrays ( #1359 )
...
* fix ternary for large arrays
* fix
2024-08-26 11:22:27 -07:00
Awni Hannun
860d3a50d7
fix extension metal library finding ( #1361 )
2024-08-26 09:18:50 -07:00
Alex Barron
d1183821a7
int() and float() for mx.array ( #1360 )
2024-08-25 20:41:44 -07:00
Angelos Katharopoulos
8081df79be
Fix boolean all reduce bug ( #1355 )
2024-08-24 10:09:32 -07:00
Nripesh Niketan
64bec4fad7
Chore: update pre-commit hooks ( #1353 )
...
* Chore: update pre-commit refs
* run pre-commit
2024-08-24 06:46:36 -07:00
Alex Barron
b96e105244
Add grid_sample
example to metal_kernel
docs ( #1352 )
...
* Add `zero_outputs` and `atomic_outputs` options to `metal_kernel`
* add grid sample to docs
* zero_outputs -> init_value
* add missing header for linux
2024-08-23 18:24:16 -07:00
Awni Hannun
3b4d5484c7
Bump extension MLX version ( #1350 )
...
* Bump extension MLX version
* fix some docs nits
2024-08-23 12:38:34 -07:00
Alex Barron
684e11c664
patch ( #1347 )
2024-08-23 10:42:02 -07:00
Angelos Katharopoulos
b57a52813b
Further reduction tuning ( #1349 )
...
* More reduction tuning
* Forgotten pdb
* Small column long row specialization
2024-08-23 10:35:25 -07:00
Alex Barron
da8deb2b62
fix bug with multiple attributes ( #1348 )
...
Co-authored-by: Alex Barron <abarron22@apple.com>
2024-08-23 10:06:15 -07:00
Awni Hannun
98b6ce3460
Refactor reductions and fix scatter atomics for large sizes ( #1300 )
...
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-08-22 16:03:31 -07:00
Awni Hannun
f9e00efe31
fix nanobind and stub gen in circle ( #1346 )
2024-08-22 14:07:27 -07:00
Alex Barron
0fd2a1f4b0
Custom Metal Kernels from Python ( #1325 )
...
* start
* simple kernels working
* restructure
* inverse example working
* docs + fixes
* missing file
* fix imports
* address comments
* add docs + fix test
* Review comments + refactor to a single function
* update docs
* remove hashing
* fix contig bug in test
* back to a class
* trailing whitespace
* fix tests
* match c++ and python apis
* add link + make args kw_only
2024-08-22 13:46:29 -07:00
Awni Hannun
df3233454d
2d gather specialization ( #1339 )
2024-08-22 10:48:24 -07:00
Awni Hannun
82db84b899
bump nanobind + fix extension ( #1344 )
2024-08-21 16:05:07 -07:00
Awni Hannun
8ae751d3da
fix io ( #1343 )
...
* fix io
* fix io
* comment
2024-08-21 13:14:46 -07:00
Awni Hannun
d40e76809f
Fix rope ( #1340 )
...
* add test
* fix rope
* fix test
2024-08-20 17:37:52 -07:00
Awni Hannun
bb1b76d9dc
RoPE with frequencies as optional input ( #1337 )
...
* start rope with freq input
* rope with frequencies
* nits
* fix bug
* fix bug + test
* cleanup
* optional base
2024-08-19 18:30:50 -07:00
Angelos Katharopoulos
9d26441224
Fix contiguity check ( #1336 )
...
Co-authored-by: Alex Barron <abarron22@apple.com>
2024-08-19 16:05:06 -07:00
Awni Hannun
f12f24a77c
fix compiling with space in paths ( #1332 )
2024-08-15 16:39:24 -07:00
Awni Hannun
ae5b5cabfd
Fix optimizer reloading from checkpoint ( #1329 )
...
* fix optimizer reloading from checkpoint
* comment
2024-08-15 07:33:23 -07:00
Awni Hannun
d0630ffe8c
Read arrays from files faster ( #1330 )
...
* read faster
* faster write as well
* set default permission for linux
* comment
2024-08-14 20:09:56 -07:00
Alex Barron
99bb7d3a58
GPU mx.sign for complex64 ( #1326 )
2024-08-14 07:54:53 -07:00