Awni Hannun
92d7cb71f8
Fix compile ( #1501 )
...
* fix compile
* fix space
2024-10-18 11:06:40 -07:00
Angelos Katharopoulos
50d8bed468
Fused attention for single query ( #1497 )
2024-10-18 00:58:52 -07:00
Awni Hannun
9dd72cd421
fix gumbel ( #1495 )
2024-10-17 13:52:39 -07:00
Awni Hannun
343aa46b78
No more 3.8 ( #1493 )
2024-10-16 17:51:38 -07:00
Awni Hannun
b8ab89b413
Docs in ci ( #1491 )
...
* docs in circle
2024-10-15 17:40:00 -07:00
Awni Hannun
f9f8c167d4
fix submodule stubs ( #1492 )
2024-10-15 16:23:37 -07:00
Awni Hannun
3f86399922
Real and Imag ( #1490 )
...
* real and imag
* fix
* fix
2024-10-15 16:23:15 -07:00
LastWhisper
2b8ace6a03
Typing the dropout. ( #1479 )
2024-10-15 06:45:46 -07:00
Awni Hannun
0ab8e099e8
Fix cpu segfault ( #1488 )
...
* fix cpu segfault
* nit in tests
2024-10-14 16:17:03 -07:00
Awni Hannun
020f048cd0
A few updates for CPU ( #1482 )
...
* some updates
* format
* fix
* nit
2024-10-14 12:45:49 -07:00
Awni Hannun
881615b072
Faster metal compiled kernels + some fixes ( #1486 )
...
* bump mac tests to use py39
* work per thread for compiled kernels
* fixe for large arrays
* fix
2024-10-14 12:45:38 -07:00
Awni Hannun
0eef4febfd
bump mac tests to use py39 ( #1485 )
2024-10-14 10:40:32 -07:00
Awni Hannun
b54a70ec2d
Make push button linux distribution ( #1476 )
...
* try again
* try again
* try again
* try again
* try again
* try again
* try again
* try again
* .circleci/config.yml
* one more fix
* nit
2024-10-14 06:21:44 -07:00
Awni Hannun
bf6ec92216
Make the GPU device more thread safe ( #1478 )
...
* gpu stream safety
* comment
* fix
2024-10-12 17:49:15 -07:00
Awni Hannun
c21331d47f
version bump ( #1477 )
2024-10-10 13:05:17 -07:00
Awni Hannun
e1c9600da3
Add mx.random.permutation
( #1471 )
...
* random permutation
* comment
2024-10-08 19:42:19 -07:00
Awni Hannun
1fa0d20a30
consistently handle all -inf in softmax ( #1470 )
2024-10-08 09:54:02 -07:00
Awni Hannun
3274c6a087
Fix array is_available race cases ( #1468 )
2024-10-07 19:13:50 -07:00
Angelos Katharopoulos
9b12093739
Add the roll op ( #1455 )
2024-10-07 17:21:42 -07:00
Awni Hannun
f374b6ca4d
Bump nanobind to 2.2 ( #1461 )
...
* bump nanobind
* extension version for tests
2024-10-07 16:52:40 -07:00
Awni Hannun
0070e1db40
Fix deep recursion with siblings ( #1462 )
...
* fix recursion with siblings
* fix
* add test
* increase tol
2024-10-07 06:15:33 -07:00
Awni Hannun
95d04805b3
Fix complex power on Metal ( #1460 )
2024-10-06 19:58:30 -07:00
Awni Hannun
e4534dac17
Conv grad with groups + bugfix ( #1449 )
...
* fix bug in flipped conv with groups, start of grad for groups
* fix
* fix
* fix + test
2024-10-06 07:08:53 -07:00
Angelos Katharopoulos
fef3c4ec1d
Fix mpi test in CI ( #1456 )
...
* Fix mpi test in CI
* Set bind to none
2024-10-06 06:09:17 -07:00
Awni Hannun
1bdc038bf9
fix argpartition + faster {arg} sorts / partitions ( #1453 )
2024-10-03 14:21:25 -07:00
Awni Hannun
5523d9c426
faster cpu indexing ( #1450 )
2024-10-03 13:53:47 -07:00
Angelos Katharopoulos
d878015228
Fix normalization check_input ( #1452 )
2024-10-03 13:26:56 -07:00
Cheng
5900e3249f
Fix building on Linux ( #1446 )
2024-09-30 07:00:39 -07:00
Angelos Katharopoulos
bacced53d3
Fix row reduce with very few rows ( #1447 )
2024-09-29 20:00:35 -07:00
Lucas Newman
4a64d4bff1
Add support for grouped 1D convolutions to the nn API ( #1444 )
...
* Fix the weight shape for grouped convolutions from the nn API.
* Add tests.
* Pre-commit formatting.
* Add input validation.
* Use integer division instead of casting.
* docs
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-28 06:41:07 -07:00
Awni Hannun
b1e2b53c2d
bump ( #1445 )
2024-09-27 13:53:02 -07:00
Awni Hannun
11354d5bff
Avoid io timeout for large arrays ( #1442 )
2024-09-27 13:32:14 -07:00
Awni Hannun
718aea3f1d
allow take to work with integer index ( #1440 )
2024-09-26 15:58:03 -07:00
Awni Hannun
5b6f38df2b
Faster cpu ops ( #1434 )
...
* faster binary and cleaner copy
* use recursive template for other ops
* more cleanup
* fix from cleanup
* more clean
* fix binary
* use contiguous iterator
* add 3d
* nits
* fix
* fix?
* fix
* fix rebase
2024-09-26 09:19:13 -07:00
Awni Hannun
0b4a58699e
Some overhead reductions in mx.fast.metal_kernel ( #1437 )
...
* some overhead reductions
* fix
* use +=
* use more +=
2024-09-25 17:25:21 -07:00
Awni Hannun
4f9f9ebb6f
Faster Metal unary and binary for general case ( #1431 )
...
* faster unary and binary for general case
* update ternary + jit fix
* fix jit
* unary work per thread
2024-09-25 12:07:43 -07:00
Awni Hannun
afc9c0ec1b
dtype is copy assignable ( #1436 )
2024-09-25 12:07:13 -07:00
Awni Hannun
195b429d99
Put along axis + fixe for partition grad ( #1430 )
...
* put along axis, fixes for partition grad
* zeros for arg reduce
2024-09-23 10:03:38 -07:00
Luke Carlson
2b878e9dd7
Create CITATION.cff ( #1425 )
2024-09-20 11:39:46 -07:00
Awni Hannun
67b6bf530d
Optimization for general ND copies ( #1421 )
2024-09-17 17:59:51 -07:00
Nripesh Niketan
6af5ca35b2
feat: add cross_product ( #1252 )
...
* feat: add cross_product
* lint
* python binding
* refactor: Improve error message for cross_product function
* refactor: more close to numpy cross product
* refactor: improve error message for cross_product function
* finish
* fix acks
* allow old numpy
* doc
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-17 13:12:43 -07:00
Awni Hannun
4f46e9c997
More fixes for arrays with large sizes ( #1405 )
...
* compile works for big arrays when contiguous
* style
* nits in docs
* a bunch more stuff
* update jit
* update jit
* use constant for shapes and strides and remove elem_to_loc overload
* use kernel instantiation
* docs nits
* update binary and ternary
* comments
2024-09-17 12:46:31 -07:00
Awni Hannun
c6739ba7f3
Faster RNN layers ( #1419 )
...
* faster rnn
* use admm
2024-09-17 06:04:19 -07:00
Angelos Katharopoulos
914409fef9
Data parallel helper ( #1407 )
2024-09-16 18:17:21 -07:00
jjuang-apple
8d68a3e805
remove fmt dependencies from MLX install ( #1417 )
2024-09-16 13:32:28 -07:00
jjuang-apple
6bbcc453ef
avoid using find_library to make install truly portable ( #1416 )
2024-09-16 13:21:32 -07:00
Awni Hannun
d5ed4d7a71
override class function ( #1418 )
2024-09-16 13:21:04 -07:00
Nripesh Niketan
669c27140d
Chore: add pre-commit hook for cmake ( #1362 )
...
* reset and lint
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-16 12:53:01 -07:00
Max-Heinrich Laves
adcc88e208
Conv cpu improvements ( #1410 )
2024-09-15 18:45:10 -07:00
Awni Hannun
d6492b0163
fix clip ( #1415 )
2024-09-14 16:09:09 -07:00