Cheng
79071bfba4
Fix out-of-bounds default value in logsumexp/softmax ( #2213 )
2025-05-21 07:25:16 -07:00
Angelos Katharopoulos
cf6c939e86
Fix some complex vjps ( #2178 )
2025-05-14 23:37:12 -07:00
Cheng
0cae0bdac8
CUDA backend: backbone ( #2075 )
2025-05-06 21:26:46 -07:00
Awni Hannun
9c5e7da507
fix compile merging ( #2150 )
2025-05-02 15:08:50 -07:00
Cheng
ea890d8710
Remove metal-only tests ( #2139 )
2025-04-30 09:08:39 -07:00
Aashiq Dheeraj
bb6565ef14
add fftshift and ifftshift fft helpers ( #2135 )
...
* add fftshift and ifftshift fft helpers
* address comments
* axes have to be iterable
* fix fp error in roll + add test
---------
Co-authored-by: Aashiq Dheeraj <aashiq@aashiq-mbp-m4.local>
2025-04-29 22:13:45 -07:00
Param Thakkar
600e87e03c
Added output_padding parameters in conv_transpose ( #2092 )
2025-04-23 09:26:33 -07:00
Awni Hannun
dc4eada7f0
Use unordered map for kwargs in export/import ( #2087 )
...
* use unordered map for kwargs in export/import
* comment
2025-04-21 07:17:22 -07:00
Param Thakkar
5f04c0f818
Fixed shift operations issue ( #2080 )
...
* Fixed shift operations issue
* Added tests and fixes
* Fixed loop syntax error
* Added tests for bool
* Fixed typo
2025-04-18 14:28:33 -07:00
Cheng
ba09f01ce8
Remove test of converting negative float to uint ( #2048 )
2025-04-06 06:21:46 -07:00
Jesper Stemann Andersen
5f5770e3a2
Fix CPU sign for unsigned ints ( #2024 )
...
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2025-03-30 17:56:59 -07:00
Awni Hannun
5580b47291
iinfo and scalar overflow detection ( #2009 )
2025-03-27 19:54:56 -07:00
Awni Hannun
a6b5d6e759
revise cmake minimum for doctest ( #2014 )
2025-03-27 19:30:58 -07:00
Awni Hannun
4e1994e9d7
move memory APIs into top level mlx.core ( #1982 )
2025-03-21 07:25:12 -07:00
Awni Hannun
c4230747a1
redesign for faster cpu/gpu synch ( #1869 )
...
* redesign for faster cpu/gpu synch
* load + more async CPU
* use command encoder API and move more ops to use it
* make fence back-end generic + CPU only fence
* faster build
* fix async eval
* fixes + handle temporaries
* fix / improve cpu conv
* remove unused status, fix siblings
* fix extensions
* fix
* fix no cpu build
* format
* comments
* fix perf regression, remove unnecessary abort
* fix events, task limit cpu
* fix waiting
* fix donation / temporaries in normalization
2025-03-06 19:23:38 -08:00
Abe Leininger
3835a428c5
Adds nuclear norm support ( #1894 )
...
* adjust norm unit test tolerance
2025-03-04 13:26:02 -08:00
Abe Leininger
a5ededf1c3
CPU LU factorization and linear solvers ( #1451 )
...
* linalg solve backend
* nits
* more nits + fix
* luf primitive and lu, solve, and solve_triangular backends
* changes / nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-02-10 12:32:24 -08:00
Jesper Stemann Andersen
f6c0499b8d
Resolved ambiguity in mlx::core::take_along_axis ( #1822 )
...
* Resolved ambiguity in mlx::core::take_along_axis
Detected by GCC 10 on riscv64-linux-gnu.
* Formatted
* Removed superfluous parentheses in random_tests.cpp
2025-02-04 06:06:17 -08:00
Jesper Stemann Andersen
2d8e667400
MinGW support ( #1806 )
...
* Changed /bin/bash to bash for generating compiling preamble
* Fix wrt jit_compiler mingw like msvc wrt. WEXITSTATUS
* Solved ambiguity wrt. bernoulli test shape
* Disabled distributed/ring on Windows
* Fixed jit_compiler command wrt. MinGW
* Extended jit_compiler patch wrt. WEXITSTATUS to FreeBSD
2025-02-01 12:40:06 -08:00
Awni Hannun
2235dee906
catch stream errors earlier to avoid aborts ( #1801 )
2025-01-27 14:05:43 -08:00
Awni Hannun
da8c885784
Simplify removes no-ops from the tape ( #1759 )
...
* simplify removes no-ops from the tape
* comment
2025-01-09 11:23:19 -08:00
Awni Hannun
516ded618b
Dynamic slicing ( #1741 )
...
* dynamic slice and slice update
* python bindings + tests + fix set item
* fix compile issue
* comment
* fix jit
2025-01-07 14:02:16 -08:00
Awni Hannun
ae69cb15e9
shapeless compile in docs and partially shapeless reshape ( #1742 )
2025-01-02 16:24:42 -08:00
Cheng
8ecdfb718b
Fix export.cpp compilation with MSVC ( #1737 )
2024-12-29 06:56:30 -08:00
Awni Hannun
4ba0c24a8f
Export / import functions to / from a file ( #1642 )
...
* export and import functions
* refactor + works for few primitives
* nit
* allow primitives with state
* nit
* nit
* simplify serialize / deserialize
* fix for constants
* python bindings
* maybe fix serialize failure case
* add example
* more primitives, training kind of works
* same result for python and c++
* some fixes
* fix export
* template it up
* some simplification
* rebase
* allow kwargs and multiple functions
* exporter
* more primitives for exporting
* deal with endianness
* handle invalid stream
* add docstring
2024-12-24 11:19:13 -08:00
Awni Hannun
c3628eea49
Add mx.finfo and use it when making causal mask ( #1726 )
...
* finfo
* fixes
* docs
2024-12-19 14:52:41 -08:00
Awni Hannun
e03f0372b1
More shape type ( #1705 )
...
* more shape type
* fix
2024-12-19 08:08:20 -08:00
Awni Hannun
4e1e9520e1
Flatten and unflatten ( #1692 )
...
* flatten and unflatten
* fix grad
* fix shape infer
* use squeeze + unsqueeze in get_item
2024-12-11 21:51:37 -08:00
Awni Hannun
f3dfa36a3a
Fix x86 tests ( #1691 )
...
* fix x86 tests
* comment
2024-12-11 07:47:18 -08:00
Awni Hannun
f76a49e555
ExpandDims primitive ( #1687 )
...
* add squeeze primitive
* simplify squeeze, use in gather
* fix
* fix
* fix
* fix
* fix no cpu
* use squeeze in matmul and friends
* expand dims primitive
* comment
2024-12-10 16:39:07 -08:00
Awni Hannun
40c62c1321
Use int64 stride everywhere ( #1671 )
...
* use int64 stride everywhere
* fix ext
* fix ext
* more shape + cleanup
* one more
* few more
2024-12-09 11:09:02 -08:00
Cheng
d0f471cff7
Using math defines requires switch in MSVC ( #1665 )
...
* Using math defines requires switch in MSVC
* Fix more math macros
* Fix type
* Remove _MSC_VER guard for math defines
2024-12-08 08:16:28 -08:00
Cheng
6f316b8bf5
Use int64_t instead of ssize_t ( #1673 )
2024-12-07 20:10:44 -08:00
Cheng
7c10c93a1f
Convert filesystem path to std::string explicitly ( #1672 )
2024-12-07 20:10:06 -08:00
Awni Hannun
69a2991614
allow compiling lambdas in C++ ( #1650 )
...
* allow compiling lambdas in C++
* fix test
* more tests
* auto detect capture-less lambda
2024-12-06 13:13:21 -08:00
Nripesh Niketan
3bb5b4a302
Chore: Add default language in pre-commit and bump hooks ( #1652 )
2024-12-06 07:54:29 -08:00
Awni Hannun
e047fd977d
compile changes if stream changes ( #1644 )
2024-12-03 14:37:44 -08:00
Awni Hannun
dcca0d7477
contiguous op / prim ( #1612 )
2024-11-21 19:51:49 -08:00
Cocoa
0d5e7716ad
fix typo: accross -> across ( #1609 )
...
Signed-off-by: Cocoa <i@uwucocoa.moe>
2024-11-20 15:30:51 -08:00
Alex Barron
048fabdabd
Fix vmap constant output size ( #1524 )
...
* use inputs to determine output size
* remove noop vmap tests
2024-10-30 16:16:53 -07:00
Kashif Rasul
3ddc07e936
Eigenvalues and eigenvectors ( #1334 )
...
* initial eigvalsh
* add compute_vectors
* add compute_vectors_
* return a pair
* add eigh to return only eigenvectors
* fixed typo
* merge Eighvalsh and Eigh into a single primitive
* use the same primitive with the flag
* fix primitives
* use MULTI
* fix eval_gpu
* fix declaration
* rename EighPrimitive to Eigh
* tests
* tests
* fix rebase and format
* cleanup lapack
* format
* add cblas.h
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-10-22 12:18:48 -07:00
Angelos Katharopoulos
9b12093739
Add the roll op ( #1455 )
2024-10-07 17:21:42 -07:00
Awni Hannun
95d04805b3
Fix complex power on Metal ( #1460 )
2024-10-06 19:58:30 -07:00
Awni Hannun
195b429d99
Put along axis + fixes for partition grad ( #1430 )
...
* put along axis, fixes for partition grad
* zeros for arg reduce
2024-09-23 10:03:38 -07:00
Nripesh Niketan
6af5ca35b2
feat: add cross_product ( #1252 )
...
* feat: add cross_product
* lint
* python binding
* refactor: Improve error message for cross_product function
* refactor: more close to numpy cross product
* refactor: improve error message for cross_product function
* finish
* fix acks
* allow old numpy
* doc
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-17 13:12:43 -07:00
Nripesh Niketan
669c27140d
Chore: add pre-commit hook for cmake ( #1362 )
...
* reset and lint
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-16 12:53:01 -07:00
Awni Hannun
e7e59c6f05
Fix copying scalars by adding fill_gpu ( #1402 )
...
* fix copying scalars by adding fill_gpu
* Another copy scalar changed to fill
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-09-09 15:54:08 -07:00
Awni Hannun
7cca1727af
Fix slice data size ( #1394 )
...
* fix slice data size and add tests
* fix contiguous flag
* simplify stride and perform copy for non-contiguous arrays
* fix cpu
* comment
2024-09-04 19:10:43 -07:00
Jeethu Rao
bd47e1f066
Fix neon_fast_exp and add more softmax tests ( #1367 )
2024-08-27 23:42:42 -07:00
Aditya Dhulipala
e6b223df5f
Pinv ( #875 )
2024-08-27 23:06:12 -07:00