Angelos Katharopoulos
|
3336a35512
|
Fix the segments type in the test
|
2025-07-07 17:25:19 -07:00 |
|
Angelos Katharopoulos
|
8ea5729ee4
|
CI weirdness due to large arrays
|
2025-07-07 00:18:42 -07:00 |
|
Angelos Katharopoulos
|
86dc1a2683
|
Add cuda skip layer
|
2025-07-05 12:43:36 -07:00 |
|
Angelos Katharopoulos
|
9e5bb5295a
|
Add more tests and fix qmm gradient
|
2025-07-05 03:02:49 -07:00 |
|
Angelos Katharopoulos
|
b28577289e
|
Disable the test for CUDA
|
2025-07-04 19:17:45 -07:00 |
|
Angelos Katharopoulos
|
2d0f452aae
|
Fix the test and cpu edge case
|
2025-07-04 18:36:20 -07:00 |
|
Angelos Katharopoulos
|
d96a33c776
|
Add rudimentary test for gather_mm with sorted indices
|
2025-07-03 14:02:33 -07:00 |
|
Angelos Katharopoulos
|
4babc035a3
|
Add a test for segmented_mm
|
2025-07-03 13:49:46 -07:00 |
|
Angelos Katharopoulos
|
6020ad6363
|
Start the segmented_mm op and CPU primitive
|
2025-07-02 15:45:09 -07:00 |
|
Awni Hannun
|
cfb6a244ea
|
allow parameters to be deleted (#2325)
|
2025-07-01 21:27:23 -07:00 |
|
Awni Hannun
|
dd4f53db63
|
use fp32 for testing, add more complex ops (#2322)
|
2025-07-01 07:30:00 -07:00 |
|
Awni Hannun
|
33bf1a244b
|
Fix module update in strict mode (#2321)
* fix module update in strict mode
* allow GELU to be pickled
|
2025-06-29 11:12:29 -07:00 |
|
Angelos Katharopoulos
|
772f471ff2
|
[CUDA] Fix reductions (#2314)
|
2025-06-27 12:59:20 -07:00 |
|
Angelos Katharopoulos
|
2c11d10f8d
|
Split broadcast so it is always fused in compile (#2318)
|
2025-06-26 22:08:18 -07:00 |
|
Awni Hannun
|
81bb9a2a9e
|
Compile float64 functions on CPU (#2311)
|
2025-06-24 10:18:52 -07:00 |
|
Angelos Katharopoulos
|
5adf185f86
|
Fix update_modules() when providing a subset (#2308)
|
2025-06-20 17:19:46 -07:00 |
|
Awni Hannun
|
76831ed83d
|
Build CUDA release in Circle (#2306)
* cuda release
* add license
|
2025-06-19 15:26:36 -07:00 |
|
Awni Hannun
|
cad5c0241c
|
[CUDA] synch properly waits for all tasks to finish and clear (#2303)
* cuda synch properly waits for all tasks to finish and clear
* fix copy
|
2025-06-17 12:03:25 -07:00 |
|
Awni Hannun
|
b8022c578a
|
divmod, partition, sort fixes (#2302)
|
2025-06-16 18:49:32 -07:00 |
|
Awni Hannun
|
bc53f8293f
|
Cuda bug fixes 2 (#2298)
* more bug fixes
* more bug fixes
* format
|
2025-06-16 13:14:46 -07:00 |
|
Awni Hannun
|
c552ff2451
|
[CUDA] Fix back-end bugs and enable corresponding tests (#2296)
* Fix some cuda back-end bugs and enable corresponding tests
* more fixes
* enable more tests
* format
|
2025-06-16 08:45:40 -07:00 |
|
Awni Hannun
|
4fda5fbdf9
|
add python testing for cuda with ability to skip list of tests (#2295)
|
2025-06-15 10:56:48 -07:00 |
|
Awni Hannun
|
8402a2acf4
|
Fix complex power and print (#2286)
* fix complex power and print
* fix complex matmul shape
|
2025-06-13 11:13:00 -07:00 |
|
Awni Hannun
|
c35f4d089a
|
start cuda circle config (#2256)
* rebase
* fix metal kernel linking issue on cuda
* start cuda circle config
|
2025-06-10 21:19:47 -07:00 |
|
Angelos Katharopoulos
|
8590c0941e
|
Add load_safe to the general conv loaders (#2258)
|
2025-06-10 20:58:16 -07:00 |
|
Awni Hannun
|
62fecf3e13
|
fix conv export (#2265)
|
2025-06-10 09:34:01 -07:00 |
|
Christopher Fleetwood
|
004c1d8ef2
|
Report number of missing parameters (#2264)
* chore: inform
* chore: format
---------
Co-authored-by: FL33TW00D <FL33TW00D@users.noreply.github.com>
|
2025-06-10 06:37:50 -07:00 |
|
Awni Hannun
|
9ce77798b1
|
fix export to work with gather/scatter axis (#2263)
|
2025-06-09 20:37:27 -07:00 |
|
Emmanuel Ferdman
|
5866b3857b
|
Refactor the lu test (#2250)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
|
2025-06-07 06:12:08 -07:00 |
|
Awni Hannun
|
1ca616844b
|
Fix unintuitive metal kernel caching (#2242)
* Fix unintuitive metal kernel caching
* alternative solution
|
2025-06-06 20:08:15 -07:00 |
|
Awni Hannun
|
a5ac9244c4
|
fix linux linking error (#2248)
|
2025-06-06 10:41:51 -07:00 |
|
Awni Hannun
|
c763fe1be0
|
default strict mode for module update and update_modules (#2239)
|
2025-06-05 15:27:02 -07:00 |
|
Suryash Malviya
|
0408ba0a76
|
Optimizing Complex Matrix Multiplication using Karatsuba’s Algorithm (#2220)
* Implementing Complex Matmul using Karatsuba Algorithm
* Implemented Karatsuba's Algorithm for complex matmul and pre-commit them
* fix
---------
Co-authored-by: Awni Hannun <awni@apple.com>
|
2025-06-02 15:58:46 -07:00 |
|
Awni Hannun
|
6ef2f67e7f
|
5bit quants (#2226)
* 5bit quants
* 5bit quants
|
2025-05-30 12:12:10 -07:00 |
|
Angelos Katharopoulos
|
0359bf02c9
|
Nearest upsample (#2202)
|
2025-05-19 11:23:38 -07:00 |
|
Awni Hannun
|
8576e6fe36
|
fix conv2d bug + faster conv 1d (#2195)
* fix conv2d bug + faster conv 1d
* revert sort + flaky test
|
2025-05-18 06:05:11 -07:00 |
|
Angelos Katharopoulos
|
0654543dcc
|
Add complex eigh (#2191)
|
2025-05-18 00:18:43 -07:00 |
|
Awni Hannun
|
602f43e3d1
|
fix conv grad (#2187)
|
2025-05-15 19:20:36 -07:00 |
|
Awni Hannun
|
a2cadb8218
|
real and imag properties (#2189)
|
2025-05-15 18:17:50 -07:00 |
|
Awni Hannun
|
c1eb9d05d9
|
non-symmetric eig and eigh (#2188)
|
2025-05-15 13:01:44 -07:00 |
|
Angelos Katharopoulos
|
cf6c939e86
|
Fix some complex vjps (#2178)
|
2025-05-14 23:37:12 -07:00 |
|
Angelos Katharopoulos
|
130df35e1b
|
Add random normal distribution for complex numbers (#2182)
|
2025-05-13 22:43:45 -07:00 |
|
Angelos Katharopoulos
|
3aa9cf3f9e
|
Fix put_along_axis for empty arrays (#2181)
|
2025-05-13 14:27:53 -07:00 |
|
Awni Hannun
|
8f3d208dce
|
Close a couple edge case bugs: hadamard and addmm on empty inputs (#2177)
* handle hadamard and addmm on empty inputs
* fix
|
2025-05-12 10:48:57 -07:00 |
|
Ivan Fioravanti
|
caaa3f1f8c
|
Small typos in mx.metal deprecations (#2176)
|
2025-05-11 06:03:47 -07:00 |
|
ATurker
|
a7fae8a176
|
fix: conv_general differences between gpu, cpu (#2070)
* fix general_conv padding
* fix bugs
* add test
---------
Co-authored-by: Awni Hannun <awni@apple.com>
|
2025-05-09 10:26:52 -07:00 |
|
Awni Hannun
|
af705590ac
|
fix batched vector sdpa (#2152)
|
2025-05-05 13:13:03 -07:00 |
|
Angelos Katharopoulos
|
481349495b
|
GPU Hadamard for large N (#1879)
|
2025-05-01 17:19:17 -07:00 |
|
Awni Hannun
|
9daa6b003f
|
fix shapeless export (#2148)
|
2025-05-01 15:02:02 -07:00 |
|
Angelos Katharopoulos
|
a3a632d567
|
Fix the launcher when ran locally (#2147)
|
2025-05-01 12:56:09 -07:00 |
|