Commit Graph

389 Commits

Author SHA1 Message Date
Nripesh Niketan
5cc5201914
feat: Add orthogonal initializer and corresponding tests (#1651)
* feat: Add orthogonal initializer and corresponding tests

* lint

* Add acknowledgements

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-13 07:29:20 -08:00
Awni Hannun
657f466402
use sdpa and exportable functions in transformer multi head attention (#1760) 2025-01-09 13:11:55 -08:00
Alex Barron
c7b0300af5
Fix batched qmv bug (#1758) 2025-01-09 11:45:57 -08:00
Awni Hannun
1ccaf80575
Dynamic broadcasting for shapeless compile/export (#1722)
* working towards dynamic broadcast

* shapeless broadcast

* fix build + nits

* use broadcast arrays in quantize matmul

* some cleanup / consistency

* mend

* some comments

* add vjp, jvp for broadcast axes
2025-01-09 11:04:24 -08:00
Awni Hannun
d1766f2c70
Add boolean mask support in vector SDPA (#1757) 2025-01-07 20:24:53 -08:00
Awni Hannun
516ded618b
Dynamic slicing (#1741)
* dynamic slice and slice update

* python bindings + tests + fix set item

* fix compile issue

* comment

* fix jit
2025-01-07 14:02:16 -08:00
Awni Hannun
d5ec172c95
Allow boolean mask in sdpa (#1753)
* allow boolean mask in sdpa

* more permissive donation in ternary
2025-01-06 16:57:07 -08:00
Awni Hannun
058d6ce683
mpi send use input as output (#1750)
* mpi send use input as output

* move earlier
2025-01-06 06:08:43 -08:00
Awni Hannun
259025100e
Fix nd ternary on GPU (#1746) 2025-01-03 11:52:17 -08:00
Awni Hannun
6fa0501387
Fix concatenate/slice_update vjp + reduce binary size (#1735)
* fix concatenate vjp + reduce binary size

* also cast in slice update
2025-01-02 16:36:33 -08:00
Awni Hannun
ae69cb15e9
shapeless compile in docs and partially shapeless reshape (#1742) 2025-01-02 16:24:42 -08:00
Venkata Naga Aditya Datta Chivukula
491fa95b1f
Added Kronecker Product (#1728) 2025-01-02 16:00:34 -08:00
Awni Hannun
4ba0c24a8f
Export / import functions to / from a file (#1642)
* export and import functions

* refactor + works for few primitives

* nit

* allow primitives with state

* nit

* nit

* simplify serialize / deserialize

* fix for constants

* python bindings

* maybe fix serialize failure case

* add example

* more primitives, training kind of works

* same result for python and c++

* some fixes

* fix export

* template it up

* some simplificatoin

* rebase

* allow kwargs and multiple functions

* exporter

* more primitives for exporting

* deal with endianness

* handle invalid stream

* add docstring
2024-12-24 11:19:13 -08:00
Awni Hannun
ebfe64b92d
shapeless slice update and broadcast when possible (#1727) 2024-12-23 11:25:15 -08:00
Awni Hannun
0308e9af71
Allow offset to be an mx.array for mx.fast.rope (#1724)
* allow offset for rope

* comment
2024-12-19 15:51:44 -08:00
Awni Hannun
c3628eea49
Add mx.finfo and use it when making causal mask (#1726)
* finfo

* fixes

* docs
2024-12-19 14:52:41 -08:00
Awni Hannun
d03c01dfbc
fix unflatten vjp (#1708) 2024-12-16 18:37:57 -08:00
Cheng
af5a614aad
Eval before cleanup so model file is unlocked (#1702) 2024-12-14 21:41:49 -08:00
Cheng
dfccd17ab9
Use psutil to get memory info on Windows (#1700) 2024-12-13 19:50:13 -08:00
Awni Hannun
9111999af3
Fix small sort with metal validation (#1695) 2024-12-12 09:21:45 -08:00
Awni Hannun
6bd28d246e
Allow no copy negative strides in as_strided and slice (#1688)
* allow no copy negative strides in as_strided and slice

* fix jit

* fix jit
2024-12-12 08:59:45 -08:00
Awni Hannun
4e1e9520e1
Flatten and unflatten (#1692)
* flatten and unflatten

* fix grad

* fix shape infer

* use squeeze + unsqueeze in get_item
2024-12-11 21:51:37 -08:00
Awni Hannun
f76a49e555
ExpandDims primitive (#1687)
* add squeeze primitive

* simplify squeeze, use in gather

* fix

* fix

* fix

* fix

* fix no cpu

* use squeeze in matmul and friends

* expand dims primitive

* comment
2024-12-10 16:39:07 -08:00
Awni Hannun
29a620cab2
No reshapes in quantized embedding (#1682)
* no reshapes in quantized embedding

* fix inadvertant cast

* add tol
2024-12-09 18:57:38 -08:00
Awni Hannun
35b412c099
Fix compile hasher for string constants. (#1677)
* fix hash

* add test

* nit
2024-12-09 09:26:18 -08:00
mt_caret
fd3377dd1f
Support bias correction in Adam and AdamW optimizers (#1640) 2024-12-06 12:13:34 -08:00
Awni Hannun
bc2a29f033
fix (#1654) 2024-12-06 10:48:58 -08:00
Alex Barron
c79f6a4a8c
3 and 6 bit quantization (#1613)
* Support 3 and 6 bit quantization
2024-11-22 10:22:13 -08:00
Awni Hannun
0c5eea226b
Reduce specializations (#1607)
* start of reduce specializations

* fix all reduce

* fix many dims

* fix

* non-jit tests clear

* cleanup instantiations

* cpu merges

* change dim specializations

* optimize

* fix jit

* fix jit

* use higher precision for integer sum+prod

* fixes
2024-11-21 19:53:00 -08:00
Angelos Katharopoulos
d8c824c594
Formatting fixes (#1606) 2024-11-20 15:30:36 -08:00
Saanidhya
cb431dfc9f
Adds 3D pooling (#1526) 2024-11-19 16:45:24 -08:00
Awni Hannun
61d787726a
Fix view scalar bug segfault (#1603)
* fix view scalar bug

* fix view scalar bug

* one more fix
2024-11-19 10:54:05 -08:00
Angelos Katharopoulos
5e89aace9b
Fix concatenate vmap (#1600) 2024-11-19 10:44:04 -08:00
Awni Hannun
bf481e8e5d
Fix sibling leak (#1590)
* add test

* fix + test

* fix fix
2024-11-18 19:17:01 -08:00
Awni Hannun
9bd03dd9b4
More buffer donation with no-ops (#1591)
* more donation

* fix test

* fix build
2024-11-18 08:35:41 -08:00
Awni Hannun
8c34c9dac4
throw for invalid case and remove test (#1575) 2024-11-08 12:04:03 -08:00
Awni Hannun
91c0277356
fix per-example mask + docs in sdpa (#1574) 2024-11-08 11:51:15 -08:00
Awni Hannun
59247c2b62
add groups in conv2d (#1569) 2024-11-07 13:57:53 -08:00
Awni Hannun
54f05e7195
Fix gather vmap (#1563)
* fix gather

* fix
2024-11-05 11:29:20 -08:00
Alex Barron
26be608470
Add split_k qvm for long context (#1564)
* Add splitk qvm

* configurable splitk

* tuning

* remove extra instantiation

* remove refactor

* separate test

* cpu tolerance
2024-11-05 11:25:19 -08:00
Awni Hannun
76f275b4df
error in rms for wrong size (#1562) 2024-11-04 13:24:02 -08:00
Angelos Katharopoulos
62f297b51d
Sdpa fix (#1558) 2024-11-02 21:25:46 -07:00
Awni Hannun
57c6aa7188
fix multi output leak (#1548) 2024-10-31 09:32:01 -07:00
Awni Hannun
4f72c66911
improvements to scatter / gather (#1541) 2024-10-30 19:30:54 -07:00
Alex Barron
048fabdabd
Fix vmap constant output size (#1524)
* use inputs to determine output size

* remove noop vmap tests
2024-10-30 16:16:53 -07:00
Awni Hannun
d2ff04a4f2
fix format (#1539) 2024-10-28 18:29:14 -07:00
Awni Hannun
0eb56d5be0
Wired (#1510)
* expose residency sets as wire/unwire

* returns wired size

* fix

* runtime support check

* fix os check

* fix test

* fix no metal build

* docs

* nit

* nits in docs

* nits
2024-10-25 09:35:33 -07:00
Venkata Naga Aditya Datta Chivukula
430ffef58a
[Feature] Added Sparse Initialization (#1498)
Co-authored-by: Saanidhyavats <saanidhyavats@gmail.com>
2024-10-24 12:31:24 -07:00
Alex Barron
3d17077187
Add mx.array.__format__ (#1521)
* add __format__

* actually test something

* fix
2024-10-24 11:11:39 -07:00
Angelos Katharopoulos
c9b41d460f
Working 64-bit scans (#1506) 2024-10-24 11:05:46 -07:00