Commit Graph

807 Commits

Author SHA1 Message Date
Awni Hannun
c3628eea49 Add mx.finfo and use it when making causal mask (#1726)
* finfo

* fixes

* docs
2024-12-19 14:52:41 -08:00
Awni Hannun
e03f0372b1 More shape type (#1705)
* more shape type

* fix
2024-12-19 08:08:20 -08:00
Alex Barron
f17536af9c More lenient mask type check in SDPA (#1723)
* check mask type

* require promotion
2024-12-18 19:41:38 -08:00
Awni Hannun
7480059306 track resource limit and throw if exceeded (#1718) 2024-12-18 18:45:58 -08:00
Awni Hannun
8bae22b0fa fix deletion of non-evaled arrays with siblings (#1714) 2024-12-18 18:45:36 -08:00
Alex Barron
49c34c4161 check mask type (#1721) 2024-12-18 14:25:18 -08:00
Awni Hannun
5548fcc96d fix synch race (#1719) 2024-12-18 12:25:16 -08:00
Cheng
070bd433ab Shorter kernel name for Windows (#1701)
* Shorter kernel name for Windows

* Only hash the clipped part
2024-12-17 18:51:38 -08:00
Cheng
c8fb54951a Define NOMINMAX before windows.h (#1715) 2024-12-17 18:51:24 -08:00
Awni Hannun
d03c01dfbc fix unflatten vjp (#1708) 2024-12-16 18:37:57 -08:00
Jesper Stemann Andersen
a82996e9fb io/load: Enabled pread implementation for mingw32 (#1706) 2024-12-16 07:20:45 -08:00
Cheng
4768c61b57 Make sure gguf_ctx is closed when error happens (#1699) 2024-12-13 19:50:19 -08:00
Cheng
635117c5d4 Read/write files in binary mode (#1698) 2024-12-13 17:37:05 -08:00
Awni Hannun
50f3535693 Use expand_dims / unflatten / etc in more places (#1696)
* use expand_dims / unflatten in a couple more places

* few more

* few more

* fix
2024-12-12 17:00:44 -08:00
Awni Hannun
9111999af3 Fix small sort with metal validation (#1695) 2024-12-12 09:21:45 -08:00
Awni Hannun
6bd28d246e Allow no copy negative strides in as_strided and slice (#1688)
* allow no copy negative strides in as_strided and slice

* fix jit

* fix jit
2024-12-12 08:59:45 -08:00
Cheng
4d595a2a39 Make compiled preamble work in MSVC (#1675)
* Make compiled preamble work in MSVC

* Remove logging

* Only use powershell for MSVC
2024-12-12 08:55:49 -08:00
Awni Hannun
4e1e9520e1 Flatten and unflatten (#1692)
* flatten and unflatten

* fix grad

* fix shape infer

* use squeeze + unsqueeze in get_item
2024-12-11 21:51:37 -08:00
Awni Hannun
f3dfa36a3a Fix x86 tests (#1691)
* fix x86 tests

* comment
2024-12-11 07:47:18 -08:00
Awni Hannun
f76a49e555 ExpandDims primitive (#1687)
* add squeeze primitive

* simplify squeeze, use in gather

* fix

* fix

* fix

* fix

* fix no cpu

* use squeeze in matmul and friends

* expand dims primitive

* comment
2024-12-10 16:39:07 -08:00
Cheng
56db268f47 Provide a pread implementation for MSVC (#1666) 2024-12-10 15:55:53 -08:00
Cheng
92ab6bdeb8 Fix shared library not exporting symbols on Windows (#1684)
* Fix shared library not exporting symbols on Windows

* Function name style
2024-12-10 13:59:14 -08:00
Cheng
0070e360a1 Disable MSVC warnings (#1680) 2024-12-09 19:41:14 -08:00
Awni Hannun
40c62c1321 Use int64 stride everywhere (#1671)
* use int64 stride everywhere

* fix ext

* fix ext

* more shape + cleanup

* one more

* few more
2024-12-09 11:09:02 -08:00
Cheng
d0f471cff7 Using math defines requires switch in MSVC (#1665)
* Using math defines requires switch in MSVC

* Fix more math macros

* Fix type

* Remove _MSC_VER guard for math defines
2024-12-08 08:16:28 -08:00
Cheng
6f316b8bf5 Use int64_t instead of ssize_t (#1673) 2024-12-07 20:10:44 -08:00
Cheng
d92ea094f1 Use && instead of and (#1663)
* Use && instead of and

* Remove "and" in ops.cpp
2024-12-07 18:26:39 -08:00
Cheng
6ae5423b4a Do not pass integers to isnan (#1664) 2024-12-07 18:26:23 -08:00
Cheng
9635cffdc8 Include io.h in MSVC for IO functions (#1661) 2024-12-07 18:26:06 -08:00
Cheng
96986fb362 Use auto* for pointers (#1662) 2024-12-07 18:25:40 -08:00
Cheng
3ceb341a75 Use correct complex type for MSVC (#1660) 2024-12-07 18:25:22 -08:00
Awni Hannun
69a2991614 allow compiling lambdas in C++ (#1650)
* allow compiling lambdas in C++

* fix test

* more tests

* auto detect capture-less lambda
2024-12-06 13:13:21 -08:00
Awni Hannun
d0b6cb0425 More primitives for compiling with shapeless (#1653)
* more shapeless and more Shape

* more shape

* fix

* fix
2024-12-06 11:29:18 -08:00
Alex Barron
95c4a2e3af add back conditionaltype (#1655) 2024-12-06 11:12:01 -08:00
Awni Hannun
fc88fd9097 Shape and Strides 1 / N (#1645)
* shape and stride type def

* more shape
2024-12-05 12:53:43 -08:00
Awni Hannun
c5b0928c1f fix fallback (#1646) 2024-12-05 11:59:53 -08:00
Awni Hannun
e047fd977d compile changes if stream changes (#1644) 2024-12-03 14:37:44 -08:00
Jagrit Digani
9d40e521d7 Stop matrix copies with new attention kernel (#1639) 2024-12-02 14:12:38 -08:00
Jesper Stemann Andersen
e4eeb4e910 Added missing unordered_map includes (#1635)
* Added missing includes in mlx/io.h and mlx/backend/metal/metal.h

* Added additional missing unordered_map includes that fixes build on FreeBSD
2024-12-02 07:03:03 -08:00
Ikko Eltociear Ashimine
9bc2183a31 docs: update device.cpp (#1632)
unecessary -> unnecessary
2024-11-27 20:58:26 -08:00
Awni Hannun
d4b222b6d3 Fix some leaks and races (#1629)
* fix leak and fix potential race

* more leak fixes

* fix one more
2024-11-27 20:01:20 -08:00
Jesper Stemann Andersen
af2af818a6 Enables build for *-linux-musl (#1627)
Also contributes to being able to build for *-w64-mingw32.

Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761
2024-11-27 13:14:24 -08:00
Awni Hannun
211411faf2 fix large ops (#1620) 2024-11-24 09:17:10 -08:00
Alex Barron
6f7986d592 Cleaner qmv/qvm (#1616) 2024-11-22 11:14:08 -08:00
Jagrit Digani
02bec0bb6d Matrix Attention kernel (#1610)
* Rough INIT

* [WIP]: Loading and Matmuls added

* [WIP]: Reductions and min working aligned kernel at headdim = 64

* [WIP] Added headdim 80 for testing

* [WIP] Update dispatch params for testing

* [WIP] Add support for unaligned seq lengths - still looks messy

* Update sdpa_benchmarks

* Update sdpa_benchmarks

* Update sdpa_benchmarks

* Enable gqa support

* Update benchmark and switch off 128 headdim

* Update headdim 128 tuning

* Remove older fast attention code. Write out O strided

* Disable hd=128 until further optimizations

* Enable bf16

* Fix data size bug

* Enable attn build outside of jit
2024-11-22 10:34:05 -08:00
Alex Barron
c79f6a4a8c 3 and 6 bit quantization (#1613)
* Support 3 and 6 bit quantization
2024-11-22 10:22:13 -08:00
Awni Hannun
0c5eea226b Reduce specializations (#1607)
* start of reduce specializations

* fix all reduce

* fix many dims

* fix

* non-jit tests clear

* cleanup instantiations

* cpu merges

* change dim specializations

* optimize

* fix jit

* fix jit

* use higher precision for integer sum+prod

* fixes
2024-11-21 19:53:00 -08:00
Awni Hannun
dcca0d7477 contiguous op / prim (#1612) 2024-11-21 19:51:49 -08:00
Awni Hannun
61d787726a Fix view scalar bug segfault (#1603)
* fix view scalar bug

* fix view scalar bug

* one more fix
2024-11-19 10:54:05 -08:00
Angelos Katharopoulos
5e89aace9b Fix concatenate vmap (#1600) 2024-11-19 10:44:04 -08:00