Awni Hannun | 29a620cab2 | No reshapes in quantized embedding (#1682) | 2024-12-09 18:57:38 -08:00
* no reshapes in quantized embedding
* fix inadvertant cast
* add tol

Cheng | 87d7a2520e | Use Py_ssize_t in python bindings (#1678) | 2024-12-09 12:59:19 -08:00
* Use Py_ssize_t in python bindings
* Args passed to std::max must be same type
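
The std::max note refers to template argument deduction: both arguments must deduce to the same type, so mixing Py_ssize_t with a plain int literal does not compile. A minimal sketch of the pattern (illustrative only, not the code from this PR):

    #include <algorithm>
    #include <cstddef>

    // Stand-in for CPython's Py_ssize_t (a signed size type).
    using Py_ssize_t = std::ptrdiff_t;

    Py_ssize_t clamp_index(Py_ssize_t idx) {
      // std::max(idx, 0) fails to compile: the arguments deduce to different
      // types (Py_ssize_t vs. int). Naming the template argument fixes it.
      return std::max<Py_ssize_t>(idx, 0);
    }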

Awni Hannun | 40c62c1321 | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00
* use int64 stride everywhere
* fix ext
* fix ext
* more shape + cleanup
* one more
* few more

Awni Hannun | 35b412c099 | Fix compile hasher for string constants. (#1677) | 2024-12-09 09:26:18 -08:00
* fix hash
* add test
* nit

Cheng | d0f471cff7 | Using math defines requires switch in MSVC (#1665) | 2024-12-08 08:16:28 -08:00
* Using math defines requires switch in MSVC
* Fix more math macros
* Fix type
* Remove _MSC_VER guard for math defines
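
The "switch" here is the _USE_MATH_DEFINES macro: MSVC's <cmath> only exposes constants such as M_PI when it is defined before the header is included (or passed as /D_USE_MATH_DEFINES). A minimal illustration, not the code from this PR:

    // Must appear before the first include of <cmath> on MSVC; the
    // equivalent compiler switch is /D_USE_MATH_DEFINES.
    #define _USE_MATH_DEFINES
    #include <cmath>

    double circle_area(double r) {
      return M_PI * r * r;  // M_PI is only visible with the define above
    }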

Cheng | 6f316b8bf5 | Use int64_t instead of ssize_t (#1673) | 2024-12-07 20:10:44 -08:00

Cheng | 7c10c93a1f | Convert filesystem path to std::string explicitly (#1672) | 2024-12-07 20:10:06 -08:00
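
On Windows, std::filesystem::path stores wchar_t natively and does not convert implicitly to std::string, so the conversion has to be spelled out via .string(). A minimal sketch of the pattern, not the code from this PR:

    #include <filesystem>
    #include <fstream>
    #include <string>

    void open_file(const std::filesystem::path& p) {
      // On MSVC, path::value_type is wchar_t, so `std::string s = p;` does
      // not compile; .string() performs the conversion explicitly.
      std::string name = p.string();
      std::ifstream in(name);
    }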

Cheng | d92ea094f1 | Use && instead of and (#1663) | 2024-12-07 18:26:39 -08:00
* Use && instead of and
* Remove "and" in ops.cpp

Cheng | 6ae5423b4a | Do not pass integers to isnan (#1664) | 2024-12-07 18:26:23 -08:00

Cheng | 9635cffdc8 | Include io.h in MSVC for IO functions (#1661) | 2024-12-07 18:26:06 -08:00

Cheng | 96986fb362 | Use auto* for pointers (#1662) | 2024-12-07 18:25:40 -08:00

Cheng | 3ceb341a75 | Use correct complex type for MSVC (#1660) | 2024-12-07 18:25:22 -08:00

Awni Hannun | 50fa705125 | patch bump (#1656) | 2024-12-06 13:16:19 -08:00

Awni Hannun | 69a2991614 | allow compiling lambdas in C++ (#1650) | 2024-12-06 13:13:21 -08:00
* allow compiling lambdas in C++
* fix test
* more tests
* auto detect capture-less lambda
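
As a rough sketch of what #1650 enables, assuming the mx::compile entry point of the MLX C++ API (see the PR for the actual interface), a lambda can now be handed to compile directly:

    #include <vector>

    #include "mlx/mlx.h"

    namespace mx = mlx::core;

    int main() {
      // A capture-less lambda passed straight to compile (the auto-detection
      // mentioned in the last bullet); previously a std::function was needed.
      auto fn = mx::compile([](const std::vector<mx::array>& inputs) {
        return std::vector<mx::array>{mx::add(mx::exp(inputs[0]), inputs[1])};
      });
      auto out = fn({mx::array(1.0f), mx::array(2.0f)});
      mx::eval(out);
      return 0;
    }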

mt_caret | fd3377dd1f | Support bias correction in Adam and AdamW optimizers (#1640) | 2024-12-06 12:13:34 -08:00
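
For reference, bias correction rescales Adam's moment estimates to undo their initialization at zero; the standard update from the Adam paper (not a quote of this PR's code) is:

    \hat{m}_t = m_t / (1 - \beta_1^t)
    \hat{v}_t = v_t / (1 - \beta_2^t)
    \theta_{t+1} = \theta_t - \alpha \, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)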

Awni Hannun | d0b6cb0425 | More primitives for compiling with shapeless (#1653) | 2024-12-06 11:29:18 -08:00
* more shapeless and more Shape
* more shape
* fix
* fix

Alex Barron | 95c4a2e3af | add back conditionaltype (#1655) | 2024-12-06 11:12:01 -08:00

Awni Hannun | bc2a29f033 | fix (#1654) | 2024-12-06 10:48:58 -08:00

Nripesh Niketan | 3bb5b4a302 | Chore: Add default language in pre-commit and bump hooks (#1652) | 2024-12-06 07:54:29 -08:00

Awni Hannun | fc88fd9097 | Shape and Strides 1 / N (#1645) | 2024-12-05 12:53:43 -08:00
* shape and stride type def
* more shape

Awni Hannun | c5b0928c1f | fix fallback (#1646) | 2024-12-05 11:59:53 -08:00

Awni Hannun | e047fd977d | compile changes if stream changes (#1644) | 2024-12-03 14:37:44 -08:00

Jagrit Digani | 9d40e521d7 | Stop matrix copies with new attention kernel (#1639) | 2024-12-02 14:12:38 -08:00

Alex Barron | 1445dcaa60 | let class predicate specify quantization parameters (#1638) | 2024-12-02 14:09:28 -08:00

Jesper Stemann Andersen | e4eeb4e910 | Added missing unordered_map includes (#1635) | 2024-12-02 07:03:03 -08:00
* Added missing includes in mlx/io.h and mlx/backend/metal/metal.h
* Added additional missing unordered_map includes that fixes build on FreeBSD

Awni Hannun | aa86876813 | fix transformer decoder post norm LN (#1637) | 2024-12-02 07:02:17 -08:00

Jesper Stemann Andersen | 974bb54ab2 | CMake: Enabled using Accelerate on x86_64 / x64 (#1625) | 2024-11-28 10:55:45 -08:00
* CMake: Enabled using Accelerate on x86_64 / x64
  Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761
* CMake: Removed superfluous MLX_BUILD_ARM

Ikko Eltociear Ashimine | 9bc2183a31 | docs: update device.cpp (#1632) | 2024-11-27 20:58:26 -08:00
unecessary -> unnecessary

Awni Hannun | d4b222b6d3 | Fix some leaks and races (#1629) | 2024-11-27 20:01:20 -08:00
* fix leak and fix potential race
* more leak fixes
* fix one more

Jesper Stemann Andersen | af2af818a6 | Enables build for *-linux-musl (#1627) | 2024-11-27 13:14:24 -08:00
Also contributes to being able to build for *-w64-mingw32.
Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761

Jesper Stemann Andersen | 698e63a608 | CMake: Build with dlfcn-win32 to have dlopen etc. on win32 (#1628) | 2024-11-27 13:14:13 -08:00
Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/9761

Awni Hannun | 211411faf2 | fix large ops (#1620) | 2024-11-24 09:17:10 -08:00

Awni Hannun | bb303c45a5 | version (#1617) | 2024-11-22 12:00:03 -08:00

Alex Barron | 6f7986d592 | Cleaner qmv /qvm (#1616) | 2024-11-22 11:14:08 -08:00

Awni Hannun | 7cbb4aef17 | Doc fix (#1615) | 2024-11-22 11:12:25 -08:00

Jagrit Digani | 02bec0bb6d | Matrix Attention kernel (#1610) | 2024-11-22 10:34:05 -08:00
* Rough INIT
* [WIP]: Loading and Matmuls added
* [WIP]: Reductions and min working aligned kernel at headdim = 64
* [WIP] Added headdim 80 for testing
* [WIP] Update dispatch params for testing
* [WIP] Add support for unaligned seq lengths - still looks messy
* Update sdpa_benchmarks
* Update sdpa_benchmarks
* Update sdpa_benchmarks
* Enable gqa support
* Update benchmark and switch off 128 headdim
* Update headdim 128 tuning
* Remove older fast attention code. Write out O strided
* Disable hd=128 until further optimizations
* Enable bf16
* Fix data size bug
* Enable attn build outside of jit

Alex Barron | c79f6a4a8c | 3 and 6 bit quantization (#1613) | 2024-11-22 10:22:13 -08:00
* Support 3 and 6 bit quantization
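
For a sense of scale, assuming the affine scheme with one scale and one bias per group of 64 weights, each stored in 16 bits (an assumption about the storage layout, not taken from this PR), the effective cost works out to:

    bits per weight = bits + (16 + 16) / group_size
    3-bit, group 64: 3 + 32/64 = 3.5 bits  (~4.6x smaller than fp16)
    6-bit, group 64: 6 + 32/64 = 6.5 bits  (~2.5x smaller than fp16)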

Awni Hannun | 0c5eea226b | Reduce specializations (#1607) | 2024-11-21 19:53:00 -08:00
* start of reduce specializations
* fix all reduce
* fix many dims
* fix
* non-jit tests clear
* cleanup instantiations
* cpu merges
* change dim specializations
* optimize
* fix jit
* fix jit
* use higher precision for integer sum+prod
* fixes

Awni Hannun | dcca0d7477 | contiguous op / prim (#1612) | 2024-11-21 19:51:49 -08:00

Cocoa | 0d5e7716ad | fix typo: accross -> across (#1609) | 2024-11-20 15:30:51 -08:00
Signed-off-by: Cocoa <i@uwucocoa.moe>

Angelos Katharopoulos | d8c824c594 | Formatting fixes (#1606) | 2024-11-20 15:30:36 -08:00

Saanidhya | cb431dfc9f | Adds 3D pooling (#1526) | 2024-11-19 16:45:24 -08:00

Awni Hannun | 61d787726a | Fix view scalar bug segfault (#1603) | 2024-11-19 10:54:05 -08:00
* fix view scalar bug
* fix view scalar bug
* one more fix

Angelos Katharopoulos | 5e89aace9b | Fix concatenate vmap (#1600) | 2024-11-19 10:44:04 -08:00

Awni Hannun | 2af7e8a9a6 | fix cmake version (#1601) | 2024-11-19 08:45:05 -08:00

Awni Hannun | 2419edd5b2 | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00
* wip: faster compiled kernels
* faster general unary with uint specialization
* index type in compiled, unary, binary, ternary, copy
* fix jit
* jit fix
* specialize gather + scatter
* nit in docs

Awni Hannun | bf481e8e5d | Fix sibling leak (#1590) | 2024-11-18 19:17:01 -08:00
* add test
* fix + test
* fix fix

Awni Hannun | 9d7fa6b8e6 | Use osx deployment target to pick Metal version (#1595) | 2024-11-18 19:16:49 -08:00
* choose metal based on deployment target rather than system version
* nit
* unused compile def

Angelos Katharopoulos | 073076ac7d | 2-Pass Sdpa Inference Kernel (#1597) | 2024-11-18 17:31:53 -08:00

Awni Hannun | 9bd03dd9b4 | More buffer donation with no-ops (#1591) | 2024-11-18 08:35:41 -08:00
* more donation
* fix test
* fix build