Angelos Katharopoulos
199aebcf77
Change the variance computation ( #319 )
2024-01-30 19:28:56 -08:00
Angelos Katharopoulos
0de5988f92
Custom VJP and checkpointing ( #541 )
...
* Implement custom_vjp and checkpointing
* Add a dependency management primitive
* Change the eval order to deep branches first
* Add graph depth tracking to the array
2024-01-30 16:04:45 -08:00
Jagrit Digani
375446453e
Update Compute Pipeline Creation API ( #581 )
...
* Add option to specialize metal functions on function constants
* Update Compute Pipeline Creation API
* Add options to make libraries from source and stitching
* Update function specialization name options
2024-01-30 15:42:36 -08:00
Angelos Katharopoulos
1895d34c20
Fix log1p with inf inputs ( #592 )
2024-01-30 14:02:50 -08:00
Jacket
3f7aba8498
Implement diagonal operator ( #562 )
...
* Implement diagonal operator
This implements mx.diagonal in operator level, inspired by
@ManishAradwad.
* added `mx.diag` with tests
* corrected few things
* nits in bindings
* updates to diag
---------
Co-authored-by: ManishAradwad <manisharadwad@gmail.com >
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-30 09:45:48 -08:00
Angelos Katharopoulos
65d0b8df9f
Fix binary op dispatch ( #584 )
2024-01-29 19:36:17 -08:00
Awni Hannun
3c2f192345
Propagate nans in binary ops ( #579 )
...
* propagate nans in binary ops
* handle empty matmul
* cpu minimum/maximum propagate nan
* benchmark maximum
* add min as well
* throw on negative indices with full
* verbose on linux
* fix matmul for zero K
2024-01-29 11:19:38 -08:00
Awni Hannun
8993382aaa
Buffer Donation ( #519 )
...
* buffer donation
* fix to move shared pointer
* format
* gpu in place for copy and binary
* revert ops test
* cpu in place
* a little cleanup
* remove useless bench
2024-01-26 16:30:33 -08:00
Awni Hannun
07f35c9d8a
Fix a few issues: docs for flatten, erf, dequantize validation ( #560 )
...
* doc flatten
* erf doc
* check values for dequantize
* format
2024-01-26 15:16:46 -08:00
Jagrit Digani
bf17ab5002
Add more checks and clearer error messages to conv operations ( #563 )
...
* Add more checks and clearer error messages to conv operations
2024-01-26 15:13:26 -08:00
Awni Hannun
8fa6b322b9
Compile front-end ( #476 )
...
* fix tests for linux
* make a move on compile
* basic compile scaffold works
* compile binding
* clean
* fix
* fix grad, more tests
* basic python tests
* fix segfault on python exit
* compile works with python closures
* fix test
* fix python globals bug, and erase
* simplify
* more cpp tests
* bug fix with move function and compile at exit
* simplify inputs also
* enable and disable compiler
* remove simplify
* simplify tests use compile now
* fix multi-output with compile
* clear output tree from cache when function goes out of scope
* ../python/src/transforms.cpp
* remove closure capture
* comments
2024-01-26 13:45:30 -08:00
taher
077c1ee64a
QR factorization ( #310 )
...
* add qr factorization
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-26 09:27:31 -08:00
Rifur13
2463496471
[Fix] mx.allclose bug with infinite values ( #539 )
...
* Added isclose op and fixed comparison with inf values
* Added 'equal_nan' to match numpy
* format
* Add test
* Update python/src/ops.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
* Update python/src/ops.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
* Addressed CR comments
* Update python/src/ops.cpp
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
* nits
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com >
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-25 20:47:06 -08:00
Awni Hannun
f27ec5e097
More helpful error message in vjp transform + concate bug ( #543 )
...
* more helpful message in vjp transform
* fix concatenate on mismatch dims
* typo
* typo
2024-01-24 09:58:33 -08:00
Awni Hannun
f30e63353a
Minor updates to address a few issues ( #537 )
...
* docs on arg indices return type
* arange with nan
* undo isort
2024-01-23 22:24:41 -08:00
Juarez Bochi
4fe2fa2a64
GGUF: Avoid dequantization when format is compatible ( #426 )
...
* GGUF: Don't dequantize q4_1
* Fix weight order. First in low bits
* Add unpacking for q4_0
* Don't dequantize q8_0
* rebase quants and split file
* don't quantize every weight
* reapply patch
* error handling
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-23 15:43:57 -08:00
Jagrit Digani
6d3bee3364
Fix oob reads in gemv kernel ( #523 )
2024-01-22 12:06:04 -08:00
Awni Hannun
7a34e46677
Quantize with groups of 32 ( #511 )
...
* allow quantize with group sizes of 32
* missing cpu dispatch
* remove print
* Fix qvm for group_size 32
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com >
2024-01-21 06:19:05 -08:00
Awni Hannun
b207c2c86b
Power VJP fix for 0 ( #505 )
2024-01-20 01:17:40 -08:00
Juarez Bochi
ddf50113c5
GGUF: Load and save metadata ( #446 )
...
* gguf metadata
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-19 14:06:05 -08:00
Awni Hannun
c4ec836523
fix isinf for integer types ( #494 )
2024-01-19 05:31:10 -08:00
Awni Hannun
3d99a8d31d
Fix format / build ( #489 )
2024-01-18 10:01:59 -08:00
Ethan
a749a91c75
Support disable metal buffer cache to prevent performance degradation caused by large memory caching ( #390 )
...
* support disable metal buffer cache, due to large unused memory buffered when llm generated long context tokens
* Run format and add "cache_enabled" feature tests
2024-01-18 08:33:34 -08:00
toji
49a52610b7
Added formatter structure and a boolean value formatter ( #354 )
...
* added formatter structure and a boolean value formatter
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-18 07:49:41 -08:00
Angelos Katharopoulos
9c111f176d
Fix split optimization for array iterator ( #484 )
2024-01-18 05:50:25 -08:00
Angelos Katharopoulos
90c234b7ac
Fix round to round half-cases to even ( #482 )
2024-01-17 15:27:23 -08:00
Angelos Katharopoulos
135fd796d2
Fix detach for multi-output primitives ( #480 )
2024-01-17 14:08:07 -08:00
Jagrit Digani
78102a47ad
Update GEMM ( #424 )
...
* Organize and collect metal subroutine templates and elements in `metal/kernels/steel/`
* Update gemm elements for better performance
* Add split-K specialization for gemm
* Add `addmm` primitive, op and bindings for fused matmul and bias addition
* Update tests and benchmarks as needed
2024-01-17 12:42:39 -08:00
Diogo
556cdf0e06
Resolves build issues with the extension example ( #419 )
...
* resolved extension build issues and added test to ci
* missing gguflib
* rebased
* force mlx install from fix branch
* linux build issue
* point to git install and comment out ci tests
2024-01-17 12:07:05 -08:00
Awni Hannun
275db7221a
Command buffer reports errors ( #479 )
...
* command buffer reports errors
* typo
* simplify
2024-01-17 11:53:30 -08:00
Awni Hannun
a2bf7693dd
Primitive's VJP takes outputs as input ( #475 )
...
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com >
2024-01-16 19:03:53 -08:00
Angelos Katharopoulos
d8fabaa12b
Split multi output ( #461 )
...
* Multi-output split primitive
* Add the multi-output split to the ArrayIterator
* Add some grad tests for split
2024-01-16 13:33:55 -08:00
Avikant Srivastava
4e290d282f
feat: add time based seed to random.h ( #457 )
...
* random seed from time
* fix: chrono
* refactor: snake case
2024-01-16 07:32:28 -08:00
Yashraj Singh
e72458a3fa
implemented isposinf and isneginf in one PR ( #470 )
...
* ran precommit
* updated docs
2024-01-16 06:48:07 -08:00
Awni Hannun
a2ffea683a
Fix eye for larger matrices ( #463 )
...
* fix eye
* fix scatter for <32bit (non native atomic) types
* fix int overflow
2024-01-16 00:51:24 -08:00
Angelos Katharopoulos
c15fe3e61b
Allow arbitrary first dimension in quantization kernels. ( #458 )
...
* Allow arbitrary first dim on qmm_t and qmv
* Allow arbitrary first dim on qmm and qvm
* Specialized aligned vs unaligned case
* Add more checks for valid quantizations
2024-01-16 00:46:21 -08:00
Tristan Bilot
f44c132f4a
Add scatter_min VJP ( #462 )
2024-01-16 00:37:40 -08:00
Matthew Ernst
92a2fdd577
Adds isinf ( #445 )
...
* adds isinf
Signed-off-by: matthewfernst <matthew.f.ernst@gmail.com >
* use stream + nits
* typo
---------
Signed-off-by: matthewfernst <matthew.f.ernst@gmail.com >
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-15 19:50:44 -08:00
Tristan Bilot
6022d4129e
scatter_max vjp + bindings + tests ( #431 )
...
Co-authored-by: DjamelMesbah <djamel.mesbah@adservio.fr >
2024-01-14 14:12:15 -08:00
Awni Hannun
4bc446be08
Use a dummy primitive to only sync with one output ( #453 )
...
* Use a dummy primitive to only sync with one output
* Fix test and choose stream with slight care
2024-01-14 14:09:40 -08:00
Awni Hannun
41cc7bdfdb
Fix stub generation, change graph exporting for arrows to go to outputs ( #455 )
2024-01-14 14:06:16 -08:00
Awni Hannun
6e81c3e164
Sync only with outputs we need to sync with ( #447 )
2024-01-13 01:47:25 -08:00
Diogo
2e29d0815b
Add tile op ( #438 )
2024-01-12 23:03:16 -08:00
Ayush Shridhar
1416e7b664
Add isnan ( #423 )
2024-01-12 11:16:48 -08:00
Angelos Katharopoulos
006d01ba42
Fix packaging of gguflib ( #435 )
2024-01-11 13:56:03 -08:00
Awni Hannun
c9934fe8a4
Metal validation ( #432 )
...
* tests clear metal validation
* add cpp test with metal validation to circleci
* nit
2024-01-11 11:57:24 -08:00
Awni Hannun
3b4f066dac
Correct types for vjp + tests ( #418 )
...
* correct types for vjp + tests
* fix build + comment
2024-01-10 13:32:37 -08:00
Juarez Bochi
b7f905787e
GGUF support ( #350 )
...
* Initial GGUF support for tensor fields.
---------
Co-authored-by: Awni Hannun <awni@apple.com >
2024-01-10 13:22:48 -08:00
Angelos Katharopoulos
961435a243
Scatter vjp ( #394 )
...
* Add a first scatter vjp
* Implement the scatter_add vjp
* Add array.at to implement user friendly scatters
2024-01-09 13:36:51 -08:00
Awni Hannun
f099ebe535
Multi output primitives ( #330 )
...
* Multi-output primitives
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com >
2024-01-08 16:39:08 -08:00