mlx/mlx/backend/metal
Awni Hannun 42afe27e12
std and expm1 (#973)
* std and expm1

* actually add expm1

* fix linux

* fix vjp

* relax tol for linux test

* Add it to the compilable primitives

---------

Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-04-08 14:26:01 -07:00
kernels std and expm1 (#973) 2024-04-08 14:26:01 -07:00
mps copyright + ack 2023-11-30 11:12:53 -08:00
allocator.cpp Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
allocator.h Some fixes in cache / thread safety (#777) 2024-03-05 13:30:50 -08:00
CMakeLists.txt Adds mx.fast.layer_norm (#870) 2024-03-21 13:55:51 -07:00
compiled_preamble.h Kernel generation (#614) 2024-02-07 13:15:59 -08:00
compiled.cpp Fix cpu compile (#934) 2024-04-01 17:37:12 -07:00
conv.cpp No copy gems (#801) 2024-03-12 13:13:41 -07:00
copy.cpp Fix copy donation and add partial rope (#881) 2024-03-22 17:28:26 -07:00
copy.h Add a SliceUpdate op and primitive (#850) 2024-03-20 10:39:25 -07:00
device.cpp Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
device.h Kernel generation (#614) 2024-02-07 13:15:59 -08:00
fft.cpp copyright + ack 2023-11-30 11:12:53 -08:00
indexing.cpp Some C++ code are not needed (#841) 2024-03-18 17:04:10 -07:00
make_compiled_preamble.sh quote file name (#670) 2024-02-11 10:33:30 -08:00
matmul.cpp add numeric type hierarchy and issubdtype as well as a set_dtype meth… (#427) 2024-03-25 12:32:59 -07:00
matmul.h No copy gems (#801) 2024-03-12 13:13:41 -07:00
metal_impl.h Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
metal.cpp Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
metal.h Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
normalization.cpp segfaut layer norm grad (#955) 2024-04-04 10:59:15 -07:00
primitives.cpp std and expm1 (#973) 2024-04-08 14:26:01 -07:00
quantized.cpp Fix nan and improve speed for qvm (#903) 2024-03-26 10:41:45 -07:00
reduce.cpp Implement vjps for some primitives in the fast namespace (#883) 2024-03-26 16:35:34 -07:00
reduce.h Implement vjps for some primitives in the fast namespace (#883) 2024-03-26 16:35:34 -07:00
rope.cpp Implement vjps for some primitives in the fast namespace (#883) 2024-03-26 16:35:34 -07:00
scaled_dot_product_attention.cpp add numeric type hierarchy and issubdtype as well as a set_dtype meth… (#427) 2024-03-25 12:32:59 -07:00
scan.cpp copyright + ack 2023-11-30 11:12:53 -08:00
softmax.cpp Option for precise softmax (#953) 2024-04-04 08:32:35 -07:00
sort.cpp Fix multiblock sort limits (#906) 2024-03-26 14:00:00 -07:00
utils.h Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00