Faster synchronization Fence primitive (#1773)

* try faster synchronization

move event

fixes

update bench

fix

fix

* non-functioning kernel

* try alternative fence

* cleanup barrier

* get rid of event_fence

* update benchmarks

* doc string in metal fence
Awni Hannun
2025-01-17 18:42:19 -08:00
committed by GitHub
parent 0c259961ac
commit a4667da1eb
11 changed files with 362 additions and 31 deletions

@@ -45,6 +45,9 @@ build_kernel(random)
 build_kernel(rms_norm)
 build_kernel(rope)
 build_kernel(scaled_dot_product_attention sdpa_vector.h)
+if(MLX_METAL_VERSION GREATER_EQUAL 320)
+  build_kernel(fence)
+endif()
 set(STEEL_HEADERS
     steel/defines.h