mirror of
https://github.com/ml-explore/mlx.git
synced 2025-12-16 01:49:05 +08:00
* Use async cuda malloc managed with cuda 13
* add pool threshold
* refactor for regular cuda malloc
* load eval gpu for cuda
* remove use of cuda pool, use cuda free async
* fix
* fix
* fix
* fix
* fix + comment
40 lines
1.1 KiB
C++
// Copyright © 2024 Apple Inc.

#pragma once

#include <memory>
#include <vector>

#include "mlx/array.h"

namespace mlx::core {

/* A fence to be used for synchronizing work between streams.
 *
 * A call to `wait` makes the given stream wait until all previous calls to
 * `update` are complete on their given streams.
 *
 * The array passed to `update` is computed and visible after the call to
 * `wait` returns. The array passed to `wait` will not be read until all
 * previous calls to `update` have completed.
 *
 * Note: calls to `update` should always be from the same thread or explicitly
 * synchronized so that they occur in sequence. Calls to `wait` can be on any
 * thread.
 *
 * For the Metal backend, the fence supports a slow (default) mode and a fast
 * mode. Fast mode requires setting the environment variable
 * `MLX_METAL_FAST_SYNCH=1`. Fast mode also requires Metal 3.2+ (macOS 15+,
 * iOS 18+).
 */
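class Fence {
 public:
  Fence() = default;
  explicit Fence(Stream stream);

  // Mark `x` as produced on `stream`; later waits observe this update.
  void update(Stream stream, const array& x, bool cross_device);
  // Make `stream` wait for all previous updates before `x` is read.
  void wait(Stream stream, const array& x);

 private:
  // Type-erased handle to the backend-specific fence state.
  std::shared_ptr<void> fence_{nullptr};
};

} // namespace mlx::core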
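
The `update`/`wait` contract described in the header comment can be illustrated with a small host-only analogy. This is a sketch under stated assumptions, not the MLX implementation: `ToyFence`, its counter-based `wait(target)` signature, and `produce_then_consume` are hypothetical names introduced here, and plain threads stand in for streams. It only shows the ordering guarantee: whatever the producer wrote before `update` is visible to the consumer once `wait` returns.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical toy fence built on the standard library. It is NOT the MLX
// Fence; it models only the "wait until previous updates complete" contract.
class ToyFence {
 public:
  // Record that one more unit of producer work has completed.
  void update() {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      ++completed_;
    }
    cv_.notify_all();
  }

  // Block the calling thread until at least `target` updates have completed.
  void wait(std::uint64_t target) {
    std::unique_lock<std::mutex> lock(mtx_);
    cv_.wait(lock, [&] { return completed_ >= target; });
  }

 private:
  std::mutex mtx_;
  std::condition_variable cv_;
  std::uint64_t completed_{0};
};

// The producer thread writes a value and then updates the fence. The calling
// thread waits on the fence before reading, so it observes the write.
int produce_then_consume() {
  ToyFence fence;
  int value = 0;
  std::thread producer([&] {
    value = 42; // the work that must be visible after wait returns
    fence.update();
  });
  fence.wait(1); // returns only after the producer's update
  int seen = value;
  producer.join();
  return seen;
}
```

In this analogy the mutex provides the happens-before edge between the producer's write and the consumer's read. The real `Fence` takes a `Stream` and an `array` rather than a counter target, and its backends may synchronize very differently.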