
* PyTorch: build flash attention by default, except in CI
* Variant is boolean, only available when +cuda/+rocm
* desc -> _desc