NVIDIA CUDA 13.1 Drops CUB Boilerplate with New Single-Call API