- The current documentation includes benchmarks between outdated versions of CUB and Thrust, such as v1.7.1 (see DeviceReduce: https://nvlabs.github.io/cub/structcub_1_1_device_reduce.html).
- Current Thrust uses CUB internally for some algorithms (e.g. DeviceReduce).
- It would be nice to have benchmarks on Volta and newer architectures.
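
For illustration, a minimal sketch of what an updated comparison could look like, timing `cub::DeviceReduce::Sum` against `thrust::reduce` with CUDA events. The helper `time_ms`, the input size, and the iteration count are assumptions made for this example and are not taken from the existing benchmarks. Note that `thrust::reduce` returns its result to the host, so its timing includes a device-to-host copy that the CUB call does not perform.

```cuda
#include <cstdio>
#include <cub/cub.cuh>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

// Hypothetical helper (not part of CUB or Thrust): average time of f() in ms.
template <typename F>
float time_ms(F&& f, int iters = 20) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    f();  // warm-up run, excluded from the measurement
    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i) f();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / iters;
}

int main() {
    const int n = 1 << 24;  // assumed problem size; the sum still fits in an int
    thrust::device_vector<int> in(n, 1);
    thrust::device_vector<int> out(1);
    const int* d_in = thrust::raw_pointer_cast(in.data());
    int* d_out = thrust::raw_pointer_cast(out.data());

    // CUB: a first call with a null buffer only reports the temp storage size.
    void* d_temp = nullptr;
    size_t temp_bytes = 0;
    cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_out, n);
    cudaMalloc(&d_temp, temp_bytes);

    float cub_ms = time_ms([&] {
        cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_out, n);
    });

    // Thrust: the result comes back to the host on every call.
    int thrust_sum = 0;
    float thrust_ms = time_ms([&] {
        thrust_sum = thrust::reduce(in.begin(), in.end(), 0);
    });

    int cub_sum = out[0];  // copies the CUB result back to the host
    std::printf("cub:    %.3f ms (sum = %d)\n", cub_ms, cub_sum);
    std::printf("thrust: %.3f ms (sum = %d)\n", thrust_ms, thrust_sum);

    cudaFree(d_temp);
    // Exit code 0 only if both reductions agree with the expected sum.
    return (cub_sum == n && thrust_sum == n) ? 0 : 1;
}
```

Building with something like `nvcc -O3 -arch=sm_70 bench.cu` would target Volta; the file name and flags here are only an example. A real benchmark would also sweep input sizes and data types.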