Two possible new tail strategies that could help:
RoundUpAndBlend would be like GuardWithIf, but it would use a select instead of an if. So if GuardWithIf does something equivalent to:
f(x) = if x < extent then g(x) else dontcare
RoundUpAndBlend would do:
f(x) = select(x < extent, g(x), f(x))
I.e. it loads the vector it would store to, modifies some of the lanes, and then stores the result. This would be a race condition if there's an outer parallel loop in that dimension, so we'd have to check for that.
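To make the load/modify/store sequence concrete, here is a minimal sketch in plain C++ that emulates it on an array. It is not Halide code and not the proposed implementation: `kLanes`, `g`, and `round_up_and_blend` are hypothetical names chosen for illustration, and the buffer is assumed to be padded up to a whole number of vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 4;  // hypothetical vector width

// Hypothetical stand-in for g(x) in the pseudocode above.
inline int g(std::size_t x) { return static_cast<int>(x) * 10; }

// Emulates f(x) = select(x < extent, g(x), f(x)) one vector at a time.
// Assumption: buf is padded to a multiple of kLanes, because the loop is
// rounded up and the last vector is loaded and stored in full.
void round_up_and_blend(std::vector<int> &buf, std::size_t extent) {
    assert(buf.size() % kLanes == 0 && buf.size() >= extent);
    for (std::size_t base = 0; base < extent; base += kLanes) {
        int lanes[kLanes];
        // Load the whole vector that is about to be stored.
        for (std::size_t i = 0; i < kLanes; i++) lanes[i] = buf[base + i];
        // Blend: replace only the lanes that fall inside the extent.
        for (std::size_t i = 0; i < kLanes; i++) {
            if (base + i < extent) lanes[i] = g(base + i);
        }
        // Store the whole vector; lanes past the extent keep their old values.
        for (std::size_t i = 0; i < kLanes; i++) buf[base + i] = lanes[i];
    }
}
```

The final store rewrites the padding lanes with the values that were just loaded from them. That is harmless in a serial loop, but it is exactly the write that could race with another thread if an outer loop over the same dimension were parallel.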
ShiftInwardsAndBlend would be similar, but shifting inwards instead of rounding up, so that the overall allocation bounds aren't expanded if the extent is at least one vector. It would be really useful for vectorizing pure vars in update definitions touching inputs and outputs when you expect the extent to be small.
Specifically, I want to use this schedule:
output.update().specialize(output.width() < vec);
output.update().vectorize(x, vec, TailStrategy::ShiftInwardsAndBlend);
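As a rough illustration of why the blend matters for update definitions, the sketch below emulates the ShiftInwardsAndBlend idea in plain C++ for an in-place update `f(x) = f(x) + 1`. It is not Halide code: `kVec` and `shift_inwards_and_blend` are hypothetical names, and the precondition that the extent is at least one vector is an assumption matching the case described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kVec = 4;  // hypothetical vector width

// Applies the in-place update f(x) = f(x) + 1 over [0, extent),
// one vector at a time, without touching anything past extent.
void shift_inwards_and_blend(std::vector<int> &f, std::size_t extent) {
    // Assumption: the extent covers at least one full vector.
    assert(extent >= kVec && f.size() >= extent);
    for (std::size_t start = 0; start < extent; start += kVec) {
        // Shift the final vector inwards so that it ends exactly at extent.
        std::size_t base = (start + kVec <= extent) ? start : extent - kVec;
        int lanes[kVec];
        // Load the whole vector that is about to be stored.
        for (std::size_t i = 0; i < kVec; i++) lanes[i] = f[base + i];
        // Blend: lanes before `start` were updated by the previous
        // iteration, so only the remaining lanes are modified.
        for (std::size_t i = 0; i < kVec; i++) {
            if (base + i >= start) lanes[i] += 1;
        }
        // Store the whole vector; overlapping lanes are written back unchanged.
        for (std::size_t i = 0; i < kVec; i++) f[base + i] = lanes[i];
    }
}
```

With an extent of 6 and a vector width of 4, the second vector covers indices 2 to 5. Without the blend, indices 2 and 3 would be incremented twice, which is why a plain shift inwards is not safe for an update that reads its own output.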