Skip to content

Conversation

Mytherin
Copy link
Collaborator

This PR changes the AggregateExecutor interface to be simpler and no longer have pointers and arrays - but instead have references everywhere and use optional_ptr for optionals (such as the bind data). In addition we add some extra structures like AggregateUnaryInput that offer helper methods which further cleans up the interface.

Note that the actual AggregateFunction interface is unchanged - this only changes the AggregateExecutor helper that allows for easier implementation of aggregates (and that is used by most of our aggregate functions).

New Interface

template <class STATE>
void Initialize(STATE &state) {
	state.count = 0;
	state.sum = 0;
}

template <class INPUT_TYPE, class STATE, class OP>
void Operation(STATE &state, const INPUT_TYPE &input, AggregateUnaryInput &) {
	state.sum += input;
	state.count++;
}

template <class STATE, class OP>
void Combine(const STATE &source, STATE &target, AggregateInputData &) {
	target.count += source.count;
	target.sum += source.sum;
}

template <class T, class STATE>
void Finalize(STATE &state, T &target, AggregateFinalizeData &finalize_data) {
	if (state.count == 0) {
		finalize_data.ReturnNull();
	} else {
		target = state.sum / state.count;
	}
}

Old Interface

template <class STATE>
void Initialize(STATE *state) {
	state->count = 0;
	state->sum = 0;
}

template <class INPUT_TYPE, class STATE, class OP>
void Operation(STATE *state, AggregateInputData &, const INPUT_TYPE *input, ValidityMask &mask, idx_t idx) {
	state->sum += input[idx];
	state->count++;
}

template <class STATE, class OP>
void Combine(const STATE &source, STATE *target, AggregateInputData &) {
	target->count += source.count;
	target->sum += source.sum;
}

template <class T, class STATE>
void Finalize(Vector &result, AggregateInputData &, STATE *state, T *target, ValidityMask &mask, idx_t idx) {
	if (state->count == 0) {
		mask.SetInvalid(idx);
	} else {
		target[idx] = state->sum / state->count;
	}
}

One side-effect of this change is that the Count function can no longer use the AggregateExecutor - as that was previously reliant on ignoring the input data array to support input data of all types, which it can no longer do now that it is a reference instead of a pointer.

@Mytherin Mytherin merged commit 662c10c into duckdb:master May 25, 2023
@Mytherin Mytherin deleted the safeaggregates branch July 5, 2023 03:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant