Skip to content

Conversation

tiagokepe
Copy link
Contributor

Hello all,

This PR enables ORDER BY ColumnNumber with collations like SELECT 'a' ORDER BY 1 COLLATE NOCASE.

if (collation.child->expression_class == ExpressionClass::CONSTANT) {
auto &constant = collation.child->Cast<ConstantExpression>();
auto index = (idx_t)constant.value.GetValue<int64_t>() - 1;
D_ASSERT(index < extra_list->size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is user input, right?
So we should probably not use D_ASSERT and instead check if the value is within the valid range, otherwise throw an InvalidInputException

auto &collation = expr->Cast<CollateExpression>();
if (collation.child->expression_class == ExpressionClass::CONSTANT) {
auto &constant = collation.child->Cast<ConstantExpression>();
auto index = (idx_t)constant.value.GetValue<int64_t>() - 1;
Copy link
Contributor

@Tishj Tishj Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this needs a NumericCast, and we should add a test with a negative value, and 0

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good - one comment:

INSERT INTO collate_test VALUES ('ã'),('B'),('a'),('A');

query I
SELECT s FROM collate_test ORDER BY 1 COLLATE NOCASE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the tests @Tishj requested could we add some tests with more complex expressions as well? i.e.

SELECT CONCAT(s, s) FROM collate_test ORDER BY 1 COLLATE NOCASE;
SELECT (SELECT s) FROM collate_test ORDER BY 1 COLLATE NOCASE;
SELECT CONCAT(s, s) AS concat, concat FROM collate_test ORDER BY 2 COLLATE NOCASE;
SELECT CASE WHEN s[1]='a' THEN s ELSE NULL END FROM collate_test ORDER BY 1 COLLATE NOCASE;
SELECT collate_test.s FROM collate_test ORDER BY 1 COLLATE NOCASE;

Copy link
Contributor Author

@tiagokepe tiagokepe Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Tishj and @Mytherin for the suggestions.

I've applied the NumericCast and added more test cases.

The only one that failed references a sub-query in the select list:
SELECT (SELECT s) FROM collate_test ORDER BY 1 COLLATE NOCASE;

But this case doesn't work with alias as well:

SELECT (SELECT s) AS c1 FROM collate_test ORDER BY c1 COLLATE NOCASE;
Binder Error: Alias "c1" referenced in a SELECT clause - but the expression has a subquery. This is not yet supported.

@github-actions github-actions bot marked this pull request as draft March 14, 2024 17:18
@tiagokepe tiagokepe marked this pull request as ready for review March 15, 2024 08:36
@github-actions github-actions bot marked this pull request as draft March 18, 2024 08:10
@tiagokepe tiagokepe marked this pull request as ready for review March 18, 2024 08:11
Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates - some more comments:

auto &sel_entry = extra_list->at(index);
if (!sel_entry->alias.empty()) {
vector<string> column_names = {sel_entry->alias};
auto colref = make_uniq<ColumnRefExpression>(std::move(column_names));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow what the goal of this code path is, e.g. it seems like this would not work:

SELECT 'a' AS my_alias ORDER BY 1 COLLATE NOCASE;

Perhaps we can just remove this code path and always use copy?

colref->query_location = sel_entry->query_location;
collation.child = std::move(colref);
} else { // no alias, e.g.: SELECT 'a' ORDER BY 1 COLLATE NOCASE
if (sel_entry->expression_class == ExpressionClass::SUBQUERY) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use HasSubquery(), otherwise e.g. this will result in an error:

SELECT concat('a', (SELECT s)) FROM collate_test ORDER BY 1 COLLATE NOCASE;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Mytherin, I've applied all suggestions and added extra test cases.

@github-actions github-actions bot marked this pull request as draft March 18, 2024 11:29
@tiagokepe tiagokepe marked this pull request as ready for review March 18, 2024 11:29
@Mytherin Mytherin merged commit 60489fa into duckdb:main Mar 18, 2024
@Mytherin
Copy link
Collaborator

Thanks! LGTM

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 18, 2024
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Mar 23, 2024
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 28, 2024
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants