Adding equivalent relations rebase on master with parquet #3
Conversation
Some initial comments:

Some code style issues that popped up in multiple places:
- Use `nullptr` instead of `NULL`
- CamelCase for functions instead of snake_case

Some of these issues are also the reason that the CI is failing right now.
One of the main things I found is that there are a few lengthy functions that have many nested control flow statements. I've added a comment on how to reduce the number of nested statements.
As for the function length, there were only a few places where this is an issue. I suggest avoiding code duplication as much as possible, and splitting code into multiple functions (but only where it makes sense).
Finally, we talked about estimating cardinality iteratively (e.g., join t1 x t2, then t3), and estimating the cardinality in one go (i.e., adding all t1, t2, t3 properties to a single formula and computing the cardinality).
When reasoning about join ordering, I prefer the former.
However, implementation-wise I strongly prefer the latter: it makes it much harder to 'break' the cardinality estimation function during further development. By estimating the cardinality in one go, we ensure we always get the same estimate, no matter in which order we join the tables.
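To make the difference concrete, here is a minimal sketch of a one-go estimate (my own illustration, not the actual DuckDB code; the function name and inputs are assumptions). Because both products range over *sets* of relations and join-column total domains, the result cannot depend on the order in which tables are joined:

```cpp
#include <vector>

// Sketch: textbook-style formula, not the actual implementation.
// cardinality(T1 JOIN ... JOIN Tn) = (|T1| * ... * |Tn|) / (tdom_1 * ... * tdom_k)
double EstimateCardinalityOneGo(const std::vector<double> &base_cardinalities,
                                const std::vector<double> &join_column_tdoms) {
	// numerator: product of all base table cardinalities
	double numerator = 1;
	for (auto card : base_cardinalities) {
		numerator *= card;
	}
	// denominator: product of the total domains of the join columns
	double denominator = 1;
	for (auto tdom : join_column_tdoms) {
		denominator *= tdom;
	}
	return denominator == 0 ? numerator : numerator / denominator;
}
```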
```cpp
@@ -85,7 +86,9 @@ class PhysicalOperator {
public:
	PhysicalOperator(PhysicalOperatorType type, vector<LogicalType> types, idx_t estimated_cardinality)
	    : type(type), types(std::move(types)), estimated_cardinality(estimated_cardinality) {
		ph_join_stats = make_unique<JoinStats>();
```
It does not make much sense that every physical operator gets a `JoinStats` field. It only has estimated cost/cardinality. Can we rename this to something more general, i.e., `EstimatedProperties`?
The same goes for the logical operators
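For reference, judging by the fields that show up later in this review (`cost` and `cardinality`), such a container could be as minimal as this sketch (not the final class):

```cpp
// Sketch based only on the fields visible in this review;
// idx_t comes from duckdb/common/constants.hpp.
class EstimatedProperties {
public:
	EstimatedProperties() : cardinality(0), cost(0) {
	}

	double cardinality;
	idx_t cost;
};
```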
```cpp
@@ -54,98 +54,150 @@ unique_ptr<PhysicalOperator> PhysicalPlanGenerator::CreatePlan(unique_ptr<Logica

unique_ptr<PhysicalOperator> PhysicalPlanGenerator::CreatePlan(LogicalOperator &op) {
	op.estimated_cardinality = op.EstimateCardinality(context);
	unique_ptr<PhysicalOperator> plan = NULL;
```
We use `nullptr` instead of `NULL`.
```cpp
	op.has_estimated_cardinality = true;
} else {
	op.join_stats = make_unique<JoinStats>();
}
```
Rather than copying each field individually here, can we implement a `Copy` method for `JoinStats` (or whatever name we decide to give it)?
#include "duckdb/storage/statistics/distinct_statistics.hpp" | ||
#include "duckdb/planner/table_filter.hpp" | ||
|
||
#include <map> |
Unused include.
```cpp
void UpdateCost();

TableCatalogEntry *GetCatalogTableEntry(LogicalOperator *op);
static bool key_exists(idx_t key, unordered_map<idx_t, double> stat_column);
```
CamelCase instead of snake_case for functions
src/optimizer/join_node.cpp (Outdated)
```cpp
namespace duckdb {

static const double default_selectivity = 0.2;
```
I'd suggest using `static constexpr double`, and renaming to CAPITAL letters (to conform with the code style).
src/optimizer/join_node.cpp (Outdated)
```cpp
} else if (it->second->filter_type == TableFilterType::CONJUNCTION_OR) {
	ConjunctionOrFilter &fil = (ConjunctionOrFilter &)*it->second;
	for (auto &child_filter : fil.child_filters) {
		if (child_filter->filter_type == TableFilterType::CONSTANT_COMPARISON) {
```
To prevent nesting the code more and more like you are doing here:

```cpp
for (auto &expr : exprs) {
	if (expr.property == ...) {
		auto &derived = (DerivedClass &)*expr;
		if (derived.property == ...) {
			// Do something
		}
	}
}
```

I would suggest using the following pattern:

```cpp
for (auto &expr : exprs) {
	if (expr.property != ...) {
		continue;
	}
	auto &derived = (DerivedClass &)*expr;
	if (derived.property != ...) {
		continue;
	}
	// Do something
}
```

It requires a few more lines of code, but results in more readable code because the level of nested `if`s stays manageable, making lines less wide.
src/optimizer/join_node.cpp (Outdated)
```cpp
	return stats;
}

void JoinNode::InitTDoms(JoinOrderOptimizer *optimizer) {
```
This is a really long method with tons of nested `for`/`if` statements. Can we split this up into methods with descriptive names and try to reduce the nesting?
```cpp
	parent->children[0] = move(join_tree.second);
	return plan;
}

idx_t JoinOrderOptimizer::readable_hash(idx_t table, idx_t col) {
```
Can we wrap this into a struct instead of creating this somewhat human-readable integer?
We already have all the code you need for this. The `table, col` pair can be stored in a `ColumnBinding`, and we already have a way to create an `unordered_map` that goes from `ColumnBinding -> whatever`, which is the `column_binding_map_t`, defined in `column_binding_map.hpp`.
I see that you're also using it in an `unordered_map`; you can also easily create a `column_binding_set_t` by re-using the same code.
Some good changes! Putting everything in the `CardinalityEstimator` class cleaned up the code nicely.
A few nitpicks here and there, mostly regarding code style. I think you can let CLion do more work for you by automating some formatting and whatnot. This is how you enable it.
Also, you should be using `column_binding_map_t` and `column_binding_set_t`. I've explained this more clearly in a comment.
Anyways, good progress. Keep it up!
```cpp
idx_t cost;
double cardinality;

unique_ptr<EstimatedProperties> Copy(unique_ptr<EstimatedProperties> estimated_props) {
```
The way this copy function is formulated seems a bit strange to me. Why does it need an argument? Can we not do:

```cpp
unique_ptr<EstimatedProperties> Copy() {
	auto result = make_unique<EstimatedProperties>();
	result->cost = cost;
	result->cardinality = cardinality;
	return result;
}
```
```cpp
// namespace duckdb
namespace std {
template <>
struct std::hash<duckdb::ColumnBinding> {
```
We shouldn't need this hash function for column bindings. We already have a hash function for `ColumnBinding` in `column_binding_map.hpp`, which also defines `column_binding_map_t`, which is a mapping from `ColumnBinding` to whatever type you need, as well as `column_binding_set_t`.
Using this, you can have a map/set like this:

```cpp
#include "duckdb/planner/column_binding_map.hpp"
// ...
auto my_map = column_binding_map_t<idx_t>(); // map from ColumnBinding to idx_t
ColumnBinding my_binding(4, 2);
my_map[my_binding] = 42; // lookup using column binding directly

auto my_set = column_binding_set_t();
my_set.insert(my_binding);
my_set.insert(my_binding); // was already in set
```
```cpp
bool CardinalityEstimator::SingleColumnFilter(FilterInfo *filter_info) {
	if (!filter_info->left_set || !filter_info->right_set) {
```
I think we can still remove some of the nesting from this method:

```cpp
if (filter_info->left_set && filter_info->right_set) {
	// Both set
	return false;
}
//! Filter on one relation, (i.e string or range filter on a column).
//! Grab the relation (should always be 1) and add it to the
//! the equivalence_relations
D_ASSERT(filter_info->set->count == 1);
for (auto &i_set : equivalent_relations) {
	if (i_set.count(filter_info->left_binding) > 0) {
		// Found an equivalent filter
		return true;
	}
}
D_ASSERT(filter_info->set->relations[0] == filter_info->left_binding.table_index);
auto key = ColumnBinding(filter_info->set->relations[0], filter_info->left_binding.column_index);
unordered_set<ColumnBinding> tmp({key});
equivalent_relations.push_back(tmp);
return true;
```

We can immediately return false if the condition doesn't hold, which removes one level of nesting for the whole method. Then, when looking for equivalent filters, we can immediately return true if we find one, which removes a level of nesting for the remainder of the method.
Not that nesting is evil or anything. Sometimes it can make code more readable! But in this case I think it's more readable with less nesting.
```cpp
idx_t ind = 0;

for (const unordered_set<ColumnBinding> &i_set : equivalent_relations) {
```
If you have a loop that needs an index, I'd suggest just using a for-loop, instead of a smart loop that increments an index:

```cpp
for (idx_t ind = 0; ind < equivalent_relations.size(); ind++) {
	// Feel free to use 'auto' instead of the more verbose 'unordered_set<ColumnBinding> &'
	const auto &i_set = equivalent_relations[ind];
	if (i_set.count(key) != 1) {
		continue; // Remove 1 level of nesting from the loop
	}
	// Rest of the logic here
	// Etc.
}
```
```cpp
unordered_map<idx_t, unique_ptr<TableFilter>>::iterator it;
bool has_equality_filter;
idx_t cardinality_after_filters = cardinality;
if (!table_filters) return cardinality_after_filters;
```
Our code style requires

```cpp
if (!table_filters) {
	return cardinality_after_filters;
}
```

In the top-level folder of the duckdb repo we have `.clang-format` and `.clang-tidy` files, which CLion understands. If you point it to these files, CLion will help you with code style. You can even do `command + option + L` to auto-format.
```cpp
idx_t cardinality_after_filters = cardinality;
if (!table_filters) return cardinality_after_filters;
for (it = table_filters->filters.begin(); it != table_filters->filters.end(); it++) {
	if (it->second->filter_type == TableFilterType::CONJUNCTION_AND) {
```
This can be made more concise:

```cpp
for (const auto &filter_pair : table_filters->filters) {
	auto &filter = filter_pair.second;
	// note: 'filter' is a unique_ptr, so we dereference with '->'
	if (filter->filter_type == TableFilterType::CONJUNCTION_AND) {
		// Etc.
	}
}
```
```cpp
for (auto &child_filter : fil.child_filters) {
	if (child_filter->filter_type == TableFilterType::CONSTANT_COMPARISON) {
		auto comparison_filter = (ConstantFilter &)*child_filter;
		if (comparison_filter.comparison_type == ExpressionType::COMPARE_EQUAL) {
```
Can remove some more nesting by doing

```cpp
if (child_filter->filter_type != TableFilterType::CONSTANT_COMPARISON) {
	continue;
}
auto comparison_filter = (ConstantFilter &)*child_filter;
if (comparison_filter.comparison_type != ExpressionType::COMPARE_EQUAL) {
	continue;
}
// Etc.
```

Of course, if you do this, it becomes more cumbersome to add an `else`. However, if you're never going to add an else, I'd prefer to remove nesting.
In the end, this may be replaced by filtering a sample anyway, so not sure how much it matters.
```cpp
@@ -125,9 +155,18 @@ bool JoinOrderOptimizer::ExtractJoinRelations(LogicalOperator &input_op, vector<
// e.g. suppose we have (left LEFT OUTER JOIN right WHERE right IS NOT NULL), the join can generate
// new NULL values in the right side, so pushing this condition through the join leads to incorrect results
// for this reason, we just start a new JoinOptimizer pass in each of the children of the join

// Keep track of all of the filter bindings the new join order optimizer makes
auto child_binding_maps = vector<unordered_map<ColumnBinding, ColumnBinding>>({});
```
With `column_binding_map_t`, this can be:

```cpp
vector<column_binding_map_t<ColumnBinding>> child_binding_maps({});
```
Looking better already! We're getting closer to a PR.
I have some more comments:
```cpp
@@ -100,7 +100,7 @@ class PhysicalOperator {
	vector<LogicalType> types;
	//! The estimated cardinality of this physical operator
	idx_t estimated_cardinality;
```
The `estimated_cardinality` was in there before we added the `EstimatedProperties`, but now it seems like it makes sense to merge them.
You could add something like

```cpp
idx_t GetEstimatedCardinality() {
	return estimated_props->estimated_cardinality;
}
```
```cpp
// //! Is the filter an equality filter (i.e. country_code = [us])
// bool has_equality_filter;
//};

class EstimatedProperties {
```
We're now using this `EstimatedProperties` in more places than just the join order optimizer, since it's now a field of `PhysicalOperator` as well.
Can we move `EstimatedProperties` to its own header/implementation file, instead of being in the header/implementation file of `JoinNode`?
```cpp
	res += "-----------\n";
	std::cout << res << std::endl;
}
//void printRelation2tableNameMapping(unordered_map<idx_t, std::string> relation_to_table_name) {
```
For the PR we'll need to clean some of this debugging code up. We don't want commented and/or unused code.
I imagine that some of it is no longer needed, but some of it will be nice to keep. If you look through the code base, you'll see that classes such as `LogicalOperator`/`DataChunk`/`Pipeline` have `ToString()` and `Print()` methods.
Could you check if some of the debugging code can be removed and/or refactored into `ToString()` and `Print()` methods?
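As a rough sketch of that refactor (the class name and members below are placeholders for whichever debugging state ends up being kept; `Printer::Print` is DuckDB's existing printing helper):

```cpp
// Sketch only, assuming a hypothetical QueryGraphStats class.
string QueryGraphStats::ToString() const {
	string result;
	result += "-----------\n";
	// append the interesting stats to 'result' instead of streaming to std::cout
	return result;
}

void QueryGraphStats::Print() {
	Printer::Print(ToString());
}
```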
```cpp
@@ -338,15 +329,14 @@ JoinNode *JoinOrderOptimizer::EmitPair(JoinRelationSet *left, JoinRelationSet *r
#ifdef DEBUG
```
Could you move the symmetry verification to another method? It could be a static method or a private method of the `JoinOrderOptimizer`.

```cpp
void VerifySymmetry(args...) {
#ifdef DEBUG
	// ... verification code
#endif
}
```

This keeps the code for `CreateJoinTree` cleaner.
```cpp
if (op->type == LogicalOperatorType::LOGICAL_FILTER) {
	if (op->children[0]->type == LogicalOperatorType::LOGICAL_GET) {
		get = (LogicalGet *)op->children.at(0).get();
if (IsLogicalGet(op)) {
```
Seems like this code would be cleaner with a switch statement:

```cpp
switch (op->type) {
case LogicalOperatorType::LOGICAL_GET: {
	// ...
	break;
}
case LogicalOperatorType::LOGICAL_FILTER: {
	// ...
	break;
}
}
```

Then you don't need the `IsLogicalGet` function, and it's easy to extend with more operator types.
```cpp
@@ -194,20 +205,16 @@ void CardinalityEstimator::UpdateTotalDomains(JoinNode *node, LogicalOperator *o
	idx_t count = 0;

	bool direct_filter = false;
	for (ite = relation_to_columns[relation_id].begin();
	     ite != relation_to_columns[relation_id].end(); ite++) {
	for (ite = relation_to_columns[relation_id].begin(); ite != relation_to_columns[relation_id].end(); ite++) {
```
We prefer a smart loop here, which is less verbose. Something like:

```cpp
for (auto &rel_and_col : relation_to_columns[relation_id]) {
	// ...
}
```
Good progress! I've left some more comments.
The main feedback now is that we'd like to keep the DP and CE code separate as much as possible, making both easier to read. Of course we need to call into the CE code from the DP code at some point (e.g., initialisation and asking to predict cardinalities), but the code becomes much easier to read and more maintainable if everything that belongs to CE is in the `CardinalityEstimator` class.
```cpp
static constexpr double DEFAULT_SELECTIVITY = 0.2;

class CardinalityEstimator {
public:
```
Just a nitpick, but we like to keep our methods and fields in separate public/private blocks, i.e.:

```cpp
class CardinalityEstimator {
public:
	explicit CardinalityEstimator(ClientContext &context) : context(context) {
	}
	// other public functions

public:
	// public fields

private:
	// private functions

private:
	// private fields
};
```
```cpp
private:
	ClientContext &context;

	//! ********* HELPERS FOR INIT TOTAL DOMAINS ***********
```
Could you write some descriptions for the functions/fields in this header? Especially for the functions whose purpose is not immediately clear:

```cpp
//! Describe function
void Function();
```

Of course, `EstimateCardinality` is very descriptive so it doesn't really need a description, but adding one anyway doesn't do any harm either.
```cpp
	// Both set
	return false;
}
//! Filter on one relation, (i.e string or range filter on a column).
```
For these comments we use just `//`, not `//!`.
In general, `//!` is reserved for describing a function/field, for example in a header file.
```cpp
@@ -0,0 +1,379 @@
//
// Created by Tom Ebergen on 7/6/22.
//
```
The `Created by ...` header should be removed.
```cpp
@@ -32,6 +27,8 @@ bool JoinOrderOptimizer::ExtractBindings(Expression &expression, unordered_set<i
	D_ASSERT(colref.binding.table_index != DConstants::INVALID_INDEX);
	// map the base table index to the relation index used by the JoinOrderOptimizer
	D_ASSERT(relation_mapping.find(colref.binding.table_index) != relation_mapping.end());
	cardinality_estimator.relation_to_columns[relation_mapping[colref.binding.table_index]].insert(
```
Can we make a function in `CardinalityEstimator` for this? It's not really clear what this code does when reading it.
Something like this, or a more descriptive name that makes more sense to you:

```cpp
cardinality_estimator.AddColumnToRelation(...);
```
There are a few more instances in this file where you add something to a map or other data structure that belongs to the `CardinalityEstimator`.
Touching the data structures that belong to `CardinalityEstimator` from outside of the `cardinality_estimator.*pp` files should be done sparingly.
The loop at line 212 in this file (under `ExtractJoinRelations`), for example, adds some key/value pairs to a map in `CardinalityEstimator`. In those cases, the whole loop should probably be moved to the implementation file `cardinality_estimator.cpp`, as this has nothing to do with dynamic programming.
If it's some kind of initialisation like in `ExtractJoinRelations`, we prefer having a single call to some function in `CardinalityEstimator`, rather than having a few lines here and there that do cardinality-estimation-related work inside the functions that belong to our DP implementation.
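A sketch of what that could look like, following the `AddColumnToRelation` suggestion above (the method name and signature are placeholders, not a final API):

```cpp
// In cardinality_estimator.cpp; sketch only.
void CardinalityEstimator::AddColumnToRelation(idx_t relation_id, idx_t column_index) {
	relation_to_columns[relation_id].insert(column_index);
}

// The DP code in ExtractBindings then reduces to a single, readable call:
cardinality_estimator.AddColumnToRelation(relation_mapping[colref.binding.table_index],
                                          colref.binding.column_index);
```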
```cpp
vector<column_binding_set_t> equivalent_relations;
vector<idx_t> equivalent_relations_tdom_no_hll;
vector<idx_t> equivalent_relations_tdom_hll;
unordered_map<idx_t, std::string> relation_to_table_name;
```
Can we merge `relation_to_columns` and `relation_to_table_name`?
For example:

```cpp
struct MyTable { // not sure about name ...
	string name;
	unordered_set<idx_t> column_ids;
};

// now we can have one map instead of two
unordered_map<idx_t, MyTable> relation_to_table;
```

This makes it easier to add things later.
```cpp
sort(neighbors.begin(), neighbors.end());
for (auto neighbor : neighbors) {

//! Neighbors should be reversed when iterating over them.
```
clean this up
```cpp
	return make_unique<JoinNode>(set, info, left, right, expected_cardinality, cost);

// normal join, except foreign key join
JoinRelationSet *left_join_relations = left->set;
```
push this into cardinality estimator
```cpp
@@ -154,7 +198,15 @@ bool JoinOrderOptimizer::ExtractJoinRelations(LogicalOperator &input_op, vector<
	// base table scan, add to set of relations
	auto get = (LogicalGet *)op;
	auto relation = make_unique<SingleJoinRelation>(&input_op, parent);
	relation_mapping[get->table_index] = relations.size();
	idx_t relation_id = relations.size();
```
push into cardinality estimator
```cpp
// when the amount of pairs gets too large we exit the dynamic programming and resort to a greedy algorithm
// FIXME: simple heuristic currently
// at 10K pairs stop searching exactly and switch to heuristic
// at 20K pairs stop searching exactly and switch to heuristic
```
change this comment back
```cpp
@@ -371,6 +478,79 @@ bool JoinOrderOptimizer::SolveJoinOrderExactly() {
	return true;
}

vector<unordered_set<idx_t>> JoinOrderOptimizer::AddGreaterSets(vector<unordered_set<idx_t>> current,
```
rename to something with supersets
```cpp
	return ret;
}

vector<unordered_set<idx_t>> JoinOrderOptimizer::GetAllNeighborSets(JoinRelationSet *new_set,
```
write a debug verification method
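For instance, a debug-only check along these lines (entirely a sketch; which invariants to verify is up for discussion) would document what `GetAllNeighborSets` is supposed to guarantee:

```cpp
#ifdef DEBUG
// Sketch: verify that the generated neighbor sets are non-empty and pairwise distinct.
static void VerifyNeighborSets(const vector<unordered_set<idx_t>> &neighbor_sets) {
	for (idx_t i = 0; i < neighbor_sets.size(); i++) {
		D_ASSERT(!neighbor_sets[i].empty());
		for (idx_t j = i + 1; j < neighbor_sets.size(); j++) {
			D_ASSERT(neighbor_sets[i] != neighbor_sets[j]);
		}
	}
}
#endif
```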
```cpp
auto node = EmitPair(left, right, connection);

// update the DP tree in case a future best connection uses an entry from the DP table
```
re-word this one more time
Also add a `D_ASSERT()` in the `GenerateJoins` call.
Just left some more nitpicky comments, nothing major. Looking better every day!
```cpp
EstimatedProperties() : cardinality(0), cost(0) {};
unique_ptr<EstimatedProperties> Copy();

private:
```
Please separate methods and fields, and remove the unused `private:` block.
```cpp
	estimated_props = make_unique<EstimatedProperties>(cardinality, cost);
}

static idx_t hash_table_col(idx_t table, idx_t col);
```
I think this method can now be removed, because you are using the `column_binding_{map,set}_t` stuff?
Either way, this function and `check_all_table_keys_forwarded()` should not be in snake_case.
```cpp
}
//! initialize equivalent relation tdom vector to have the same size as the
//! equivalent relations.
for (it = equivalent_relations.begin(); it != equivalent_relations.end(); it++) {
```
Could you do this with a smart loop instead?

```cpp
for (auto &equivalent_rel : equivalent_relations) {
	// etc.
}
```
```cpp
void CardinalityEstimator::InitTotalDomains() {
	vector<column_binding_set_t>::iterator it;
	//! erase empty equivalent relation sets
	for (it = equivalent_relations.begin(); it != equivalent_relations.end(); it++) {
```
Could you do something like this here?

```cpp
//! erase empty equivalent relation sets
//! (note: remove_if alone only shuffles elements to the back,
//! so it needs the surrounding erase to actually shrink the vector)
equivalent_relations.erase(std::remove_if(equivalent_relations.begin(), equivalent_relations.end(),
                                          [](column_binding_set_t &equivalent_rel) {
	                                          return equivalent_rel.empty();
                                          }),
                           equivalent_relations.end());
```

The `::iterator` stuff always seems so verbose to me.
```cpp
//! eri = equivalent relation index
bool added_to_eri;

vector<unordered_set<idx_t>>::iterator it;
```
I think this is unused
```cpp
}

//! Initialize the tdoms for all columns the relation uses in join conditions.
unordered_set<idx_t>::iterator ite;
```
I think this is unused
```cpp
	NodeOp(unique_ptr<JoinNode> node, LogicalOperator *op) : node(move(node)), op(op) {};
};

static constexpr double DEFAULT_SELECTIVITY = 0.2;
```
Can we make `DEFAULT_SELECTIVITY` a private field of `CardinalityEstimator`?
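i.e., something like this sketch:

```cpp
class CardinalityEstimator {
	// ... methods ...
private:
	// moved from namespace scope into the class
	static constexpr double DEFAULT_SELECTIVITY = 0.2;
};
```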
Just some very small nitpicks, and then I think we're ready to go!
src/optimizer/join_node.cpp (Outdated)
```cpp
@@ -0,0 +1,62 @@
//
// Created by Tom Ebergen on 5/16/22.
```
Please remove the "Created by" header
```
@@ -0,0 +1,415 @@
# name: test/optimizer/join_parquet_with_base.test
```
Great test.
You don't need to explicitly write the TPC-H queries; we have PRAGMAs for that. Take a look at `test/sql/tpch/tpch_sf01.test_slow`.
We have also stored all the answers to the TPC-H queries, so you don't need to check the hash either; you can just compare the output with a file.
Finally, please also change the extension of this file to `test_slow`, as this will be a slow test due to dbgen.
```cpp
	// The total domain was initialized to 0 (happens in a few test cases)
	return 1;
}
throw Exception("Could not get total domain of a join relations. Most likely a bug in InitTdoms");
```
We use `InternalException` for things that should not happen.
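i.e., the throw above would become something like:

```cpp
// InternalException signals a condition that indicates a bug rather than a user error
throw InternalException("Could not get total domain of a join relations. Most likely a bug in InitTdoms");
```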
- basically completed the new join order optimizer. Need to inspect a few tpcds queries though
- added imdb_parquet benchmark
- Adding multiplicities branch
- add benchmark to query direct from query files
- clean up code that is unnecessary
- fix bugs in join ordering code
- code clean up. This commit has code that changes how analyze structures are printed
- small changes but nothing major
- fix tpcds query error, but now there is one that times out
- tpc-ds passes now
- format fix
- final fixes
- fix tpcds benchmark again
- fixed hopefully the last issues with tpcds and tpch and imdb
- lawrence comments
- format fix
- forgot a D_ASSERT
- make sure it can build
- mult and sel looks good. Still needs cleanup and variable renaming. Going to add mult and sel on column levels now
- first commit, have some ideas
- now we are tracking at the column level, very nice
- mults and sels are now tracked on a column level. stable for tpch, imdb not as much, going to do some memory switching
- no more memory errors
- getting there but we'll see. works if you init the left and right every time
- it works on my machine, but I think there is something wrong with my hardware
- things work, going to work on a big refactor now
- refactor is working so far, but results aren't any better
- remove some comments
- format fix
- some more smaller commits to clean up code
- ok looks good, computing cardinality sums should work now
- fix cmake stuff
- comments and remove old code
- this is going in the right direction
- you can now see some of the join stats when you print the querygraph
- should be able to see sels and muls in graph, but for some reason not yet
- can now see selectivities and multiplicities on the query graph
- some refactoring and cleaning, adding more metadata to query graph should be straightforward now
- remove unnecessary header file
- add cost to JoinStats
- refactor looks much nicer now
- more refactor
- updated benchmark
- cardinality estimation is a little bit better, still seg faulting on tpch query 5
- code to later add HLL
- last commit before attemptin to implement new idea made with laurens
- it works now. going to attempt big rebase
- tracking uniqur values, debugging
- works for the most part, except for a DP bug in q08a for the JOB tests
- checkpoint, less segfaults, some laurens code as well
- when applying a filter set a min resulting cardinality of at least 1
- fix rebase issues
- properly update DP table
- stopping coding to prove symmetry
- ok looks good
- ok here we have some really good results with very approximate joining. DP seems to still have an issue
- we are defs on the right track here. Just implemented a way to check if a table filter is an equality comparison
- good version going to do a clean up commit next
- did some clean up and it builds
- refactor, add or filter estimation logic
- fixed non-reoderable joins bug
- some more refactoring
- some more refactoring
- fix all benchmark files
- fix benchmark files again
- more refactoring
- more refactoring
- more refactoring and commentts
- format fix refactor
- remove two lines
- big refactor. putting cardinality estimation logic in its own class
- refactor looks good now. compiles and runs. Just need to fix laurens comments
- using column_binding_t now, new copy function for estimated properties, unnesting some code
- more refactoring
- format-fix commit
- removing dead code and other not necessary things
- fixed a couple of issues regarding cartesian joins and the approximate join order solver. Still need to handle the case where there is more than one filter
- allunit tests should pass now
- format-fix
- fix the std issue for finding the max element
- fix more tests
- format fix changes
- debug and unittests pass
- format-fix changes
- more fixes and comments from lawrence
- styling changes and also changes to make tpcds pass when using parquet
- forgot to do a format fix
- remove semi colon not found by format fix
- try to get tests passing again, going to merge with master soon
- fix rebase conflicts
- format fix
- more refactoring, need to look at some results because we regressed, but don't really want to do that now
- make debug pass
- should fix debug now
- try to get more tests to pass. run analyze statement tests have better results as well
- fix int/double error breaking some tests
- everything looks good
- fix divide by 0
- fix bug
- make allunit passes now, still need to test make unittestci
- add a parquet vs base table test. A test in select4.test_slow is still failing :(
- fix last bug
- fix compiling bugs
- fix debug failing tests
- ok, fixed bug where not all query edges are considered
- fix issue where not all edges are considered, but results are worse
- fix regressions that took me 3 hours to find
- fix small bug involving idx_t to double casting, and fix executing queries on just parquet
- fix the make tidy errors (hopefully)
- remove not important files
- some clang tidy stuff, but also some stuff for phase timings, but for some reason they aren't reported in python
- deleted stuff for phase timings
- tidy and format fixes
- last comments
- fix last test case

Co-authored-by: Laurens Kuiper <laurens.kuiper@cwi.nl>
I was investigating the following crash where a checkpoint task had its underlying resources being destroyed while it was still running. The two interesting threads are the following: ``` thread #1, name = 'duckling', stop reason = signal SIGTRAP frame #0: 0x0000ffff91bb71ec frame #1: 0x0000aaaad73a38e8 duckling`duckdb::InternalException::InternalException(this=<unavailable>, msg=<unavailable>) at exception.cpp:336:2 frame #2: 0x0000aaaad786eb68 duckling`duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::operator*() const [inlined] duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::AssertNotNull(null=<unavailable>) at unique_ptr.hpp:25:10 frame #3: 0x0000aaaad786eaf4 duckling`duckdb::unique_ptr<duckdb::RowGroup, std::default_delete<duckdb::RowGroup>, true>::operator*(this=0x0000aaacbb73e008) const at unique_ptr.hpp:34:4 frame #4: 0x0000aaaad787abbc duckling`duckdb::CheckpointTask::ExecuteTask(this=0x0000aaabec92be60) at row_group_collection.cpp:732:21 frame #5: 0x0000aaaad7756ea4 duckling`duckdb::BaseExecutorTask::Execute(this=0x0000aaabec92be60, mode=<unavailable>) at task_executor.cpp:72:3 frame #6: 0x0000aaaad7757e28 duckling`duckdb::TaskScheduler::ExecuteForever(this=0x0000aaaafda30e10, marker=0x0000aaaafda164a8) at task_scheduler.cpp:189:32 frame #7: 0x0000ffff91a031fc frame #8: 0x0000ffff91c0d5c8 thread #120, stop reason = signal 0 frame #0: 0x0000ffff91c71c24 frame #1: 0x0000ffff91e1264c frame #2: 0x0000ffff91e01888 frame #3: 0x0000ffff91e018f8 frame #4: 0x0000ffff91e01c10 frame #5: 0x0000ffff91e05afc frame #6: 0x0000ffff91e05e70 frame #7: 0x0000aaaad784b63c duckling`duckdb::RowGroup::~RowGroup() [inlined] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=<unavailable>) at shared_ptr_base.h:184:10 frame #8: 0x0000aaaad784b5b4 duckling`duckdb::RowGroup::~RowGroup() [inlined] std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=<unavailable>) at shared_ptr_base.h:705:11 frame #9: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] std::__shared_ptr<duckdb::ColumnData, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=<unavailable>) at shared_ptr_base.h:1154:31 frame #10: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] duckdb::shared_ptr<duckdb::ColumnData, true>::~shared_ptr(this=<unavailable>) at shared_ptr_ipp.hpp:115:24 frame #11: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>>(__pointer=<unavailable>) at stl_construct.h:151:19 frame #12: 0x0000aaaad784b5ac duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy_aux<false>::__destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*>(__first=<unavailable>, __last=<unavailable>) at stl_construct.h:163:6 frame #13: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*>(__first=<unavailable>, __last=<unavailable>) at stl_construct.h:195:7 frame #14: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] void std::_Destroy<duckdb::shared_ptr<duckdb::ColumnData, true>*, duckdb::shared_ptr<duckdb::ColumnData, true>>(__first=<unavailable>, __last=<unavailable>, (null)=<unavailable>) at alloc_traits.h:848:7 frame #15: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup() [inlined] std::vector<duckdb::shared_ptr<duckdb::ColumnData, true>, std::allocator<duckdb::shared_ptr<duckdb::ColumnData, 
true>>>::~vector(this=<unavailable>) at stl_vector.h:680:2 frame #16: 0x0000aaaad784b5a0 duckling`duckdb::RowGroup::~RowGroup(this=<unavailable>) at row_group.cpp:83:1 frame #17: 0x0000aaaad786ee18 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] std::default_delete<duckdb::RowGroup>::operator()(this=0x0000aaacbb73e1a8, __ptr=0x0000aaab75ae7860) const at unique_ptr.h:85:2 frame #18: 0x0000aaaad786ee10 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() at unique_ptr.h:361:4 frame #19: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] duckdb::SegmentNode<duckdb::RowGroup>::~SegmentNode(this=0x0000aaacbb73e1a0) at segment_tree.hpp:21:8 frame #20: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>>(__pointer=0x0000aaacbb73e1a0) at stl_construct.h:151:19 frame #21: 0x0000aaaad786ee08 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy_aux<false>::__destroy<duckdb::SegmentNode<duckdb::RowGroup>*>(__first=0x0000aaacbb73e1a0, __last=0x0000aaacbb751130) at stl_construct.h:163:6 frame #22: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>*>(__first=<unavailable>, __last=0x0000aaacbb751130) at stl_construct.h:195:7 frame #23: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector() [inlined] void std::_Destroy<duckdb::SegmentNode<duckdb::RowGroup>*, duckdb::SegmentNode<duckdb::RowGroup>>(__first=<unavailable>, __last=0x0000aaacbb751130, (null)=0x0000fffefc81a908) at alloc_traits.h:848:7 frame #24: 0x0000aaaad786ede8 duckling`std::vector<duckdb::SegmentNode<duckdb::RowGroup>, std::allocator<duckdb::SegmentNode<duckdb::RowGroup>>>::~vector(this=size=4883) at stl_vector.h:680:2 frame #25: 0x0000aaaad7857f74 duckling`duckdb::RowGroupCollection::Checkpoint(this=<unavailable>, writer=<unavailable>, global_stats=0x0000fffefc81a9c0) at row_group_collection.cpp:1017:1 frame #26: 0x0000aaaad788f02c duckling`duckdb::DataTable::Checkpoint(this=0x0000aaab04649e70, writer=0x0000aaab6584ea80, serializer=0x0000fffefc81ab38) at data_table.cpp:1427:14 frame #27: 0x0000aaaad7881394 duckling`duckdb::SingleFileCheckpointWriter::WriteTable(this=0x0000fffefc81b128, table=0x0000aaab023b78c0, serializer=0x0000fffefc81ab38) at checkpoint_manager.cpp:528:11 frame #28: 0x0000aaaad787ece4 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] duckdb::SingleFileCheckpointWriter::CreateCheckpoint(this=<unavailable>, obj=0x0000fffefc81ab38)::$_7::operator()(duckdb::Serializer::List&, unsigned long) const::'lambda'(duckdb::Serializer&)::operator()(duckdb::Serializer&) const at checkpoint_manager.cpp:181:43 frame #29: 0x0000aaaad787ecd8 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] void duckdb::Serializer::List::WriteObject<duckdb::SingleFileCheckpointWriter::CreateCheckpoint()::$_7::operator()(duckdb::Serializer::List&, 
unsigned long) const::'lambda'(duckdb::Serializer&)>(this=<unavailable>, f=(unnamed class) @ 0x0000600002cbd2b0) at serializer.hpp:385:2 frame #30: 0x0000aaaad787ecc4 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() [inlined] duckdb::SingleFileCheckpointWriter::CreateCheckpoint()::$_7::operator()(this=<unavailable>, list=<unavailable>, i=2) const at checkpoint_manager.cpp:181:8 frame #31: 0x0000aaaad787ecb0 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint() at serializer.hpp:151:4 frame #32: 0x0000aaaad787ec94 duckling`duckdb::SingleFileCheckpointWriter::CreateCheckpoint(this=0x0000fffefc81b128) at checkpoint_manager.cpp:179:13 frame #33: 0x0000aaaad78954a8 duckling`duckdb::SingleFileStorageManager::CreateCheckpoint(this=0x0000aaaafe1de140, options=(wal_action = DONT_DELETE_WAL, action = CHECKPOINT_IF_REQUIRED, type = FULL_CHECKPOINT)) at storage_manager.cpp:365:17 frame #34: 0x0000aaaad78baac0 duckling`duckdb::DuckTransactionManager::Checkpoint(this=0x0000aaaafe167e00, context=<unavailable>, force=<unavailable>) at duck_transaction_manager.cpp:198:18 frame #35: 0x0000aaaad69d02c0 duckling`md::Server::BackgroundCheckpointIfNeeded(this=0x0000aaaafdbfe900) at server.cpp:1983:31 frame #36: 0x0000aaaadac5d3f0 duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::function<void ()>::operator()(this=<unavailable>) const at std_function.h:590:9 frame #37: 0x0000aaaadac5d3e0 duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] md::BackgroundCronTask::Start(unsigned long)::$_0::operator()(this=0x0000aaaafdf169a8) const at background_cron_task.cpp:25:4 frame #38: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] void std::__invoke_impl<void, md::BackgroundCronTask::Start(unsigned long)::$_0>((null)=<unavailable>, __f=0x0000aaaafdf169a8) at invoke.h:61:14 frame #39: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::__invoke_result<md::BackgroundCronTask::Start(unsigned long)::$_0>::type std::__invoke<md::BackgroundCronTask::Start(unsigned long)::$_0>(__fn=0x0000aaaafdf169a8) at invoke.h:96:14 frame #40: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] void std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>::_M_invoke<0ul>(this=0x0000aaaafdf169a8, (null)=<unavailable>) at std_thread.h:259:13 frame #41: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run() [inlined] std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>::operator()(this=0x0000aaaafdf169a8) at std_thread.h:266:11 frame #42: 0x0000aaaadac5d30c duckling`std::thread::_State_impl<std::thread::_Invoker<std::tuple<md::BackgroundCronTask::Start(unsigned long)::$_0>>>::_M_run(this=0x0000aaaafdf169a0) at std_thread.h:211:13 frame #43: 0x0000ffff91a031fc frame #44: 0x0000ffff91c0d5c8 ``` The problem is that if there's an IO exception being thrown in `RowGroupCollection::Checkpoint` after some (but not all) checkpoint tasks have been scheduled but before 
`checkpoint_state.executor.WorkOnTasks();` is called, it results in an InternalException / DuckDB crash as the `Checkpoint ` method does not wait for the scheduled tasks to have completed before destroying the referenced resources.
Fixes duckdblabs/duckdb-internal#3922 The failing query ```SQL SET order_by_non_integer_literal=true; SELECT DISTINCT ON ( 'string' ) 'string', GROUP BY CUBE ( 'string', ), 'string' IN ( SELECT 'string' ), HAVING 'string' IN ( SELECT 'string'); ``` The Plan generated before optimization is below. During optimization there is an attempt to convert the mark join into a semi. Before this conversion takes place, we usually check to make sure the mark join is not used in any operators above the mark join to prevent plan verification errors. Up until this point, only logical projections were checked for mark joins. Turns out this query is planned in such a way that the mark join is in one of the expressions of the aggregate operator. This was not checked, so the mark to semi conversion would take place. The fix is to modify the filter pushdown optimization so that it stores table indexes from logical aggregate operators. ``` ┌───────────────────────────┐ │ PROJECTION #1 │ │ ──────────────────── │ │ Expressions: #[2.0] │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ FILTER │ │ ──────────────────── │ │ Expressions: #[2.1] │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ AGGREGATE #2, #3, #4 │ │ ──────────────────── │ │ Groups: │ │ 'string' │ │ #[14.0] │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ COMPARISON_JOIN │ │ ──────────────────── │ │ Join Type: MARK │ │ ├──────────────┐ │ Conditions: │ │ │ ('string' = #[8.0]) │ │ └─────────────┬─────────────┘ │ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ DUMMY_SCAN #0 ││ PROJECTION #8 │ │ ──────────────────── ││ ──────────────────── │ │ ││ Expressions: 'string' │ └───────────────────────────┘└─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ DUMMY_SCAN #7 │ │ ──────────────────── │ └───────────────────────────┘ ```
We had two users crash with the following backtrace:

```
frame #0: 0x0000ffffab2571ec
frame #1: 0x0000aaaaac00c5fc duckling`duckdb::InternalException::InternalException(this=<unavailable>, msg=<unavailable>) at exception.cpp:328:2
frame #2: 0x0000aaaaac1ee418 duckling`duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::CheckValid(this=<unavailable>) const at optional_ptr.hpp:34:11
frame #3: 0x0000aaaaac1eea8c duckling`duckdb::MergeCollectionTask::Execute(duckdb::PhysicalBatchInsert const&, duckdb::ClientContext&, duckdb::GlobalSinkState&, duckdb::LocalSinkState&) [inlined] duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::operator*(this=<unavailable>) at optional_ptr.hpp:43:3
frame #4: 0x0000aaaaac1eea84 duckling`duckdb::MergeCollectionTask::Execute(this=0x0000aaaaf1b06150, op=<unavailable>, context=0x0000aaaba820d8d0, gstate_p=0x0000aaab06880f00, lstate_p=<unavailable>) at physical_batch_insert.cpp:219:90
frame #5: 0x0000aaaaac1d2e10 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTask(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:425:8
frame #6: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTasks(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:431:9
frame #7: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(this=0x0000aaaafa62ab40, context=0x0000aab2fffd7cb0, chunk=<unavailable>, input=<unavailable>) const at physical_batch_insert.cpp:494:4
frame #8: 0x0000aaaaac353158 duckling`duckdb::PipelineExecutor::ExecutePushInternal(duckdb::DataChunk&, duckdb::ExecutionBudget&, unsigned long) [inlined] duckdb::PipelineExecutor::Sink(this=0x0000aab2fffd7c00, chunk=0x0000aab2fffd7d30, input=0x0000fffec0aba8d8) at pipeline_executor.cpp:521:24
frame #9: 0x0000aaaaac353130 duckling`duckdb::PipelineExecutor::ExecutePushInternal(this=0x0000aab2fffd7c00, input=0x0000aab2fffd7d30, chunk_budget=0x0000fffec0aba980, initial_idx=0) at pipeline_executor.cpp:332:23
frame #10: 0x0000aaaaac34f7b4 duckling`duckdb::PipelineExecutor::Execute(this=0x0000aab2fffd7c00, max_chunks=<unavailable>) at pipeline_executor.cpp:201:13
frame #11: 0x0000aaaaac34f258 duckling`duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) [inlined] duckdb::PipelineExecutor::Execute(this=<unavailable>) at pipeline_executor.cpp:278:9
frame #12: 0x0000aaaaac34f250 duckling`duckdb::PipelineTask::ExecuteTask(this=0x0000aab16dafd630, mode=<unavailable>) at pipeline.cpp:51:33
frame #13: 0x0000aaaaac348298 duckling`duckdb::ExecutorTask::Execute(this=0x0000aab16dafd630, mode=<unavailable>) at executor_task.cpp:49:11
frame #14: 0x0000aaaaac356600 duckling`duckdb::TaskScheduler::ExecuteForever(this=0x0000aaaaf0105560, marker=0x0000aaaaf00ee578) at task_scheduler.cpp:189:32
frame #15: 0x0000ffffab0a31fc
frame #16: 0x0000ffffab2ad5c8
```

Core dump analysis showed that the assertion `D_ASSERT(lstate.writer);` in `MergeCollectionTask::Execute` was not satisfied (i.e., it crashed because `lstate.writer` was `nullptr`) when `PhysicalBatchInsert::Sink` was processing merge tasks from (other) pipeline executors.

My suspicion is that this only happens under heavily concurrent workloads (which applies to both affected users). The patch submitted as part of this PR has resolved the issue for these users.
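For readers unfamiliar with `duckdb::optional_ptr`, here is a minimal sketch of the failure mode and the defensive shape of a fix. All types below (`OptionalPtr`, `OptimisticDataWriter`, `LocalSinkState`, `ExecuteMergeTask`) are illustrative stand-ins, not the actual DuckDB declarations, and the real patch decides how a valid writer is obtained.

```cpp
#include <stdexcept>

// Minimal stand-in for duckdb::optional_ptr: unlike a raw pointer,
// dereferencing it while unset raises an internal error (the
// InternalException in frame #1 above) instead of corrupting memory.
template <class T>
class OptionalPtr {
public:
	OptionalPtr() : ptr(nullptr) {
	}
	explicit OptionalPtr(T *ptr_p) : ptr(ptr_p) {
	}
	void CheckValid() const {
		if (!ptr) {
			throw std::runtime_error("Attempting to dereference an optional pointer that is not set");
		}
	}
	T &operator*() const {
		CheckValid(); // the check that fired in the reported backtrace
		return *ptr;
	}
	explicit operator bool() const {
		return ptr != nullptr;
	}

private:
	T *ptr;
};

struct OptimisticDataWriter {};

// Hypothetical local state mirroring the crash site: the merge task assumed
// the writer was always set, which does not hold when tasks created by one
// pipeline executor are drained by another.
struct LocalSinkState {
	OptionalPtr<OptimisticDataWriter> writer;
};

void ExecuteMergeTask(LocalSinkState &lstate) {
	// Defensive shape of the fix: verify the writer exists before use
	// instead of asserting that it does.
	if (!lstate.writer) {
		return;
	}
	OptimisticDataWriter &writer = *lstate.writer;
	(void)writer; // ... merge the collection using the writer ...
}
```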
I have seen this crash due to an invalid pointer on which a destructor is called, on last night's `main` (`2ed9bf887f`), using the unittester compiled from source (clang 17) and extensions installed from the default extension repository. Basically:

```
DYLD_INSERT_LIBRARIES=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/17/lib/darwin/libclang_rt.asan_osx_dynamic.dylib \
LOCAL_EXTENSION_REPO=http://extensions.duckdb.org \
./build/release/test/unittest --autoloading all --skip-compiled --order rand test/parquet/test_parquet_schema.test
```

and seeing runtime sanitizer assertions such as

```
==56046==ERROR: AddressSanitizer: container-overflow on address 0x6100000d4dcf at pc 0x000116c7f450 bp 0x00016fc1d170 sp 0x00016fc1d168
READ of size 1 at 0x6100000d4dcf thread T0
    #0 0x000116c7f44c in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>* std::__1::__uninitialized_allocator_copy_impl[abi:ne190102]<std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+0x318 (parquet.duckdb_extension:arm64+0xab44c)
    #1 0x000116c7ec90 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__construct_at_end<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, unsigned long)+0x160 (parquet.duckdb_extension:arm64+0xaac90)
    #2 0x000116c7e7d8 in void std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__assign_with_size[abi:ne190102]<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, long)+0x1e0 (parquet.duckdb_extension:arm64+0xaa7d8)
    #3 0x000116e8cd48 in duckdb::ParquetMultiFileInfo::BindReader(duckdb::ClientContext&, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileBindData&)+0xf18 (parquet.duckdb_extension:arm64+0x2b8d48)
    #4 0x000116e6e5fc in duckdb::MultiFileFunction<duckdb::ParquetMultiFileInfo>::MultiFileBindInternal(duckdb::ClientContext&, duckdb::unique_ptr<duckdb::MultiFileReader, std::__1::default_delete<duckdb::MultiFileReader>, true>, duckdb::shared_ptr<duckdb::MultiFileList, true>, duckdb::vector<duckdb::LogicalType, true>&, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>&, duckdb::MultiFileOptions, duckdb::unique_ptr<duckdb::BaseFileReaderOptions, std::__1::default_delete<duckdb::BaseFileReaderOptions>, true>, duckdb::unique_ptr<duckdb::MultiFileReaderInterface, std::__1::default_delete<duckdb::MultiFileReaderInterface>, true>)+0x1210 (parquet.duckdb_extension:arm64+0x29a5fc)
```

or these failures while using ducklake:

```
==56079==ERROR: AddressSanitizer: container-overflow on address 0x616000091a78 at pc 0x0001323fc81c bp 0x00016bd0e890 sp 0x00016bd0e888
READ of size 8 at 0x616000091a78 thread T2049
    #0 0x0001323fc818 in duckdb::MultiFileColumnDefinition::~MultiFileColumnDefinition()+0x258 (parquet.duckdb_extension:arm64+0x2a4818)
    #1 0x0001323fb588 in std::__1::vector<duckdb::MultiFileColumnDefinition, std::__1::allocator<duckdb::MultiFileColumnDefinition>>::__destroy_vector::operator()[abi:ne190102]()+0x98 (parquet.duckdb_extension:arm64+0x2a3588)
    #2 0x0001324a09e4 in duckdb::BaseFileReader::~BaseFileReader()+0x2bc (parquet.duckdb_extension:arm64+0x3489e4)
    #3 0x0001324a23ec in duckdb::ParquetReader::~ParquetReader()+0x22c (parquet.duckdb_extension:arm64+0x34a3ec)
```

With these changes, once the `parquet` extension is built by CI, this works as expected. I am not sure if the fix could or should be elsewhere.
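As background on the diagnostic itself (this says nothing about where DuckDB's actual bug lives), here is a minimal standalone reproducer of an ASan container-overflow report, assuming clang with libc++'s container annotations. Mixing instrumented and uninstrumented code that manipulates the same container can produce the same report spuriously, because only one side updates the poison annotations.

```cpp
// Build with: clang++ -std=c++17 -stdlib=libc++ -fsanitize=address overflow.cpp
#include <cstdio>
#include <vector>

int main() {
	std::vector<int> v;
	v.reserve(8);   // capacity 8; ASan poisons the unused capacity
	v.push_back(1); // size 1; only element 0 is unpoisoned
	// Reading past size() but within capacity() is exactly what ASan
	// reports as "container-overflow".
	std::printf("%d\n", v.data()[1]); // container-overflow read
	return 0;
}
```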
This PR is the meat of the new Join Ordering Algorithm. Important functions to look at are `InitTdoms` and `ExtractJoinRelations`.
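As a rough intuition for what a function named `InitTdoms` plausibly computes, here is a hedged sketch of total domains ("tdoms") of equality-connected column sets and how they can feed an order-independent cardinality estimate. All types and names below (`EquivalenceSet`, `InitTdomsSketch`, `EstimateJoinCardinality`) are illustrative guesses at the technique, not DuckDB's actual code.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using idx_t = uint64_t;

// Columns connected by equality predicates form one equivalence set,
// e.g. {t1.a, t2.b, t3.c} for t1.a = t2.b AND t2.b = t3.c.
struct EquivalenceSet {
	std::vector<idx_t> distinct_counts; // distinct-count estimate per member column
	idx_t tdom = 1;                     // total domain of the set
};

// A reasonable tdom for an equivalence set is the largest distinct count
// among its columns: every join key comes from that shared domain.
static void InitTdomsSketch(std::vector<EquivalenceSet> &sets) {
	for (auto &set : sets) {
		if (!set.distinct_counts.empty()) {
			set.tdom = *std::max_element(set.distinct_counts.begin(), set.distinct_counts.end());
		}
	}
}

// Estimating the joined cardinality in one go: multiply all base
// cardinalities and divide once per connected equivalence set. Because the
// formula never looks at a join order, it yields the same estimate no
// matter in which order the tables are joined.
static double EstimateJoinCardinality(const std::vector<idx_t> &base_cardinalities,
                                      const std::vector<EquivalenceSet> &connected_sets) {
	double estimate = 1;
	for (auto card : base_cardinalities) {
		estimate *= static_cast<double>(card);
	}
	for (auto &set : connected_sets) {
		estimate /= static_cast<double>(std::max<idx_t>(set.tdom, 1));
	}
	return estimate;
}
```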