Skip to content

Conversation

HowHsu
Copy link

@HowHsu HowHsu commented Jul 25, 2025

In PackageTRUCChecks(), we calculate the in-package-parents for an in package tx, which results in the total time complexisity being O(n^2), n is the number of Txs in the package. Let's precompute the overall relationships between all transactions to make it be O(n).

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 25, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33062.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

No conflicts as of last run.

@fanquake fanquake requested a review from glozow July 25, 2025 12:22
/** Helper for PackageTRUCChecks: Returns a vector containing the indices of transactions (within
* package) that are direct parents of ptx. */
std::vector<size_t> FindInPackageParents(const Package& package, const CTransactionRef& ptx)
std::vector<std::vector<size_t>> BuildInPackageRelations(const Package& package)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider using DepGraph instead of a matrix of indices (it's basically the same thing but more compact)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider using DepGraph instead of a matrix of indices (it's basically the same thing but more compact)?

Sure, I'll look into it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glozow Hi glozow, seems DepGraph stores all ancestors of a tx, while here:

if (mempool_ancestors.size() + in_package_parents.size() + 1 > TRUC_ANCESTOR_LIMIT)

shows PackageTRUCChecks() want the parent (the direct ancestor).
Using DepGraph changes the semantics of this function. Any hints about the above code: Do we actually want mempool_ancestors.size() + in_package_parents.size() or mempool_ancestors.size + in_package_ancestors.size() here ?
Using the Bitset directly is an alternative way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw I don't think this needs to be direct parents, it's just implemented this way because the limit is 2.

As a more general note - thanks for working on this and I don't want to be discouraging, but I'm not sure that a marginal improvement to this small function is worth this much effort. I see you have some other PRs that might be more impactful, so maybe focus on those? Maybe we can come back to this in the future if it helps with other things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw I don't think this needs to be direct parents, it's just implemented this way because the limit is 2.

As a more general note - thanks for working on this and I don't want to be discouraging, but I'm not sure that a marginal improvement to this small function is worth this much effort. I see you have some other PRs that might be more impactful, so maybe focus on those? Maybe we can come back to this in the future if it helps with other things.

Sure, thanks.

In PackageTRUCChecks(), we calculate the in-package-parents for an
in package tx, which results in the total time complexisity being
O(n^2), n is the number of Txs in the package. Let's precompute the
overall relationships between all transactions to make it be O(n).

Signed-off-by: Hao Xu <hao.xu@linux.dev>
dr8931240-ux

This comment was marked as spam.

dr8931240-ux

This comment was marked as spam.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this isn't a bottleneck, optimizing it without a benchmark or an easily reproducible usecase seems like it's a solution begging for a problem - I speak from personal experience, that's how I started as well :)

I left a few comments while reviewing anyway, but without a concept ack on the idea itself from those who are more familiar with it, it likely won't get merged.

if (&(*tx) == &(*ptx)) break;
if (possible_parents.count(tx->GetHash())) {
in_package_parents.push_back(i);
pindexes.clear();
Copy link
Contributor

@l0rinc l0rinc Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason for reusing the set here? This will just create a dependency between the loops. If you have some benchmarks, it would be good to double-check, but this might be simpler (and maybe even faster):

Suggested change
pindexes.clear();
std::unordered_set<size_t> pindexes;
pindexes.reserve(tx->vin.size()); // TODO or whatever size you expect

}
std::copy(pindexes.begin(), pindexes.end(), std::back_inserter(relations[i]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could probably use the newer version here:

Suggested change
std::copy(pindexes.begin(), pindexes.end(), std::back_inserter(relations[i]));
std::ranges::copy(pindexes, std::back_inserter(relations[i]));

I haven't checked the usages, but why are we storing these in a vector in the first place if they have a uniqueness constraint?

Comment on lines +28 to +29
if (possible_parents.count(input.prevout.hash))
pindexes.insert(possible_parents[input.prevout.hash]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++20 allows ::contains (+ nit: braces):

Suggested change
if (possible_parents.count(input.prevout.hash))
pindexes.insert(possible_parents[input.prevout.hash]);
if (possible_parents.contains(input.prevout.hash)) {
pindexes.insert(possible_parents[input.prevout.hash]);
}

Or even better, to avoid double hashing (via existence check first, and getting the value, second), we might as well:

Suggested change
if (possible_parents.count(input.prevout.hash))
pindexes.insert(possible_parents[input.prevout.hash]);
if (const auto it{possible_parents.find(input.prevout.hash)}; it != possible_parents.end()) {
pindexes.insert(it->second);
}


/** Helper for PackageTRUCChecks: Returns a vector containing the indices of transactions (within
* package) that are direct parents of ptx. */
std::vector<size_t> FindInPackageParents(const size_t tx_index, const auto& relations)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't usually auto parameters and not sure why we're copying the result:

Suggested change
std::vector<size_t> FindInPackageParents(const size_t tx_index, const auto& relations)
inline const std::vector<size_t>& FindInPackageParents(size_t tx_index, const std::vector<std::vector<size_t>>& relations)
{
return relations[tx_index];
}

@HowHsu
Copy link
Author

HowHsu commented Aug 5, 2025

My understanding is that this isn't a bottleneck, optimizing it without a benchmark or an easily reproducible usecase seems like it's a solution begging for a problem - I speak from personal experience, that's how I started as well :)

I left a few comments while reviewing anyway, but without a concept ack on the idea itself from those who are more familiar with it, it likely won't get merged.

Thanks, l0rinc. I'll leave this patch alone for now.

@fanquake fanquake marked this pull request as draft August 6, 2025 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants