Skip to content

Conversation

furszy
Copy link
Member

@furszy furszy commented Jul 3, 2023

Fixing the current CI failures.

The tests objective is to exercise the chain synchronization code,
verifying that the indexes are able to catch up with the chain-tip.
Therefore, there is no need to run the process on a worker thread.

Note:
After #27607, the BaseIndex::Start method will be almost empty.
Only containing the ThreadSync call, allowing us to extract the
thread ownership from the index class and place it into an external
structure (probably, the thread pool introduced in #26966).

The tests objective is to exercise the chain synchronization
code, verifying that the indexes are able to catch up with the
chain-tip. Therefore, there is no need to run the process on
a worker thread.
@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 3, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK MarcoFalke, TheCharlatan

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28036 (test: Restore unlimited timeout in IndexWaitSynced by MarcoFalke)
  • #27607 (index: make startup more efficient by furszy)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented Jul 4, 2023

lgtm ACK f01cb3a

@fanquake fanquake requested review from ryanofsky and mzumsande July 4, 2023 09:11
Copy link
Contributor

@TheCharlatan TheCharlatan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK f01cb3a

@maflcko
Copy link
Member

maflcko commented Jul 5, 2023

(An alternative, functionally equivalent to this, might be to just make the timeout infinite, see #27988 (comment), but I think this is fine as well.)

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(An alternative, functionally equivalent to this, might be to just make the timeout infinite, see #27988 (comment), but I think this is fine as well.)

I think I'd prefer that option because then we wouldn't need to add a test-only arg to BaseIndex::Start, plus having the same thread structure as in production seems more natural and more robust with respect to possible future changes of the init sequence and unit tests.

Copy link
Member Author

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(An alternative, functionally equivalent to this, might be to just make the timeout infinite, see #27988 (comment), but I think this is fine as well.)

I think I'd prefer that option because then we wouldn't need to add a test-only arg to BaseIndex::Start, plus having the same thread structure as in production seems more natural and more robust with respect to possible future changes of the init sequence and unit tests.

I agree that the current index initialization structure isn't ideal as it tightly couples the init sequence with the background sync thread creation.

We have #27607 addressing it, which decouples the init procedure from the BaseIndex::Start method. Leaving only the background sync thread creation there and nothing else.

Having that solved, and regarding the topic of matching production thread structure:
I don't see why unit tests in general should replicate the complete production behavior or precisely match the thread model used in production. Furthermore, none of these tests really concern themselves with concurrency; they simply start the index sync thread and wait for it to finish before proceeding, effectively running the task synchronously.

I mean, I'm also ok if we simply rise the timeout but, at the same time, I don't think that the background threads creation adds any meaningful test coverage here.


On a slightly unrelated note and going a bit further over the topic, I think that the app's threading decisions should not be part of the index class. In other words, threads should not be owned and managed by each index object. It is not a responsibility that indexes should have. Instead, indexes should sync purely based on external events.

And actually, by separating the threading concerns from the index class, we can optimize a high amount of redundant work currently performed by indexes. For instance, there is no need to traverse the entire chain and read every single block from disk in each of the index threads. We could have just one thread dedicated to reading blocks from disk, while an event dispatcher class notifies all existing indexes. Or.. even better, there could be multiple threads, reading different ranges of blocks at the same time, and dispatching events asynchronously to the indexes (this idea is related to #26966, and it's something I intend to pursue after #24230).

@maflcko
Copy link
Member

maflcko commented Jul 6, 2023

I mean, I'm also ok if we simply rise the timeout but, at the same time, I don't think that the background threads creation adds any meaningful test coverage here.

Right, it doesn't add any meaningful coverage. Though, I think the main point is that we should try to avoid adding test-only code to "real" code where possible. I am fine with anything, though.

(Only modifying the test code would also avoid the conflicting with the other pull?)

@maflcko
Copy link
Member

maflcko commented Jul 6, 2023

Went ahead and created the alternative: #28036, if reviewers want it. Happy to close it, if this is merged in the meantime.

@ajtowns
Copy link
Contributor

ajtowns commented Jul 6, 2023

I think I'd prefer that option because then we wouldn't need to add a test-only arg to BaseIndex::Start, plus having the same thread structure as in production seems more natural and more robust with respect to possible future changes of the init sequence and unit tests.

I think I (weakly) agree with this -- having the code be non-threaded in tests but threading in the real use seems like an annoying difference.

@maflcko
Copy link
Member

maflcko commented Jul 7, 2023

Does your feedback from #28036 (comment) also apply here? I think you'll also have to check for it being synced?

Assert(index.GetSummary().synced);

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 7, 2023

🐙 This pull request conflicts with the target branch and needs rebase.

@furszy
Copy link
Member Author

furszy commented Jul 7, 2023

Does your feedback from #28036 (comment) also apply here? I think you'll also have to check for it being synced?

Assert(index.GetSummary().synced);

Yeah @MarcoFalke. Same applies here. I didn't push it just because of #28036. Thought that we were going to tackle it there.

@maflcko
Copy link
Member

maflcko commented Jul 7, 2023

Looks like #28036 got merged in the meantime, so I'll add the Assert(!ShutdownRequested()) as follow-up on Monday unless someone beats me to it.

@furszy
Copy link
Member Author

furszy commented Jul 7, 2023

Looks like #28036 got merged in the meantime, so I'll add the Assert(!ShutdownRequested()) as follow-up on Monday unless someone beats me to it.

Yeah, not sure why it got merged it. Already pushed #28044 fixing it.

@maflcko
Copy link
Member

maflcko commented Jul 7, 2023

Thanks, I think this one can be closed for now?

@furszy furszy closed this Jul 7, 2023
@fanquake
Copy link
Member

fanquake commented Jul 7, 2023

Already pushed #28044 fixing it.

Why not just push it here, and re-use this PR (dropping any now-redundant changes) . Otherwise, why is this still open.

Yeah, not sure why it got merged it.

It got merged to fix the persistent (potentially confusing) CI issues. What you want to fix isn't currently an active issue.

@furszy
Copy link
Member Author

furszy commented Jul 7, 2023

Why not just push it here, and re-use this PR (dropping any now-redundant changes) . Otherwise, why is this still open.

Closed the PR a minute before your comment.

It got merged to fix the persistent (potentially confusing) CI issues. What you want to fix isn't currently an active issue.

Well, the previous CI failure was due a pretty clear timeout error. Now there is a possibility of tests hanging forever with no clear reason. So, this last one is practically more confusing than the first one.

Isn't a big deal anyway but would had being nicer if you would have waited a bit for Marco's feedback there. The timeout was open for few days, it could had continued open for few more hours.

@bitcoin bitcoin locked and limited conversation to collaborators Jul 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants