Conversation

mikulas-patocka
Contributor

scoped_timer::finalize is called from fork. However, it may race with other threads creating or freeing timer threads.

This patch removes the loop in scoped_timer::finalize (because it is not needed and it may spin) and also removes two unlocked assignments.

An idle timer thread is added to the "available_workers" vector by the scoped_timer::~scoped_timer destructor.

If the "finalize" method is called as part of total memory cleanup, all the scoped_timer destructors have already run and all the worker threads are already on the "available_workers" vector. So we don't need to loop; the first iteration cleans up all the threads.

If the "finalize" method is called from a single-threaded program's fork(), then all the scoped_timer destructors have already run and the case is analogous to the previous one.

If the "finalize" method is called from a multi-threaded program's fork(), the loop breaks down: the "num_workers" variable counts the total number of workers (both sleeping and busy), and we loop until we have terminated "num_workers" threads. That means that if the number of sleeping workers is less than "num_workers", the function just spins.

Then, there are unlocked assignments, "num_workers = 0" and "available_workers.clear()", that can race with other threads doing z3 work and corrupt memory. available_workers.clear() is not needed, because the vector was already emptied by std::swap(available_workers, cleanup_workers) (and that swap was correctly locked).

Signed-off-by: Mikulas Patocka <mikulas@twibright.com>
@nunoplopes
Collaborator

You didn't reply to my comment in the other PR.
If you have multiple threads running and you fork, finalize gets called and with this patch it won't delete those threads. They are going to get leaked or deadlocked.

So this patch doesn't seem safe.

@mikulas-patocka
Contributor Author

> You didn't reply to my comment in the other PR. If you have multiple threads running and you fork, finalize gets called and with this patch it won't delete those threads. They are going to get leaked or deadlocked.
>
> So this patch doesn't seem safe.

If you have multiple threads running and you fork, it's impossible to do z3 calculations in the child process anyway. A compute thread in the parent may be holding a mutex, and since fork() does not replicate that thread in the child, no one ever releases the mutex and the child deadlocks. This wasn't supported even without this patch; the patch doesn't make it worse.

In my use-case, I'm not trying to run z3 from the child process. I run multithreaded z3 in the parent process, and when I fork, I just set up file handles and call exec in the child process. I want to make sure that fork()+exec() won't damage z3 calculations running in the parent process; that's what I'm trying to fix with this patch.

@nunoplopes
Collaborator

But the pthread_atfork waits for the threads to finish before forking. Isn't that sufficient to avoid deadlocks?

@mikulas-patocka
Contributor Author

> But the pthread_atfork waits for the threads to finish before forking. Isn't that sufficient to avoid deadlocks?

It waits for timer threads to finish, not for compute threads. And the mutexes may be held by compute threads too.

@nunoplopes
Collaborator

This passed my performance & stress tests. Merging next.

@nunoplopes nunoplopes merged commit 6ecc7a2 into Z3Prover:master Apr 11, 2025
7 checks passed
arbipher pushed a commit to arbipher/z3 that referenced this pull request Apr 17, 2025