-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Use thread names in logs and deadlock diagnostics #13099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/threadnames.cpp
Outdated
|
||
std::unique_ptr<ThreadNameRegistry> g_thread_name_registry(new ThreadNameRegistry()); | ||
|
||
namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there somewhere more general or useful I should put the two functions in this namespace?
src/threadnames.cpp
Outdated
std::string process_name(name); | ||
|
||
/* | ||
* Uncomment if we want to retain the `bitcoin-` system thread name prefix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove or uncomment this block before merge.
957d3aa
to
fcfba12
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
src/threadnames.cpp
Outdated
|
||
const std::string& ThreadNameRegistry::GetThreadName() | ||
{ | ||
auto thread_id = boost::this_thread::get_id(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should lock mutex.
src/threadnames.cpp
Outdated
} | ||
} | ||
|
||
void ThreadNameRegistry::RenameProcessThread(const char* name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a static function or a function in a anonymous namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argument could be const std::string& name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be able to unittest the *Process*
functions eventually, so maybe I should just make them public.
src/threadnames.cpp
Outdated
#endif | ||
} | ||
|
||
std::string ThreadNameRegistry::GetProcessThreadName() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a static function or a function in a anonymous namespace.
src/threadnames.cpp
Outdated
return found->second; | ||
} else { | ||
auto insert_ret = m_id_to_name.emplace(thread_id, GetProcessThreadName()); | ||
return insert_ret.first->second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shoud also add to m_name_to_id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks.
src/threadnames.h
Outdated
class ThreadNameRegistry | ||
{ | ||
public: | ||
const std::string& GetThreadName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return reference?
Thoughts on pulling the |
Concept ACK |
fcfba12
to
8379557
Compare
How does the use of boost::thread::id interact with eventual boost::thread removal? IIRC there was something about no guarantees being made that you can interact with boost::thread if the thread was started by std, though it usually works in practice. We do have a HAVE_THREAD_LOCAL defined, which should be set everywhere but OSX, so maybe we could just turn it on there if boost::thread::id is an issue? |
src/threadnames.h
Outdated
private: | ||
std::mutex m_map_lock; | ||
std::map<boost::thread::id, std::string> m_id_to_name; | ||
std::map<std::string, boost::thread::id> m_name_to_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit of a nit, but maybe m_name_to_id
could instead be a map<string, int>
- where it just stores the occurrences of that name/prefix, then you don't have to call CountMapPrefix
, but can just update the count each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the look and great suggestion. I think maintaining a map of counts will also allow me to remove all of the utility functions in the anonymous namespace - nice catch!
Concept ACK- just a comment on how the maps are used. |
1d1ce9a
to
bddf05a
Compare
@TheBlueMatt good points, thanks for looking. I've pushed changes which
(jamesob/threadnames.1 -> jamesob/threadnames.3) This changeset is ready for review-for-merge IMO. |
@theuni What's the status of |
bddf05a
to
c065728
Compare
I've pushed a small change removing $ diff -uw <(git diff master..threadnames.3) <(git diff master..threadnames.4)
--- /proc/self/fd/12 2018-05-01 14:56:59.561313752 -0400
+++ /proc/self/fd/13 2018-05-01 14:56:59.549313568 -0400
@@ -2215,10 +2215,10 @@
void reset();
diff --git a/src/threadnames.cpp b/src/threadnames.cpp
new file mode 100644
-index 0000000..305c3b4
+index 0000000..92e3f75
--- /dev/null
+++ b/src/threadnames.cpp
-@@ -0,0 +1,135 @@
+@@ -0,0 +1,133 @@
+// Copyright (c) 2018 The Bitcoin Core developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
@@ -2345,8 +2345,6 @@
+
+#if defined(PR_GET_NAME)
+ ::prctl(PR_GET_NAME, pthreadname_buff);
-+#elif (defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__DragonFly__))
-+ pthread_get_name_np(pthread_self(), pthreadname_buff);
+#elif defined(MAC_OSX)
+ pthread_getname_np(pthread_self(), pthreadname_buff, sizeof(threadname_buff));
+#else |
75e9902
to
66fcbdc
Compare
After spending the day figuring out that |
66fcbdc
to
911f54e
Compare
Abstracts system thread name setting and tracks thread names in memory. Allows automatic thread numbering for reused names (e.g. http.0, http.1, ...). NB: as-written, this removes the `bitcoin-` prefix from child threads. Incorporates feedback from @conscott @TheBlueMatt @sipa.
911f54e
to
b09e25b
Compare
Per @sipa's advice, I'm now using |
If platform-specific functions are being used anyway, why not just use pthread_getspecific/pthread_setspecific, which are meant for exactly this? That also makes it a trivial transition when we're finally ready to use thread_local. |
@theuni what are you talking about replacing at this point? We're now platform agnostic with |
@jamesob See the top two commits here for a quick attempt: https://github.com/theuni/bitcoin/commits/thread_names That addresses a few things that make me hesitate about this PR:
I realize that my changes are oversimplified, it's just a quick POC for discussion. |
@theuni I think your simple implementation looks great. Very much agree with your concern re: map lookup and locking overhead, and I think there's a way to combine our approaches to avoid it. Let me know if this sounds right: IMO lock acquisition during the single If that sounds good to you, I'll cherry-pick your two commits onto this branch and reorganize accordingly. Otherwise if you think the numeric suffix isn't worthwhile, I'm happy to close this out and proceed on your commits only. |
@jamesob Looking again, I think handling the suffix at this layer is unnecessary and adds a significant amount of complication. Why not just add a suffix before passing the name into TraceThread() ? |
@theuni ugh, you're right, that's dumb. Let's go off of your commits. |
From IRC:
|
While trying to debug the apparent deadlock revealed in #12873, I realized it'd be nice to have the
POTENTIAL DEADLOCK DETECTED
message display which thread acquired each lock. I also think it'd be generally useful to have log lines display the name of the originating thread.This changeset does both of those things by introducing a class which manages thread naming,
ThreadNameRegistry
. The class abstracts process-control calls responsible for thread naming and provides automatic number suffixing to threads which use the same name (e.g.httpworker.0
,httpworker.1
, ...).With this patch, thread names look like this
and the debug log looks like this
Note that child thread names have changed; I'm no longer using the
bitcoin-
prefix. Because we're limited to 16 characters for thread name (on Linux, anyway), that prefix was causing the numeric suffix some names have to be hidden. If it's desirable to keep the prefix, I can revert this change.Implementation considerations
A basic version of this change could be done pretty trivially with something like
thread_local std::string threadname;
but per @theuni (#11722 (review)), we can't rely on having
thread_local
. Also note that with a basic implementation like that, we wouldn't be able to do the automatic numbering scheme to differentiate between threads with the same basename (e.g.httpworker
,scriptch
).We could also rely solely on thread-related system calls. I don't like this much either because of how poorly defined or unavailable
getname
is on some platforms, e.g. OS X, FreeBSD; not to mention the 16 character limit.Tests
If this gets a Concept ACK or two, I'll implement some.Unittests attached.