-
Notifications
You must be signed in to change notification settings - Fork 3.6k
feat: admin token recovery #26594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: admin token recovery #26594
Conversation
da17eff
to
d8203e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some comments in line. I think it looks good, but would be good to let Jamie take a look as well.
influxdb3_server/src/lib.rs
Outdated
common_state.trace_header_parser.clone(), | ||
Arc::new(http_metrics), | ||
common_state.trace_collector().clone(), | ||
TRACE_HTTP_SERVER_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this TRACE_HTTP_SERVER_NAME
should be parameterized to differentiate traces on the recovery server from traces on the main server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've parameterized it, see the specific changes here
let http_metrics = | ||
RequestMetrics::new(Arc::clone(&common_state.metrics), MetricFamily::HttpServer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the recovery server skip metrics, or even the tracing all together?
I think it still captures logs from the recovery server, so these may not be needed for something used sparingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did think about it, but given it's another HTTP server, any metrics we capture about this server could potentially be useful too is my thinking. I've just parameterized it so that you can differentiate (as per your suggestion above). Let me know if I'm missing something, I can take it out if that'd be a better choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would have to see how the request metrics from the main HTTP server mesh with those from the recovery server, since they are both using the same instance of metric::Registry
. As long as they don't conflict with each other than I don't think this is an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll run a quick check and see how it looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hiltontj - is there something specific to start the process to allow any http request to be sent to trace server? I'm using --traces-exporter-jaeger-agent-port 6831 --traces-exporter-jaeger-service-name influxdb3-core --traces-exporter jaeger
. I only see traffic going to jaeger when I run queries but all other requests don't seem to generate any traces. I checked the tcpdump
too, there's definitely nothing sent by influxdb3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be that we only use the jaeger hooks from the query path. I was thinking more of the prometheus metrics.
So, you could test by starting up the server, doing a regenerate with the recovery endpoint, and then hitting the main server's /metric
API to see if the metrics for the request to the recovery endpoint show up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm looks like /metrics
collects the details at the method
, path
level. So, it assumes a single http server. One thing is the same path is available in both servers, the recovery server allows it to be called without a password and the main server requires password to access the path. When looking at the /metrics
below it isn't immediately obvious whether it was the recovery server or the main server that responded. Having said that, I'm not sure if it entirely muddies the line either. @hiltontj - I can take it out if you think that'd be better.
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.005"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.01"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.025"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.05"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.1"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.25"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="2.5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="inf"} 1
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 0.003047878
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="aborted"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="100000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1000000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10000000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="inf"} 1
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 319
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
influxdb3_catalog_operations_total{type="regenerate_admin_token"} 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is okay. As long as it doesn't break the /metrics
API, I think its fine.
6fb81a9
to
393c981
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off, thank you for the PR and making the token regeneration experience better! :)
Security considerations
- User starts the server without passing in any ports to bind, in that case
0.0.0.0:8181
is used as default main server and127.0.0.1:8182
for the token recovery server.- In case of regeneration of admin token with admin token, the user can still use
0.0.0.0:8181
.- If the user has lost the admin token, then the user can jump into the container and then run recovery or regeneration commands as explained in manual tests section below.
- When ran inside docker, we default to
127.0.0.1
which means user cannot get to port8182
(it's notexpose
d in Dockerfile).- For users running in non-containerized environments, it's possible to run on any socket address (host + port) but care must be taken to not expose it accidentally which in turn can lead to unauthorized actors to recycle the admin token.
@jdstrand - please poke any holes in this security considerations I've taken into account. Also, I've used
--admin-token-recovery-http-bind
as it is similar to--http-bind
although this is wordy. I'm happy to switch it to--regenerate-endpoint-listen
if that's better option as mentioned in the issue comment or anything else that's more succinct.
I have blackbox tested this and for the most part it looks ok, but the default behavior is a problem (which I'll discuss why in a moment). I'm a bit confused on the intended behavior for listening on 127.0.0.1:8182
. The bulleted list of considerations strongly suggests that the intention is to allow listening on 127.0.0.1:8182
, but other statements suggest the default might only be for containers and that for non-containers that it is maybe optional, but to be careful.
That aside:
-
influxdb3 serve
unconditionally listens on127.0.0.1:8182
:$ influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0 ... $ ss -atunp|grep -E 'LISTEN .*influxdb3' tcp LISTEN 0 1024 127.0.0.1:8182 0.0.0.0:* users:(("influxdb3",pid=133872,fd=19)) tcp LISTEN 0 1024 0.0.0.0:8181 0.0.0.0:* users:(("influxdb3",pid=133872,fd=18))
-
While I was able to use
INFLUXDB3_ADMIN_TOKEN_RECOVERY_HTTP_BIND_ADDR
and--admin-token-recovery-http-bind
to change away from the default, I wasn't able to turn it off altogether. Eg:$ influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0 --admin-token-recovery-http-bind= error: invalid value '' for '--admin-token-recovery-http-bind <ADMIN_TOKEN_RECOVERY_BIND_ADDRESS>': Cannot parse socket address '': invalid socket address
-
serve --help
andserve --help-all
don't seem to mention--admin-token-recovery-http-bind
-
calling
create token --admin --regenerate --token "bad"
works, which is a weird look (presumably since it defaults to 127.0.0.1:8182 which doesn't need a token) -
noticed that
influxdb3 create token --admin --host ...
works butinfluxdb3 create token --admin -H ...
doesn't -
bikeshedding:
--admin-recovery-listen
is a bit less wordy yet still descriptive
In terms of the default behavior, I suspect this was optimized for the container case where it is presumed that network controls are in place to prevent access into the container. However, we must consider non-container environments, composed container environments (eg, docker compose with influxdb3_ui
, grafana
, etc alongside influxdb3
), processing engine plugins and managed environments.
Listening on 127.0.0.1:8182
by default is a security hole since a local user or process running on the system (including a processing engine plugin) can hit this port to change the token which breaks the operator token for others and grants the caller with full database privileges. This would qualify as a high to critical CVE if we released with this. This is not theoretical or hyperbolic: in the last couple of months alone I've triaged several issues in software we and our users use (python urllib3
, kafka client
and grafana
) that could be leveraged to recycle the token by attackers (not to mention if someone poorly coded a request
trigger). SSRFs and other security issues that allow connections to localhost are common and we need to cautiously design for this.
This behavior must be opt in (eg, server restart with non-default option) and the documentation of that behavior should have huge warnings. I don't know Amazon's architecture, but depending on how they handle the operator token, they might even ask that the functionality be entirely compiled out of the binary (we might want to do the same for InfluxData hosted Enterprise 3).
Considering all of the above, I suggest something along the lines of:
- disabling listening on
127.0.0.1:8182
unless--admin-recovery-listen
(or similar) is specified (ie, no listening on the recovery endpoint by default) - if
--admin-recovery-listen
is specified but without an argument, then listen on127.0.0.1:8182
- if
--admin-recovery-listen=...
is specified, use current behavior (ie, do what the user told you) - when
--admin-recovery-listen
is specified, after the user regenerates the token, stop listening on this endpoint
I believe this has the correct balance of usability, caution and security. If someone has forgotten their operator token, they need to take a deliberate step to restart the server with an option to enable the special recovery endpoint. Then, once the recover endpoint is used to regenerate a token, the server tears down the recovery endpoint (but keeps listening on the normal endpoint).
@jdstrand - thanks for the feedback 🙏 if I understood right, start this secondary server only when user opts in and then shutdown as soon as regeneration is done.
EDIT: I've got the Regarding containerised environments I was wondering what should be the default behaviour in terms of exposing ports in Dockerfile. I haven't exposed 8182, should it expose it by default given that influxdb3 will start and stop the server based on config and the first use. I need to also think through how to make it work only once in enterprise with multi node setup if all nodes are started with the flag as I'm guessing it'd be a global only once and not per node. |
Nice!
I'm inclined to think 'no'. Since the user needs to restart the server to add |
29dada9
to
e3a71f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I have a few more comments in line, none of which are blocking, but the one about the test and the one about RAII Guard are pretty straight forward, so up to you if you want to address them on this or a follow-on PR.
@@ -82,6 +82,9 @@ pub const DEFAULT_DATA_DIRECTORY_NAME: &str = ".influxdb3"; | |||
/// The default bind address for the HTTP API. | |||
pub const DEFAULT_HTTP_BIND_ADDR: &str = "0.0.0.0:8181"; | |||
|
|||
/// The default bind address for admin token recovery HTTP API. | |||
pub const DEFAULT_ADMIN_TOKEN_RECOVERY_BIND_ADDR: &str = "127.0.0.1:8182"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that this matters, but the default here uses 127.0.0.1
vs. 0.0.0.0
for the main bind address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intentional, we want it to default to only loop back address. It makes choosing to listen on any other address as an opt-in as this allows the clients with access to this interface regenerate an admin token without password.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that makes sense, thanks for clarifying.
} | ||
|
||
#[test_log::test(tokio::test)] | ||
async fn test_recovery_endpoint_auto_shutdown_after_regeneration() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test does not verify that the recovery endpoint is shutdown after the token is regenerated, but the name of the test implies that it does verify that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I thought that test had the assertion. I'll add the assertion again.
influxdb3_server/src/http.rs
Outdated
// Small delay to ensure HTTP response is fully sent | ||
tokio::time::sleep(Duration::from_millis(100)).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could do this reliably without the use of a sleep. Something like a RAII guard.
My suggestion would be something like this:
struct Guard {
shutdown_token: ShutdownToken,
}
impl Guard {
fn new(shutdown_token: ShutdownToken) -> Self {
Self { shutdown_token }
}
}
impl Drop for Guard {
fn drop(&mut self) {
self.shutdown_token.cancel();
}
}
Then, in the above method, you could do something like
let guard = Guard::new(recovery_api.shutdown_token.clone());
// Extend the response with the guard so cancel is called when the response is dropped:
let mut builder = ResponseBuilder::new();
let mut extensions = builder.extensions_mut().unwrap();
extensions.insert(guard);
// finish building response and return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using builder extension is a really neat idea ❤️ - I'll change it.
let http_metrics = | ||
RequestMetrics::new(Arc::clone(&common_state.metrics), MetricFamily::HttpServer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be that we only use the jaeger hooks from the query path. I was thinking more of the prometheus metrics.
So, you could test by starting up the server, doing a regenerate with the recovery endpoint, and then hitting the main server's /metric
API to see if the metrics for the request to the recovery endpoint show up.
f89d91b
to
a4f3da2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! I love the tests and comments in the code.
I did some manual blackbox testing (not exhaustive as test coverage is very good):
- GOOD:
influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0
confirmed to only listen on the main server endpoint - GOOD: still requires auth when not using
--without-auth
- GOOD: does not require auth when using
--without-auth
- GOOD:
create token --admin
generates the initial operator token, and errors when try again - GOOD:
show databases
works with operator token and fails with no token/bad token - OK:
create token --admin --regenerate
correctly fails, but with a connection refused error when--admin-token-recovery-http-bind
is not specified. This is fine for security but is a usability issue: we should send back 401 - BUG:
create token --admin --token "$TOKEN" --regenerate
returns with connection refused when NOT using--admin-token-recovery-http-bind
. It seems that if--token
is specified, the client should use the main server API, not this recovery API - BUG:
create token --admin --token "bad" --regenerate
succeeds when using--admin-token-recovery-http-bind
. It seems that if--token
is specified, the client should use the main server API, not this recovery API - BUG:
create token --admin --token "$TOKEN" -H http://127.0.0.1:8181 --regenerate
fails because-H
is not accepted (but--host
is) - GOOD:
show databases --token "$TOKEN"
works with new token after regeneration via main server API and fails with old - GOOD: starting with
--admin-token-recovery-http-bind
starts the main server and the recovery endpoint on127.0.0.1:8182
- GOOD: starting with
--admin-token-recovery-http-bind=127.0.0.1:8183
starts the main server and the recovery endpoint on127.0.0.1:8183
- GOOD:
create token --admin --regenerate
correctly regenerates the token when server started with--admin-token-recovery-http-bind
- GOOD: the recovery endpoint shuts down after using
create token --admin --regenerate
- GOOD:
show databases --token "$TOKEN"
works with new token after regeneration via recovery API and fails with old
I'm approving as is, but it would be nice to fix the 3 non-security bugs (I think 2 can be fixed in the same way, unconditionally use the main server API when --token
is specified) and one usability issue. Those can be in this PR or a followup.
// Only create recovery listener if explicitly enabled | ||
let admin_token_recovery_listener = if let Some(addr) = config.admin_token_recovery_bind_address | ||
{ | ||
info!(%addr, "Admin token recovery endpoint enabled - WARNING: This allows unauthenticated admin token regeneration!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
@jdstrand - it looks up If this behavior is confusing, I can swap it such that when you pass in |
I do think the behavior is confusing. You could have the client decide based on the presence of |
@jdstrand - I'll swap it and ping you for another review. Thanks again for taking time to review it 🙏 |
@jdstrand - I've made the changes to address your 3 points,
As discussed above,
Help text now includes Manual Test ResultsTest Environment
1. Help Text Verification ✅1.1 Token Creation Help$ ./target/debug/influxdb3 create token --admin --help Result:
✅ VERIFIED: Help text correctly shows:
1.2 Server Help$ ./target/debug/influxdb3 serve --help-all | grep -A5 -B5 "admin-token-recovery" Result:
✅ VERIFIED: Recovery endpoint parameter is shown with proper warning 2. Server Without Recovery Endpoint ✅2.1 Start Server$ cargo run -- serve --node-id node-1 --object-store file --data-dir /home/praveen/projects/influx/test-data/core --disable-telemetry-upload --http-bind 127.0.0.1:8181 2.2 Test Regenerate with Default Settings$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate Result:
✅ VERIFIED: Correctly tries main server (8181) and gets 401 Unauthorized, NOT connection refused to 8182 3. Server With Default Recovery Endpoint ✅3.1 Start Server with Recovery Endpoint$ cargo run -- serve --node-id node-2 --object-store file --data-dir /home/praveen/projects/influx/test-data/core2 --disable-telemetry-upload --admin-token-recovery-http-bind --http-bind 127.0.0.1:8281 Server logs show:
3.2 Create Initial Admin Token$ ./target/debug/influxdb3 create token --admin --host http://127.0.0.1:8281 Result: Token created successfully 3.3 Test Regenerate Without Explicit Host$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8281 Result:
✅ VERIFIED: Tries main server, fails with 401 (not recovery endpoint) 3.4 Test Regenerate with Explicit Recovery Endpoint$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8182 Result: Successfully regenerated token Server logs show:
3.5 Verify Recovery Endpoint Shutdown$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8182 Result:
✅ VERIFIED: Recovery endpoint automatically shut down after successful regeneration 3.6 Test Regenerate on Main Server with Token$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8281 --token <NEW_TOKEN> Result: Successfully regenerated token on main server 4. Server With Custom Recovery Endpoint ✅4.1 Start Server with Custom Address$ cargo run -- serve --node-id node-3 --object-store file --data-dir /home/praveen/projects/influx/test-data/core3 --disable-telemetry-upload --admin-token-recovery-http-bind 0.0.0.0:9999 --http-bind 127.0.0.1:8381 Server logs show:
4.2 Create Admin Token$ ./target/debug/influxdb3 create token --admin --host http://127.0.0.1:8381 Result: Token created successfully 4.3 Test Default Regenerate (Should Fail)$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8381 Result:
✅ VERIFIED: Does NOT try default 8182, correctly tries main server 4.4 Test with Custom Recovery Endpoint$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:9999 Result: Successfully regenerated token 4.5 Verify Shutdown$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:9999 Result: Connection refused 5. Recovery Endpoint Isolation ✅5.1 Test Database Creation on Recovery Endpoint$ ./target/debug/influxdb3 create database --host http://127.0.0.1:8182 testdb Result:
5.2 Test Table Creation on Recovery Endpoint$ ./target/debug/influxdb3 create table --host http://127.0.0.1:8182 --database foo bar Result:
✅ VERIFIED: Recovery endpoint correctly returns 404 for all non-recovery operations 6. Edge Cases ✅6.1 Server Without Admin TokenStart server with auth disabled: $ cargo run -- serve --node-id node-5 --object-store file --data-dir /home/praveen/projects/influx/test-data/core5 --disable-telemetry-upload --admin-token-recovery-http-bind 127.0.0.1:8183 --http-bind 127.0.0.1:8581 --without-auth Try to regenerate: $ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8183 Result:
✅ VERIFIED: Correctly reports that there's no admin token to regenerate SummaryAll tests pass successfully! The key behavioral changes are verified:
The changes make the behavior more explicit and predictable, requiring users to consciously choose to use the less-secure recovery endpoint rather than having it as an automatic default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates! I didn't do a full round of manual testing (I see that you did), but can confirm that it is behaving as expected:
- not listening on recovery endpoint when
--admin-token-recovery-http-bind
is not specified, listening on127.0.0.1:8182
when--admin-token-recovery-http-bind
is specified and listening elsewhere when--admin-token-recovery-http-bind
has an argument create token --admin --regenerate
(without -H) requires a token regardless of if--admin-token-recovery-http-bind
is specified (ie, it is hitting the main API)create token --admin --regenerate -H <recovery endpoint>
does not require a token and shuts down the endpoint after hitting it- token is properly regenerated in all cases
The usability issue, -H
bug and connection refused bug with --regenerate
when --admin-token-recovery-http-bind
is not specified are all addressed.
We still have the issue with this command: create token --admin --token "bad" --regenerate --host <recovery endpoint>
succeeding (because it is silently ignoring the --token
argument). Not a blocker.
#[clap( | ||
name = "regenerate", | ||
long = "regenerate", | ||
help = "Regenerate the operator token (uses port 8182 by default instead of 8181)" | ||
help = "Regenerate the operator token. By default connects to the main server (http://127.0.0.1:8181). | ||
To use the admin token recovery endpoint, specify --host with the recovery endpoint address" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
- new server to only serve admin token regeneration without an admin token has been added - minor refactors to allow reuse of some of the utilities like trace layer for metrics moved to their own functions to allow them to be instantiated for both servers - tests added to check if both the new server works right for regenerating token and also ensure none of the other functionalities are available on the admin token recovery server closes: #26330
- recovery server now only starts when `--admin-token-recovery-http-bind` is passed in - as soon as regeneration is done, the recovery server shuts itself down - the select! macro logic has been changed such that shutting down recovery server does not shutdown the main server
- when `--regenerate` is passed in, `--host` still defaults to the main server. To get to the recovery server, `--host` with the recovery server address should be passed in
2f78473
to
40102ae
Compare
Good point, I can make |
Perhaps it's sufficient to document that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary
closes: #26330
Security considerations
The admin token recovery (using
--regenerate
) currently requires a valid admin token to be passed in. If that's lost for some reason there is no way to get back into the server other than switching back to running--without-auth
. This PR addresses it by assuming the following workflow,--admin-token-recovery-http-bind
or--admin-token-recovery-http-bind $IP:$PORT
. In either of those cases the recovery server will be started either on the explicit$IP:$PORT
or on default127.0.0.1:8182
when explicit ip/port is not specified.0.0.0.0:8181
(the main server).--admin-token-recovery-http-bind
argument. Note: this operation is only allowed once after the server restart. After successful regeneration the recovery server should shutdown immediately, but the main server will still be online.Manual Tests
Help sections show that
--regenerate
defaults to port127.0.0.1:8182
, so that it is not necessary to mention--regenerate --host https://127.0.0.1:8182
serve shows new parameter only in
--help-all
, it's listed underCommon Options
sectionStarting server without
--admin-token-recovery-http-bind
should have no recovery server running--regenerate
, it should failCreate admin token and see if it can be regenerated using default
127.0.0.1:8182
.start the server
try regenerating admin token
Create admin token and see if it can be regenerated using non-default
$host:$port
.create admin token
try regenerating admin token (note, when I started the server I passed in
--admin-token-recovery-http-bind 192.168.1.94:8181
so it expects the host to be passed in), default127.0.0.1:8182
it uses fails (todo: make these errors display easy to read messages than the full error)run
--regenerate
, this time passing in--host
run
--regenerate
again with--host http://192.168.1.94:8181
(it should fail as the recovery server is shutdown after successful regeneration)Other endpoints are not accessible through this server, create db, table etc. returns a
404