@praveen-influx praveen-influx commented Jul 8, 2025

Summary

  • Added a new server that serves only admin token regeneration, without requiring an admin token.
  • Minor refactors: utilities such as the trace layer for metrics have been moved into their own functions so that they can be instantiated for both servers.
  • Added tests that check that the new server regenerates the token correctly and that no other functionality is available on the admin token recovery server.

closes: #26330

Security considerations

Admin token recovery (using --regenerate) currently requires a valid admin token to be passed in. If that token is lost, there is no way to get back into the server other than switching back to running with --without-auth. This PR addresses that by assuming the following workflow:

  • The user starts the server passing either --admin-token-recovery-http-bind or --admin-token-recovery-http-bind $IP:$PORT. The recovery server then starts on the explicit $IP:$PORT, or on the default 127.0.0.1:8182 when no IP/port is specified.
  • To regenerate the admin token using an existing admin token, the user can still use the main server (0.0.0.0:8181).
  • When the user does not have the admin token, regeneration can be triggered on either the default IP/port or the IP/port passed to --admin-token-recovery-http-bind. Note: this operation is allowed only once after a server restart. After a successful regeneration the recovery server should shut down immediately, while the main server stays online.
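
The workflow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `resolve_recovery_bind`, `RecoveryGate`, `try_claim` and `DEFAULT_RECOVERY_BIND` are hypothetical names, and only the default address 127.0.0.1:8182 and the once-per-restart rule come from the description.

```rust
use std::net::{AddrParseError, SocketAddr};
use std::sync::atomic::{AtomicBool, Ordering};

// Default taken from the PR description; the constant name is hypothetical.
const DEFAULT_RECOVERY_BIND: &str = "127.0.0.1:8182";

/// Hypothetical helper modelling the three states of the flag:
/// `None`             -> flag absent, no recovery server is started
/// `Some(None)`       -> flag given without a value, use the default
/// `Some(Some(addr))` -> flag given with an explicit `$IP:$PORT`
fn resolve_recovery_bind(
    flag: Option<Option<&str>>,
) -> Result<Option<SocketAddr>, AddrParseError> {
    match flag {
        None => Ok(None),
        Some(value) => value
            .unwrap_or(DEFAULT_RECOVERY_BIND)
            .parse::<SocketAddr>()
            .map(Some),
    }
}

/// Hypothetical guard that lets one regeneration through per process lifetime.
struct RecoveryGate {
    used: AtomicBool,
}

impl RecoveryGate {
    fn new() -> Self {
        Self { used: AtomicBool::new(false) }
    }

    /// Returns `true` for the first caller only; every later call gets `false`.
    fn try_claim(&self) -> bool {
        self.used
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}

fn main() {
    let bind = resolve_recovery_bind(Some(None)).expect("default address is valid");
    println!("recovery server would bind to {:?}", bind);

    let gate = RecoveryGate::new();
    println!("first regeneration allowed: {}", gate.try_claim());
    println!("second regeneration allowed: {}", gate.try_claim());
}
```

In this sketch a failed regeneration would still consume the single attempt. The description ties the shutdown to a successful regeneration, so a real implementation would presumably treat failures differently.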

Manual Tests

  • The help output shows that --regenerate defaults to 127.0.0.1:8182, so it is not necessary to pass --regenerate --host http://127.0.0.1:8182

    ❯ ./target/debug/influxdb3 create token --admin --help
    Create or regenerate an admin token
    
    Usage: influxdb3 create token --admin [OPTIONS]
    
    Options:
          --regenerate       Regenerate the operator token (uses port 8182 by default instead of 8181)
          --name <NAME>      Name of the token
          --expiry <EXPIRY>  Expires in `duration`, e.g 10d for 10 days 1y for 1 year
          --host <host>      The host URL of the running InfluxDB 3 Core server [env: INFLUXDB3_HOST_URL=]
          --token <token>    The token for authentication with the InfluxDB 3 Core server to create permissions. This will be the admin token to create tokens with permissions [env: INFLUXDB3_AUTH_TOKEN=]
          --tls-ca <tls-ca>  An optional arg to use a custom ca for useful for testing with self signed certs
          --format <FORMAT>  Output format for token, supports just json or text [possible values: json, text]
      -h, --help             Print help information
          --help-all         Print more detailed help information
    
    
  • serve shows the new parameter only in --help-all; it is listed under the Common Options section

      ❯ cargo run -- serve --help-all
        Common Options:
          --http-bind <ADDR>               Address for HTTP API requests [default: 0.0.0.0:8181]
                                          [env: INFLUXDB3_HTTP_BIND_ADDR=]
          --log-filter <FILTER>            Logs: filter directive [env: LOG_FILTER=]
          --tls-key <KEY_FILE>             The path to the key file for TLS to be enabled
                                          [env: INFLUXDB3_TLS_KEY=]
          --tls-cert <CERT_FILE>           The path to the cert file for TLS to be enabled
                                          [env: INFLUXDB3_TLS_CERT=]
          --tls-minimum-version <VERSION>  The minimum version for TLS. Valid values are
                                          tls-1.2 and tls-1.3, default is tls-1.2
                                          [env: INFLUXDB3_TLS_MINIMUM_VERSION=]
          --without-auth                   Run InfluxDB 3 server without authorization
          --disable-authz <RESOURCES>      Optionally disable authz by passing in a comma separated
                                          list of resources. Valid values are health, ping, and metrics.
                                          To disable auth for multiple resources pass in a list, eg.
                                          `--disable-authz health,ping`
          --admin-token-recovery-http-bind <ADDR>
                                          Address for HTTP API for admin token recovery requests [default: 127.0.0.1:8182]
                                          WARNING: This endpoint allows unauthenticated admin token regeneration - use with caution!
                                          [env: INFLUXDB3_ADMIN_TOKEN_RECOVERY_HTTP_BIND_ADDR]
    
    
  • Starting server without --admin-token-recovery-http-bind should have no recovery server running

      cargo run -- serve --node-id node-1 --object-store file --data-dir /home/praveen/projects/influx/test-data/core  --disable-telemetry-upload --http-bind 127.0.0.1:8181
    
    • try --regenerate, it should fail
        ❯ ./target/debug/influxdb3 create token --admin --regenerate
        Are you sure you want to regenerate admin token? Enter 'yes' to confirm
        yes
        Failed to create token, error: RequestSend { method: POST, url: "http://127.0.0.1:8182/api/v3/configure/token/admin/regenerate", source: reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8182), path: "/api/v3/configure/token/admin/regenerate", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) } }
      
  • Create admin token and see if it can be regenerated using default 127.0.0.1:8182.

    • start the server

        ❯ cargo run -- serve --node-id node-1 --object-store file --data-dir /home/praveen/projects/influx/test-data/core  --disable-telemetry-upload --admin-token-recovery-http-bind --http-bind 127.0.0.1:8181
      
    • try regenerating admin token

      • create admin token
          ❯ ./target/debug/influxdb3 create token --admin
        
          New token created successfully!
        
          Token: apiv3_VDD60uw8PbaEmE7a2flTnghRNxq4wCEJJs7lhHBza6FUdcr07gJmITfmXzKjVeilRx8vsPxdmgY2HzpUYWqW4g
          HTTP Requests Header: Authorization: Bearer apiv3_VDD60uw8PbaEmE7a2flTnghRNxq4wCEJJs7lhHBza6FUdcr07gJmITfmXzKjVeilRx8vsPxdmgY2HzpUYWqW4g
        
          IMPORTANT: Store this token securely, as it will not be shown again.
        
      • regenerate using default host and port
          ❯ ./target/debug/influxdb3 create token --admin --regenerate
          Are you sure you want to regenerate admin token? Enter 'yes' to confirm
          yes
        
          New token created successfully!
        
          Token: apiv3_TfRmlFixgJiszXySJ5i-vqOcdVgK6jSPSnWfV--7si7cutWTm0h_MSFN4JUHh0shDpo6eCCCyaQrh9TXk3ZA1g
          HTTP Requests Header: Authorization: Bearer apiv3_TfRmlFixgJiszXySJ5i-vqOcdVgK6jSPSnWfV--7si7cutWTm0h_MSFN4JUHh0shDpo6eCCCyaQrh9TXk3ZA1g
        
          IMPORTANT: Store this token securely, as it will not be shown again.
        
      • regenerating again should fail because the recovery server has been shut down
          ❯ ./target/debug/influxdb3 create token --admin --regenerate
          Are you sure you want to regenerate admin token? Enter 'yes' to confirm
          yes
          Failed to create token, error: RequestSend { method: POST, url: "http://127.0.0.1:8182/api/v3/configure/token/admin/regenerate", source: reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8182), path: "/api/v3/configure/token/admin/regenerate", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) } }
        
      • regenerate using the main http server without admin token, it should fail
          ❯ ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8181
          Are you sure you want to regenerate admin token? Enter 'yes' to confirm
          yes
          Failed to create token, error: ApiError { code: 401, message: "" }
        
        
      • regenerate admin token with token, it should work on the main http server
          ❯ ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8181 --token apiv3_TfRmlFixgJiszXySJ5i-vqOcdVgK6jSPSnWfV--7si7cutWTm0h_MSFN4JUHh0shDpo6eCCCyaQrh9TXk3ZA1g
          Are you sure you want to regenerate admin token? Enter 'yes' to confirm
          yes
        
          New token created successfully!
        
          Token: apiv3_LtXc15zm_5Le3bD01hBeo7vC0Zb_-rmCsvN2X6nYllwaAqeBQglDK1TqLwao9C-DLyknVH-BV-EV2RBsMRs13A
          HTTP Requests Header: Authorization: Bearer apiv3_LtXc15zm_5Le3bD01hBeo7vC0Zb_-rmCsvN2X6nYllwaAqeBQglDK1TqLwao9C-DLyknVH-BV-EV2RBsMRs13A
        
          IMPORTANT: Store this token securely, as it will not be shown again.
        
  • Create admin token and see if it can be regenerated using non-default $host:$port.

    • create admin token

        ❯ ./target/debug/influxdb3 create token --admin
      
        New token created successfully!
      
        Token: apiv3_65-K7ATO6sPtBu5L0C5RGitOxKmpZMinCgdUDfm5fPyRaxF5q_ldOgAp2-NqMGAMDiagbaRS5Mm4uw3mJ38OQg
        HTTP Requests Header: Authorization: Bearer apiv3_65-K7ATO6sPtBu5L0C5RGitOxKmpZMinCgdUDfm5fPyRaxF5q_ldOgAp2-NqMGAMDiagbaRS5Mm4uw3mJ38OQg
      
        IMPORTANT: Store this token securely, as it will not be shown again.
      
    • try regenerating the admin token without --host. Note: I started the server with --admin-token-recovery-http-bind 192.168.1.94:8181, so it expects the host to be passed in; the default 127.0.0.1:8182 that the command uses fails (todo: make these errors display easy-to-read messages rather than the full error)

        ❯ ./target/debug/influxdb3 create token --admin --regenerate
        Are you sure you want to regenerate admin token? Enter 'yes' to confirm
        yes
        Failed to create token, error: RequestSend { method: POST, url: "http://127.0.0.1:8182/api/v3/configure/token/admin/regenerate", source: reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8182), path: "/api/v3/configure/token/admin/regenerate", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) } }
      
    • run --regenerate, this time passing in --host

        ❯ ./target/debug/influxdb3 create token --admin --regenerate --host http://192.168.1.94:8181
      
        Are you sure you want to regenerate admin token? Enter 'yes' to confirm
        yes
      
        New token created successfully!
      
        Token: apiv3_Wyr7d-6rSXjT-peH7xkmY30HqjIjCShB9goXZ0eGdo1t_KgabehsnXE73V9LDY2peNVdwfDCeTHgPwXqE2ABVQ
        HTTP Requests Header: Authorization: Bearer apiv3_Wyr7d-6rSXjT-peH7xkmY30HqjIjCShB9goXZ0eGdo1t_KgabehsnXE73V9LDY2peNVdwfDCeTHgPwXqE2ABVQ
      
        IMPORTANT: Store this token securely, as it will not be shown again.
      
    • run --regenerate again with --host http://192.168.1.94:8181 (it should fail because the recovery server is shut down after a successful regeneration)

        ❯ ./target/debug/influxdb3 create token --admin --regenerate --host http://192.168.1.94:8181
      
        Are you sure you want to regenerate admin token? Enter 'yes' to confirm
        yes
        Failed to create token, error: RequestSend { method: POST, url: "http://192.168.1.94:8181/api/v3/configure/token/admin/regenerate", source: reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(192.168.1.94)), port: Some(8181), path: "/api/v3/configure/token/admin/regenerate", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) } }
      
  • Other endpoints are not accessible through this server; create database, create table, etc. return a 404

    ❯ ./target/debug/influxdb3 create database --host http://127.0.0.1:8182 foo
    Create command failed: server responded with error [404 Not Found]: not found
    
    
    ❯ ./target/debug/influxdb3 create table --host http://127.0.0.1:8182 --database foo bar
    Create command failed: server responded with error [404 Not Found]: not found
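
The behaviour checked above (one route served, everything else rejected) can be pictured with a small routing sketch. It rests on assumptions: `recovery_status` is a hypothetical function, the two `/example/...` paths are illustrative stand-ins rather than real API paths, and only the regenerate path is taken from the request URLs shown in the manual tests.

```rust
// Path taken from the request URLs shown in the manual tests.
const REGENERATE_PATH: &str = "/api/v3/configure/token/admin/regenerate";

/// Hypothetical routing decision for the recovery server, reduced to a status code.
fn recovery_status(method: &str, path: &str) -> u16 {
    match (method, path) {
        ("POST", REGENERATE_PATH) => 200,
        // Assumption: every other method/path pair is answered with 404.
        // A real server might answer 405 for a wrong method on a known path.
        _ => 404,
    }
}

fn main() {
    // The second and third paths are illustrative stand-ins, not real API paths.
    for (method, path) in [
        ("POST", REGENERATE_PATH),
        ("POST", "/example/create-database"),
        ("POST", "/example/create-table"),
    ] {
        println!("{} {} -> {}", method, path, recovery_status(method, path));
    }
}
```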
    

@praveen-influx praveen-influx force-pushed the pk/admin-token-recovery branch 3 times, most recently from da17eff to d8203e4 Compare July 8, 2025 17:00
@praveen-influx praveen-influx marked this pull request as ready for review July 10, 2025 13:14
@praveen-influx praveen-influx requested review from a team and jdstrand July 10, 2025 13:16
@hiltontj hiltontj left a comment

I have some inline comments. I think it looks good, but it would be good to let Jamie take a look as well.

common_state.trace_header_parser.clone(),
Arc::new(http_metrics),
common_state.trace_collector().clone(),
TRACE_HTTP_SERVER_NAME,

Contributor

I think this TRACE_HTTP_SERVER_NAME should be parameterized to differentiate traces on the recovery server from traces on the main server.

Contributor Author

I've parameterized it, see the specific changes here

Comment on lines +178 to +183
let http_metrics =
RequestMetrics::new(Arc::clone(&common_state.metrics), MetricFamily::HttpServer);

Contributor

Could the recovery server skip metrics, or even the tracing altogether?

I think it still captures logs from the recovery server, so these may not be needed for something used sparingly.

Contributor Author

I did think about it, but since it's another HTTP server, my thinking is that any metrics we capture about it could be useful too. I've just parameterized it so that you can differentiate (as per your suggestion above). Let me know if I'm missing something; I can take it out if that'd be a better choice.

Contributor

We would have to see how the request metrics from the main HTTP server mesh with those from the recovery server, since they are both using the same instance of metric::Registry. As long as they don't conflict with each other, then I don't think this is an issue.
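
A toy model may help picture how request metrics from two servers could mesh in one shared registry. This is a sketch under loud assumptions: `ToyRegistry` is a hypothetical stand-in and does not use the real `metric::Registry` API, and the `server` label is an invented example of a distinguishing label, not something this PR adds.

```rust
use std::collections::{BTreeMap, HashMap};

const PATH: &str = "/api/v3/configure/token/admin/regenerate";

// A label set, ordered so that equal sets hash and compare equal.
type Labels = BTreeMap<&'static str, &'static str>;

/// Hypothetical stand-in for a shared registry: one counter per label set.
#[derive(Default)]
struct ToyRegistry {
    counters: HashMap<Labels, u64>,
}

impl ToyRegistry {
    /// Increment the counter identified by this exact label set.
    fn inc(&mut self, labels: &[(&'static str, &'static str)]) {
        let key: Labels = labels.iter().copied().collect();
        *self.counters.entry(key).or_insert(0) += 1;
    }

    /// Number of distinct series recorded so far.
    fn series_count(&self) -> usize {
        self.counters.len()
    }
}

fn main() {
    // Both servers record with identical labels: the requests share one series.
    let mut shared = ToyRegistry::default();
    shared.inc(&[("method", "POST"), ("path", PATH), ("status", "ok")]);
    shared.inc(&[("method", "POST"), ("path", PATH), ("status", "ok")]);
    println!("without a server label: {} series", shared.series_count());

    // With an invented `server` label the two requests stay separate.
    let mut split = ToyRegistry::default();
    split.inc(&[("server", "main"), ("method", "POST"), ("path", PATH), ("status", "ok")]);
    split.inc(&[("server", "recovery"), ("method", "POST"), ("path", PATH), ("status", "ok")]);
    println!("with a server label: {} series", split.series_count());
}
```

In this model identical label sets do not conflict; they simply add up, which means the origin of a request cannot be recovered afterwards.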

Contributor Author

I'll run a quick check and see how it looks.

Contributor Author

@hiltontj - is there something specific needed when starting the process so that traces for any HTTP request are sent to the trace server? I'm using --traces-exporter-jaeger-agent-port 6831 --traces-exporter-jaeger-service-name influxdb3-core --traces-exporter jaeger. I only see traffic going to jaeger when I run queries; no other requests seem to generate any traces. I checked tcpdump too, and there's definitely nothing sent by influxdb3.

Contributor

It may be that we only use the jaeger hooks from the query path. I was thinking more of the prometheus metrics.

So, you could test by starting up the server, doing a regenerate with the recovery endpoint, and then hitting the main server's /metrics API to see if the metrics for the request to the recovery endpoint show up.
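
The last step of that check, looking for the recovery request in the metrics text, could be sketched as below. It assumes the metrics text has already been fetched from the main server by some other means; `samples_for_path` and `sample_value` are hypothetical helpers, and `SAMPLE` holds two `http_requests_total` lines in the format the main server reports for this path.

```rust
// Two sample lines in Prometheus text format for the regenerate path.
const SAMPLE: &str = r#"http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1"#;

/// Hypothetical helper: keep the sample lines whose `path` label equals `path`.
fn samples_for_path<'a>(metrics_text: &'a str, path: &str) -> Vec<&'a str> {
    let needle = format!("path=\"{}\"", path);
    metrics_text
        .lines()
        // Skip comment lines such as `# HELP` and `# TYPE`.
        .filter(|line| !line.starts_with('#') && line.contains(&needle))
        .collect()
}

/// Hypothetical helper: the sample value is the last space-separated field.
fn sample_value(line: &str) -> Option<f64> {
    line.rsplit(' ').next()?.parse().ok()
}

fn main() {
    let path = "/api/v3/configure/token/admin/regenerate";
    for line in samples_for_path(SAMPLE, path) {
        println!("{:?} <- {}", sample_value(line), line);
    }
}
```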

Contributor Author

Hmm, it looks like /metrics collects the details at the method and path level, so it assumes a single HTTP server. One thing to note is that the same path is available on both servers: the recovery server allows it to be called without an admin token, while the main server requires one to access the path. Looking at the /metrics output below, it isn't immediately obvious whether it was the recovery server or the main server that responded. Having said that, I'm not sure it entirely muddies the line either. @hiltontj - I can take it out if you think that'd be better.

http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.005"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.01"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.025"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.05"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.1"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.25"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="0.5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="2.5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="5"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="inf"} 1
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 0.003047878
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.001"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.0025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.005"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.01"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.025"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.05"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.25"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="0.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="2.5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="5"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10"} 0
http_request_duration_seconds_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="inf"} 0
http_request_duration_seconds_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_request_duration_seconds_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="aborted"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_requests_total{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="client_error"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="100000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="1000000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="10000000"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok",le="inf"} 1
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 319
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="ok"} 1
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="server_error"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="100"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="100000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="1000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="10000000"} 0
http_response_body_size_bytes_bucket{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response",le="inf"} 0
http_response_body_size_bytes_sum{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
http_response_body_size_bytes_count{method="POST",method_path="POST /api/v3/configure/token/admin/regenerate",path="/api/v3/configure/token/admin/regenerate",status="unexpected_response"} 0
influxdb3_catalog_operations_total{type="regenerate_admin_token"} 1

Contributor

I think this is okay. As long as it doesn't break the /metrics API, I think it's fine.

@praveen-influx praveen-influx force-pushed the pk/admin-token-recovery branch from 6fb81a9 to 393c981 Compare July 11, 2025 11:43
@praveen-influx praveen-influx requested a review from hiltontj July 14, 2025 12:10
Contributor

@jdstrand jdstrand left a comment

First off, thank you for the PR and making the token regeneration experience better! :)

Security considerations

  • User starts the server without passing in any ports to bind, in that case 0.0.0.0:8181 is used as default main server and 127.0.0.1:8182 for the token recovery server.
  • In case of regeneration of admin token with admin token, the user can still use 0.0.0.0:8181.
  • If the user has lost the admin token, then the user can jump into the container and then run recovery or regeneration commands as explained in the manual tests section below.
    • When run inside docker, we default to 127.0.0.1, which means the user cannot get to port 8182 (it's not exposed in Dockerfile).
    • For users running in non-containerized environments, it's possible to run on any socket address (host + port), but care must be taken not to expose it accidentally, which could allow unauthorized actors to recycle the admin token.

@jdstrand - please poke holes in the security considerations I've taken into account. Also, I've used --admin-token-recovery-http-bind as it is similar to --http-bind, although it is wordy. I'm happy to switch it to --regenerate-endpoint-listen, as mentioned in the issue comment, or to anything else that's more succinct, if that's a better option.

I have blackbox tested this and for the most part it looks ok, but the default behavior is a problem (which I'll discuss why in a moment). I'm a bit confused on the intended behavior for listening on 127.0.0.1:8182. The bulleted list of considerations strongly suggests that the intention is to allow listening on 127.0.0.1:8182, but other statements suggest the default might only be for containers and that for non-containers that it is maybe optional, but to be careful.

That aside:

  • influxdb3 serve unconditionally listens on 127.0.0.1:8182:

    $ influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0
    ...
    $ ss -atunp|grep -E 'LISTEN .*influxdb3'
    tcp    LISTEN  0       1024             127.0.0.1:8182           0.0.0.0:*       users:(("influxdb3",pid=133872,fd=19))
    tcp    LISTEN  0       1024               0.0.0.0:8181           0.0.0.0:*       users:(("influxdb3",pid=133872,fd=18))
    
  • While I was able to use INFLUXDB3_ADMIN_TOKEN_RECOVERY_HTTP_BIND_ADDR and --admin-token-recovery-http-bind to change away from the default, I wasn't able to turn it off altogether. Eg:

    $ influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0 --admin-token-recovery-http-bind=
    error: invalid value '' for '--admin-token-recovery-http-bind <ADMIN_TOKEN_RECOVERY_BIND_ADDRESS>': Cannot parse socket address '': invalid socket address
    
  • serve --help and serve --help-all don't seem to mention --admin-token-recovery-http-bind

  • calling create token --admin --regenerate --token "bad" works, which is a weird look (presumably since it defaults to 127.0.0.1:8182 which doesn't need a token)

  • noticed that influxdb3 create token --admin --host ... works but influxdb3 create token --admin -H ... doesn't

  • bikeshedding: --admin-recovery-listen is a bit less wordy yet still descriptive

In terms of the default behavior, I suspect this was optimized for the container case where it is presumed that network controls are in place to prevent access into the container. However, we must consider non-container environments, composed container environments (eg, docker compose with influxdb3_ui, grafana, etc alongside influxdb3), processing engine plugins and managed environments.

Listening on 127.0.0.1:8182 by default is a security hole, since a local user or process running on the system (including a processing engine plugin) can hit this port to change the token, which breaks the operator token for others and grants the caller full database privileges. This would qualify as a high to critical CVE if we released with this. This is not theoretical or hyperbolic: in the last couple of months alone I've triaged several issues in software we and our users use (python urllib3, kafka client and grafana) that could be leveraged by attackers to recycle the token (not to mention if someone poorly coded a request trigger). SSRFs and other security issues that allow connections to localhost are common and we need to cautiously design for this.

This behavior must be opt in (eg, server restart with non-default option) and the documentation of that behavior should have huge warnings. I don't know Amazon's architecture, but depending on how they handle the operator token, they might even ask that the functionality be entirely compiled out of the binary (we might want to do the same for InfluxData hosted Enterprise 3).

Considering all of the above, I suggest something along the lines of:

  • disabling listening on 127.0.0.1:8182 unless --admin-recovery-listen (or similar) is specified (ie, no listening on the recovery endpoint by default)
  • if --admin-recovery-listen is specified but without an argument, then listen on 127.0.0.1:8182
  • if --admin-recovery-listen=... is specified, use current behavior (ie, do what the user told you)
  • when --admin-recovery-listen is specified, after the user regenerates the token, stop listening on this endpoint

I believe this has the correct balance of usability, caution and security. If someone has forgotten their operator token, they need to take a deliberate step to restart the server with an option to enable the special recovery endpoint. Then, once the recovery endpoint is used to regenerate a token, the server tears down the recovery endpoint (but keeps listening on the normal endpoint).
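
The option semantics proposed in the list above can be sketched as a small resolver. This is a minimal sketch under stated assumptions, not the PR's code: `RecoveryFlag`, `resolve_recovery_addr` and `DEFAULT_RECOVERY_ADDR` are hypothetical names, and only the standard library is used. `Ok(None)` means the recovery listener is never started.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Loopback default discussed in this thread (hypothetical constant name).
const DEFAULT_RECOVERY_ADDR: &str = "127.0.0.1:8182";

/// How the recovery option appeared on the command line (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryFlag<'a> {
    /// Option not given: the recovery endpoint stays off.
    Absent,
    /// Option given without a value: fall back to the loopback default.
    Bare,
    /// Option given with an explicit `IP:PORT`: do what the user asked.
    Value(&'a str),
}

/// Resolves the address to listen on, or `Ok(None)` when nothing should listen.
pub fn resolve_recovery_addr(
    flag: RecoveryFlag<'_>,
) -> Result<Option<SocketAddr>, AddrParseError> {
    match flag {
        RecoveryFlag::Absent => Ok(None),
        RecoveryFlag::Bare => DEFAULT_RECOVERY_ADDR.parse::<SocketAddr>().map(Some),
        RecoveryFlag::Value(raw) => raw.parse::<SocketAddr>().map(Some),
    }
}
```

Stopping the listener after a successful regeneration (the last bullet) is a runtime concern and is not modelled here.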

@praveen-influx
Contributor Author

praveen-influx commented Jul 15, 2025

@jdstrand - thanks for the feedback 🙏 If I understood correctly, the secondary server should start only when the user opts in and shut down as soon as regeneration is done. Would it be OK to separate the arguments out as follows?

1. --with-admin-token-recovery, that will start the server.
2. --admin-token-recovery-http-bind, that will take the address/port to listen on.

If (2) isn't passed in, then it'll bind on 127.0.0.1:8182 by default.

I'll try to see if I can get it to work without the extra flag, but I wanted to check whether splitting it into an extra flag is a viable option, as I'm not sure we use a single argument anywhere else as both a flag and an argument that accepts a value.

EDIT: I've got the --admin-token-recovery-http-bind to work as both a flag and also as an argument that accepts a value. So, my suggestion above is irrelevant.

Regarding containerised environments, I was wondering what the default behaviour should be in terms of exposing ports in the Dockerfile. I haven't exposed 8182; should it be exposed by default, given that influxdb3 will start and stop the server based on config and the first use?

I also need to think through how to make it work only once in enterprise with a multi-node setup if all nodes are started with the flag, as I'm guessing it would be once globally and not once per node.

@jdstrand
Contributor

EDIT: I've got the --admin-token-recovery-http-bind to work as both a flag and also as an argument that accepts a value. So, my suggestion above is irrelevant.

Nice!

Regarding containerised environments, I was wondering what the default behaviour should be in terms of exposing ports in the Dockerfile. I haven't exposed 8182; should it be exposed by default, given that influxdb3 will start and stop the server based on config and the first use?

I'm inclined to think 'no'. Since the user needs to restart the server to add --admin-token-recovery-http-bind, they can expose the port at that time (the documentation can cover this). In this manner, we are erring on the side of security.

@praveen-influx praveen-influx force-pushed the pk/admin-token-recovery branch from 29dada9 to e3a71f3 Compare July 16, 2025 13:20
Contributor

@hiltontj hiltontj left a comment

Looks good to me.

I have a few more inline comments, none of which are blocking, but the one about the test and the one about the RAII guard are pretty straightforward, so it's up to you whether you want to address them in this PR or a follow-on PR.

@@ -82,6 +82,9 @@ pub const DEFAULT_DATA_DIRECTORY_NAME: &str = ".influxdb3";
/// The default bind address for the HTTP API.
pub const DEFAULT_HTTP_BIND_ADDR: &str = "0.0.0.0:8181";

/// The default bind address for admin token recovery HTTP API.
pub const DEFAULT_ADMIN_TOKEN_RECOVERY_BIND_ADDR: &str = "127.0.0.1:8182";

Contributor

Not sure that this matters, but the default here uses 127.0.0.1 vs. 0.0.0.0 for the main bind address

Contributor Author

It is intentional: we want it to default to the loopback address only. Listening on any other address then becomes an opt-in, since any client with access to this interface can regenerate the admin token without authenticating.

Contributor

Nice, that makes sense, thanks for clarifying.

}

#[test_log::test(tokio::test)]
async fn test_recovery_endpoint_auto_shutdown_after_regeneration() {

Contributor

This test does not verify that the recovery endpoint is shut down after the token is regenerated, but the name of the test implies that it does.

Contributor Author

Hmm, I thought that test had the assertion. I'll add the assertion again.

Comment on lines 1896 to 1897
// Small delay to ensure HTTP response is fully sent
tokio::time::sleep(Duration::from_millis(100)).await;

Contributor

I think you could do this reliably without the use of a sleep. Something like a RAII guard.

My suggestion would be something like this:

struct Guard {
    shutdown_token: ShutdownToken,
}

impl Guard {
    fn new(shutdown_token: ShutdownToken) -> Self {
        Self { shutdown_token }
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.shutdown_token.cancel();
    }
}

Then, in the above method, you could do something like

let guard = Guard::new(recovery_api.shutdown_token.clone());

// Extend the response with the guard so cancel is called when the response is dropped:
let mut builder = ResponseBuilder::new();
let extensions = builder.extensions_mut().unwrap();
extensions.insert(guard);
// finish building response and return
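
The drop-triggered shutdown suggested above can be tried out with the standard library alone. This is a minimal sketch, assuming a stand-in `ShutdownFlag` in place of the PR's `ShutdownToken` and a stand-in `Response` in place of a real HTTP response with extensions; `regenerate_response` and the `"new-token"` body are placeholders for illustration only.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Stand-in for the PR's `ShutdownToken`; only cancellation is modelled.
#[derive(Clone, Default)]
pub struct ShutdownFlag(Arc<AtomicBool>);

impl ShutdownFlag {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Cancels the flag when dropped, mirroring the `Guard` suggested above.
pub struct Guard {
    shutdown: ShutdownFlag,
}

impl Guard {
    pub fn new(shutdown: ShutdownFlag) -> Self {
        Self { shutdown }
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.shutdown.cancel();
    }
}

/// Stand-in for an HTTP response that carries the guard alongside its body.
pub struct Response {
    pub body: String,
    _guard: Guard,
}

/// Builds the response; shutdown fires only once the response is dropped.
pub fn regenerate_response(shutdown: &ShutdownFlag) -> Response {
    Response {
        body: String::from("new-token"), // placeholder payload
        _guard: Guard::new(shutdown.clone()),
    }
}
```

Because the cancel happens in `Drop`, it is tied to the lifetime of the response value rather than to a timer, which is the property the sleep was approximating.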

Contributor Author

@praveen-influx praveen-influx Jul 17, 2025

Using builder extension is a really neat idea ❤️ - I'll change it.

Comment on lines +178 to +183
let http_metrics =
RequestMetrics::new(Arc::clone(&common_state.metrics), MetricFamily::HttpServer);

Contributor

It may be that we only use the jaeger hooks from the query path. I was thinking more of the prometheus metrics.

So, you could test by starting up the server, doing a regenerate with the recovery endpoint, and then hitting the main server's /metrics API to see if the metrics for the request to the recovery endpoint show up.
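
A check like the one described above needs some way to pick a single sample out of the /metrics response body. The helper below is a minimal sketch, assuming the plain text format shown earlier in this thread (one sample per line, no trailing timestamps); `sample_value` is a hypothetical name, not a function from the PR.

```rust
/// Returns the value of the sample whose series text (metric name plus label
/// set) equals `series`, or `None` if it is absent or not a number.
pub fn sample_value(metrics_body: &str, series: &str) -> Option<f64> {
    metrics_body
        .lines()
        .map(str::trim)
        // Skip blank lines and `# HELP` / `# TYPE` comments.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .find_map(|line| {
            // Label values can contain spaces ("POST /api/..."), so split on
            // the last space to separate the series from its value.
            let (name_and_labels, value) = line.rsplit_once(' ')?;
            if name_and_labels == series {
                value.parse::<f64>().ok()
            } else {
                None
            }
        })
}
```

A test could then assert that the `influxdb3_catalog_operations_total{type="regenerate_admin_token"}` series reads 1 after a regeneration through the recovery endpoint.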

@praveen-influx praveen-influx force-pushed the pk/admin-token-recovery branch 3 times, most recently from f89d91b to a4f3da2 Compare July 17, 2025 11:30
Contributor

@jdstrand jdstrand left a comment

Thanks for the changes! I love the tests and comments in the code.

I did some manual blackbox testing (not exhaustive as test coverage is very good):

  • GOOD: influxdb3 serve --object-store=file --data-dir /tmp/data --node-id node0 confirmed to only listen on the main server endpoint
  • GOOD: still requires auth when not using --without-auth
  • GOOD: does not require auth when using --without-auth
  • GOOD: create token --admin generates the initial operator token, and errors when tried again
  • GOOD: show databases works with operator token and fails with no token/bad token
  • OK: create token --admin --regenerate correctly fails, but with a connection refused error when --admin-token-recovery-http-bind is not specified. This is fine for security but is a usability issue: we should send back 401
  • BUG: create token --admin --token "$TOKEN" --regenerate returns with connection refused when NOT using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API
  • BUG: create token --admin --token "bad" --regenerate succeeds when using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API
  • BUG: create token --admin --token "$TOKEN" -H http://127.0.0.1:8181 --regenerate fails because -H is not accepted (but --host is)
  • GOOD: show databases --token "$TOKEN" works with new token after regeneration via main server API and fails with old
  • GOOD: starting with --admin-token-recovery-http-bind starts the main server and the recovery endpoint on 127.0.0.1:8182
  • GOOD: starting with --admin-token-recovery-http-bind=127.0.0.1:8183 starts the main server and the recovery endpoint on 127.0.0.1:8183
  • GOOD: create token --admin --regenerate correctly regenerates the token when server started with --admin-token-recovery-http-bind
  • GOOD: the recovery endpoint shuts down after using create token --admin --regenerate
  • GOOD: show databases --token "$TOKEN" works with new token after regeneration via recovery API and fails with old

I'm approving as is, but it would be nice to fix the 3 non-security bugs (I think 2 can be fixed in the same way, unconditionally use the main server API when --token is specified) and one usability issue. Those can be in this PR or a followup.

// Only create recovery listener if explicitly enabled
let admin_token_recovery_listener = if let Some(addr) = config.admin_token_recovery_bind_address
{
info!(%addr, "Admin token recovery endpoint enabled - WARNING: This allows unauthenticated admin token regeneration!");

Contributor

Nice

@praveen-influx
Contributor Author

create token --admin --token "$TOKEN" --regenerate returns with connection refused when NOT using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API

@jdstrand - at the moment the client checks for --regenerate to decide which host to default to. Because the client can't tell whether the server was started with --admin-token-recovery-http-bind, it defaults to the recovery server whenever --regenerate is passed in (even if the recovery server itself isn't started). I tried to highlight this in the client help message (i.e. influxdb3 create token --admin --help): --regenerate Regenerate the operator token (uses port 8182 by default instead of 8181).

If this behavior is confusing, I can swap it so that when you pass in --regenerate you always have to set the host to reach the recovery server; otherwise it defaults to the main server and NOT the recovery server. Would that be a better experience?

@jdstrand
Contributor

create token --admin --token "$TOKEN" --regenerate returns with connection refused when NOT using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API

@jdstrand - at the moment the client checks for --regenerate to decide which host to default to. Because the client can't tell whether the server was started with --admin-token-recovery-http-bind, it defaults to the recovery server whenever --regenerate is passed in (even if the recovery server itself isn't started). I tried to highlight this in the client help message (i.e. influxdb3 create token --admin --help): --regenerate Regenerate the operator token (uses port 8182 by default instead of 8181).

If this behavior is confusing, I can swap it so that when you pass in --regenerate you always have to set the host to reach the recovery server; otherwise it defaults to the main server and NOT the recovery server. Would that be a better experience?

I do think the behavior is confusing. You could have the client decide based on the presence of --token or not (if present, use main, if not, use recovery). However, I think your idea with swapping is very clear for users and a very deliberate choice (much like starting it with the option is) which removes magic from the equation on this security-sensitive operation.

@praveen-influx
Contributor Author

@jdstrand - I'll swap it and ping you for another review. Thanks again for taking time to review it 🙏

@praveen-influx
Contributor Author

@jdstrand - I've made the changes to address your 3 points:

BUG: create token --admin --token "$TOKEN" --regenerate returns with connection refused when NOT using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API
BUG: create token --admin --token "bad" --regenerate succeeds when using --admin-token-recovery-http-bind. It seems that if --token is specified, the client should use the main server API, not this recovery API

As discussed above, --regenerate now defaults to the main server unless --host is specified, and the main server only performs the regeneration if an admin token is supplied.

BUG: create token --admin --token "$TOKEN" -H http://127.0.0.1:8181 --regenerate fails because -H is not accepted (but --host is)

Help text now includes the -H option. Below is another round of tests.

Manual Test Results

Test Environment

  • Binary: ./target/debug/influxdb3
  • Test data directories: /home/praveen/projects/influx/test-data/core*
  • Date: 2025-07-18

1. Help Text Verification ✅

1.1 Token Creation Help

$ ./target/debug/influxdb3 create token --admin --help

Result:

Create or regenerate an admin token

Usage: influxdb3 create token --admin [OPTIONS]

Options:
      --regenerate       Regenerate the operator token. By default connects to the main server (http://127.0.0.1:8181).
                                         To use the admin token recovery endpoint, specify --host with the recovery endpoint address
      --name <NAME>      Name of the token
      --expiry <EXPIRY>  Expires in `duration`, e.g 10d for 10 days 1y for 1 year
  -H, --host <host>      The host URL of the running InfluxDB 3 Core server [env: INFLUXDB3_HOST_URL=] [default: http://127.0.0.1:8181]
      --token <token>    The token for authentication with the InfluxDB 3 Core server to create permissions. This will be the admin token to create tokens with permissions [env: INFLUXDB3_AUTH_TOKEN=]
      --tls-ca <tls-ca>  An optional arg to use a custom ca for useful for testing with self signed certs
      --format <FORMAT>  Output format for token, supports just json or text [possible values: json, text]
  -h, --help             Print help information
      --help-all         Print more detailed help information

✅ VERIFIED: Help text correctly shows:

  • --regenerate defaults to main server (http://127.0.0.1:8181)
  • Instructions to use recovery endpoint with explicit --host
  • No mention of automatic port 8182 behavior

1.2 Server Help

$ ./target/debug/influxdb3 serve --help-all | grep -A5 -B5 "admin-token-recovery"

Result:

  --without-auth                   Run InfluxDB 3 server without authorization
  --disable-authz <RESOURCES>      Optionally disable authz by passing in a comma separated
                                   list of resources. Valid values are health, ping, and metrics.
                                   To disable auth for multiple resources pass in a list, eg.
                                   `--disable-authz health,ping`
  --admin-token-recovery-http-bind <ADDR>
                                   Address for HTTP API for admin token recovery requests [default: 127.0.0.1:8182]
                                   WARNING: This endpoint allows unauthenticated admin token regeneration - use with caution!
                                   [env: INFLUXDB3_ADMIN_TOKEN_RECOVERY_HTTP_BIND_ADDR]

✅ VERIFIED: Recovery endpoint parameter is shown with proper warning

2. Server Without Recovery Endpoint ✅

2.1 Start Server

$ cargo run -- serve --node-id node-1 --object-store file --data-dir /home/praveen/projects/influx/test-data/core --disable-telemetry-upload --http-bind 127.0.0.1:8181

2.2 Test Regenerate with Default Settings

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate

Result:

Are you sure you want to regenerate admin token? Enter 'yes' to confirm
Failed to create token, error: ApiError { code: 401, message: "" }

✅ VERIFIED: Correctly tries main server (8181) and gets 401 Unauthorized, NOT connection refused to 8182

3. Server With Default Recovery Endpoint ✅

3.1 Start Server with Recovery Endpoint

$ cargo run -- serve --node-id node-2 --object-store file --data-dir /home/praveen/projects/influx/test-data/core2 --disable-telemetry-upload --admin-token-recovery-http-bind --http-bind 127.0.0.1:8281

Server logs show:

INFO influxdb3_lib::commands::serve: Admin token recovery endpoint enabled - WARNING: This allows unauthenticated admin token regeneration! addr=127.0.0.1:8182
INFO influxdb3_server: starting admin token recovery endpoint on address=127.0.0.1:8182

3.2 Create Initial Admin Token

$ ./target/debug/influxdb3 create token --admin --host http://127.0.0.1:8281

Result: Token created successfully

3.3 Test Regenerate Without Explicit Recovery Host

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8281

Result:

Failed to create token, error: ApiError { code: 401, message: "" }

✅ VERIFIED: Tries main server, fails with 401 (not recovery endpoint)

3.4 Test Regenerate with Explicit Recovery Endpoint

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8182

Result: Successfully regenerated token

Server logs show:

INFO influxdb3_server::http: Admin token regenerated successfully, shutting down recovery endpoint
INFO influxdb3_server: Admin token recovery endpoint shutting down after token regeneration

3.5 Verify Recovery Endpoint Shutdown

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8182

Result:

Failed to create token, error: RequestSend { ... ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" }) } }

✅ VERIFIED: Recovery endpoint automatically shut down after successful regeneration

3.6 Test Regenerate on Main Server with Token

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8281 --token <NEW_TOKEN>

Result: Successfully regenerated token on main server

4. Server With Custom Recovery Endpoint ✅

4.1 Start Server with Custom Address

$ cargo run -- serve --node-id node-3 --object-store file --data-dir /home/praveen/projects/influx/test-data/core3 --disable-telemetry-upload --admin-token-recovery-http-bind 0.0.0.0:9999 --http-bind 127.0.0.1:8381

Server logs show:

INFO influxdb3_lib::commands::serve: Admin token recovery endpoint enabled - WARNING: This allows unauthenticated admin token regeneration! addr=0.0.0.0:9999
INFO influxdb3_server: starting admin token recovery endpoint on address=0.0.0.0:9999

4.2 Create Admin Token

$ ./target/debug/influxdb3 create token --admin --host http://127.0.0.1:8381

Result: Token created successfully

4.3 Test Default Regenerate (Should Fail)

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8381

Result:

Failed to create token, error: ApiError { code: 401, message: "" }

✅ VERIFIED: Does NOT try default 8182, correctly tries main server

4.4 Test with Custom Recovery Endpoint

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:9999

Result: Successfully regenerated token

4.5 Verify Shutdown

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:9999

Result: Connection refused

✅ VERIFIED: Custom recovery endpoint also shuts down after use

5. Recovery Endpoint Isolation ✅

5.1 Test Database Creation on Recovery Endpoint

$ ./target/debug/influxdb3 create database --host http://127.0.0.1:8182 testdb

Result:

Create command failed: server responded with error [404 Not Found]: not found

5.2 Test Table Creation on Recovery Endpoint

$ ./target/debug/influxdb3 create table --host http://127.0.0.1:8182 --database foo bar

Result:

Create command failed: server responded with error [404 Not Found]: not found

✅ VERIFIED: Recovery endpoint correctly returns 404 for all non-recovery operations
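
The isolation checked in section 5 amounts to a router with a single known route. A minimal sketch, assuming the regenerate path that appears in the metrics output earlier in this thread; `RecoveryRoute` and `recovery_route` are hypothetical names, and treating a wrong HTTP method on the regenerate path as "not found" is an assumption of this sketch rather than something the tests above cover.

```rust
/// The only path the recovery listener is expected to serve.
pub const REGENERATE_PATH: &str = "/api/v3/configure/token/admin/regenerate";

/// Outcome of routing a request on the recovery listener (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryRoute {
    /// The request is an admin token regeneration.
    RegenerateAdminToken,
    /// Anything else is answered with 404 Not Found.
    NotFound,
}

/// Routes a request by method and path; everything but the regenerate
/// call falls through to `NotFound`.
pub fn recovery_route(method: &str, path: &str) -> RecoveryRoute {
    match (method, path) {
        ("POST", REGENERATE_PATH) => RecoveryRoute::RegenerateAdminToken,
        _ => RecoveryRoute::NotFound,
    }
}
```

Keeping the match exhaustive with a catch-all arm means that routes added to the main server later are not reachable here unless someone adds them deliberately.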

6. Edge Cases ✅

6.1 Server Without Admin Token

Start server with auth disabled:

$ cargo run -- serve --node-id node-5 --object-store file --data-dir /home/praveen/projects/influx/test-data/core5 --disable-telemetry-upload --admin-token-recovery-http-bind 127.0.0.1:8183 --http-bind 127.0.0.1:8581 --without-auth

Try to regenerate:

$ echo "yes" | ./target/debug/influxdb3 create token --admin --regenerate --host http://127.0.0.1:8183

Result:

Failed to create token, error: ApiError { code: 500, message: "missing admin token, cannot update" }

✅ VERIFIED: Correctly reports that there's no admin token to regenerate

Summary

All tests pass successfully! The key behavioral changes are verified:

  1. --regenerate now defaults to the main server (8181), not the recovery endpoint (8182)
  2. Users must explicitly specify --host to use the recovery endpoint
  3. Recovery endpoint still auto-shuts down after successful regeneration
  4. Recovery endpoint is properly isolated - returns 404 for non-recovery operations
  5. Edge cases handled correctly - proper error when no token exists to regenerate

The changes make the behavior more explicit and predictable, requiring users to consciously choose to use the less-secure recovery endpoint rather than having it as an automatic default.
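
Points 1 and 2 of the summary reduce to a simple host-resolution rule, sketched below. `target_host` is a hypothetical helper, not the CLI's actual code; the default URL is the one quoted in the updated help text.

```rust
/// Default target of the CLI, as stated in the updated --regenerate help text.
const MAIN_SERVER: &str = "http://127.0.0.1:8181";

/// Picks the server the CLI talks to.
/// `_regenerate` is deliberately unused: passing --regenerate no longer changes the default.
fn target_host(host_flag: Option<&str>, _regenerate: bool) -> &str {
    // Only an explicit --host can redirect the request to the recovery endpoint.
    host_flag.unwrap_or(MAIN_SERVER)
}
```

With no `--host`, both `target_host(None, false)` and `target_host(None, true)` resolve to the main server; the recovery endpoint is reached only via something like `target_host(Some("http://127.0.0.1:8182"), true)`.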


@jdstrand jdstrand left a comment


Thanks for the updates! I didn't do a full round of manual testing (I see that you did), but can confirm that it is behaving as expected:

  • the recovery endpoint is not listening when --admin-token-recovery-http-bind is not specified, listens on 127.0.0.1:8182 when --admin-token-recovery-http-bind is specified without an argument, and listens on the given address when --admin-token-recovery-http-bind has an argument
  • create token --admin --regenerate (without -H) requires a token regardless of whether --admin-token-recovery-http-bind is specified (i.e., it is hitting the main API)
  • create token --admin --regenerate -H <recovery endpoint> does not require a token and shuts down the endpoint after hitting it
  • token is properly regenerated in all cases

The usability issue, -H bug and connection refused bug with --regenerate when --admin-token-recovery-http-bind is not specified are all addressed.

We still have the issue with this command: create token --admin --token "bad" --regenerate --host <recovery endpoint> succeeding (because it is silently ignoring the --token argument). Not a blocker.

  #[clap(
  name = "regenerate",
  long = "regenerate",
- help = "Regenerate the operator token (uses port 8182 by default instead of 8181)"
+ help = "Regenerate the operator token. By default connects to the main server (http://127.0.0.1:8181).
+ To use the admin token recovery endpoint, specify --host with the recovery endpoint address"

Nice!

- new server to only serve admin token regeneration without an admin
  token has been added
- minor refactors to allow reuse of some of the utilities like trace
  layer for metrics moved to their own functions to allow them to be
  instantiated for both servers
- tests added to check if both the new server works right for
  regenerating token and also ensure none of the other functionalities
  are available on the admin token recovery server

closes: #26330
- recovery server now only starts when `--admin-token-recovery-http-bind` is passed in
- as soon as regeneration is done, the recovery server shuts itself down
- the select! macro logic has been changed such that shutting down
  the recovery server does not shut down the main server
- when `--regenerate` is passed in, `--host` still defaults to the main
  server. To get to the recovery server, `--host` with the recovery
  server address should be passed in
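
The shutdown rule described in this commit message (the recovery server exiting must not stop the main server) can be illustrated with plain threads and channels. This is an analogy under assumptions, not the PR's code: the PR uses an async `select!`, whereas `supervise` and `Event` here are hypothetical names built on `std` only.

```rust
use std::sync::mpsc;
use std::thread;

/// Events that each server task reports back to the supervisor.
#[derive(Debug, PartialEq)]
enum Event {
    RecoveryFinished,
    MainFinished,
}

/// Runs both "servers" and returns the order in which they finished.
fn supervise() -> Vec<Event> {
    let (events_tx, events_rx) = mpsc::channel();
    let (main_stop_tx, main_stop_rx) = mpsc::channel::<()>();

    // Main server: keeps running until it receives its own stop signal.
    let main_tx = events_tx.clone();
    let main = thread::spawn(move || {
        let _ = main_stop_rx.recv();
        main_tx.send(Event::MainFinished).unwrap();
    });

    // Recovery server: finishes on its own right after its single regeneration.
    let recovery = thread::spawn(move || {
        events_tx.send(Event::RecoveryFinished).unwrap();
    });

    let mut seen = Vec::new();
    // The recovery exit is merely recorded; it is not treated as a global shutdown.
    seen.push(events_rx.recv().unwrap());
    // Only an explicit signal (standing in for Ctrl-C) stops the main server.
    main_stop_tx.send(()).unwrap();
    seen.push(events_rx.recv().unwrap());

    recovery.join().unwrap();
    main.join().unwrap();
    seen
}
```

The design point is that the supervisor waits on each server separately instead of treating "whichever finishes first" as the signal to tear everything down.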
@praveen-influx praveen-influx force-pushed the pk/admin-token-recovery branch from 2f78473 to 40102ae Compare July 18, 2025 16:17
@praveen-influx

We still have the issue with this command: create token --admin --token "bad" --regenerate --host <recovery endpoint> succeeding (because it is silently ignoring the --token argument). Not a blocker.

Good point, I can make --token indicate that the request is intended for the main server. In the current implementation I assumed --host takes priority in deciding where the request goes. Not entirely related but when server is started --without-auth, the requests still are served if the token is set in the header (by silently ignoring the token). Once the pipeline succeeds I'll merge this PR, but please feel free to comment on this behavior and I can address it in a subsequent PR if we think it requires changing.

@jdstrand

Not entirely related but when server is started --without-auth, the requests still are served if the token is set in the header (by silently ignoring the token)

Perhaps it's sufficient to document that --token is ignored when used with the recovery endpoint (ie, just a 'help' update)? As noted, this is not a blocker.


@hiltontj hiltontj left a comment


:shipit:

@praveen-influx praveen-influx merged commit a6bec9c into main Jul 18, 2025
12 checks passed
Successfully merging this pull request may close these issues.

Feature Request: Admin Token Recovery
3 participants