tidb_enable_tso_follower_proxy causes 10s unavailability during graceful leader transfer #9188

@Tema

Bug Report

What did you do?

  1. Enable the TSO Follower Proxy:
mysql> SET GLOBAL tidb_enable_tso_follower_proxy = ON;
Query OK, 0 rows affected (0.11 sec)
  2. Start a write-intensive benchmark:
sysbench oltp_update_non_index --non_index_updates=1 --mysql-ignore-errors=1105 --skip_trx=true --tables=1 --table_size=16384  --threads=64 --mysql-host=${HOST} --mysql-user=root --mysql-db=test --mysql-port=4000 --report-interval=1 --time=300 run
...
[ 1s ] thds: 64 tps: 1944.84 qps: 1944.84 (r/w/o: 0.00/1944.84/0.00) lat (ms,95%): 112.67 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 64 tps: 1934.00 qps: 1934.00 (r/w/o: 0.00/1934.00/0.00) lat (ms,95%): 121.08 err/s: 0.00 reconn/s: 0.00
...
  3. Resign the PD leader:
pd-ctl >> » member leader resign
Success!

What did you expect to see?

No impact, or at most a minimal sub-second impact, as is the case when resigning the leader with tidb_enable_tso_follower_proxy=OFF:

[ 139s ] thds: 64 tps: 1976.67 qps: 1976.67 (r/w/o: 0.00/1976.67/0.00) lat (ms,95%): 123.28 err/s: 0.00 reconn/s: 0.00
[ 140s ] thds: 64 tps: 1701.14 qps: 1701.14 (r/w/o: 0.00/1701.14/0.00) lat (ms,95%): 137.35 err/s: 2.00 reconn/s: 0.00
[ 141s ] thds: 64 tps: 1968.29 qps: 1968.29 (r/w/o: 0.00/1968.29/0.00) lat (ms,95%): 121.08 err/s: 0.00 reconn/s: 0.00

With the proxy off, I observe only a negligible QPS drop (about 14%, from 1976.67 to 1701.14) that recovers within 1 second.

What did you see instead?

About 10 seconds of downtime: throughput drops at 20s, is zero from 21s through 28s, and fully recovers at 30s:

[ 19s ] thds: 64 tps: 1957.00 qps: 1957.00 (r/w/o: 0.00/1957.00/0.00) lat (ms,95%): 123.28 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 64 tps: 689.90 qps: 689.90 (r/w/o: 0.00/689.90/0.00) lat (ms,95%): 112.67 err/s: 0.00 reconn/s: 0.00
[ 21s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 22s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 23s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 24s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 25s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 26s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 27s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 28s ] thds: 64 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 29s ] thds: 64 tps: 994.03 qps: 994.03 (r/w/o: 0.00/994.03/0.00) lat (ms,95%): 9118.47 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 64 tps: 1926.30 qps: 1926.30 (r/w/o: 0.00/1926.30/0.00) lat (ms,95%): 114.72 err/s: 0.00 reconn/s: 0.00
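
To make the stall easy to measure across repeated runs, the sysbench report lines can be scanned for the longest run of zero-throughput seconds. The following is only a rough sketch, not part of PD or sysbench: the helper names `parse_tps` and `longest_stall` are hypothetical, and it assumes `--report-interval=1` with one report line per second and no gaps. The sample is the output above, truncated to the `tps` field.

```python
import re

# Matches sysbench interval report lines such as:
#   [ 21s ] thds: 64 tps: 0.00 qps: 0.00 ...
REPORT_RE = re.compile(r"\[\s*(\d+)s\s*\]\s+thds:\s*\d+\s+tps:\s*([0-9.]+)")


def parse_tps(log_text):
    """Return a list of (second, tps) tuples found in the sysbench output."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in REPORT_RE.finditer(log_text)]


def longest_stall(samples, threshold=0.0):
    """Return (start_second, length) of the longest run of consecutive
    samples whose tps is <= threshold, or None if there is no such run.

    Assumes one sample per second with no missing intervals.
    """
    best = None
    run_start, run_len = None, 0
    for second, tps in samples:
        if tps <= threshold:
            if run_len == 0:
                run_start = second
            run_len += 1
            if best is None or run_len > best[1]:
                best = (run_start, run_len)
        else:
            run_len = 0
    return best


# Report lines from this issue, truncated after the tps field.
SAMPLE = """
[ 19s ] thds: 64 tps: 1957.00
[ 20s ] thds: 64 tps: 689.90
[ 21s ] thds: 64 tps: 0.00
[ 22s ] thds: 64 tps: 0.00
[ 23s ] thds: 64 tps: 0.00
[ 24s ] thds: 64 tps: 0.00
[ 25s ] thds: 64 tps: 0.00
[ 26s ] thds: 64 tps: 0.00
[ 27s ] thds: 64 tps: 0.00
[ 28s ] thds: 64 tps: 0.00
[ 29s ] thds: 64 tps: 994.03
[ 30s ] thds: 64 tps: 1926.30
"""

print(longest_stall(parse_tps(SAMPLE)))  # (21, 8)
```

This counts only the fully stalled seconds (8 here); the partially degraded intervals at 20s and 29s bring the total impact to roughly 10 seconds.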

What version of PD are you using (pd-server -V)?

Both TiDB and PD are on v8.5.1 (LTS).

Labels

affects-7.1: This bug affects the 7.1.x (LTS) versions.
affects-7.5: This bug affects the 7.5.x (LTS) versions.
affects-8.1: This bug affects the 8.1.x (LTS) versions.
affects-8.5: This bug affects the 8.5.x (LTS) versions.
affects-9.0: This bug affects the 9.0.x versions.
severity/major
type/bug: The issue is confirmed as a bug.