The --exact-rowcount option is buggy on large tables

### Context

During a schema change, gho-st copies every row from the original table to a ghost table. This results in a lock on the original table and a long running query similar to the following

```
mysql> show open tables where in_use>0;
+------------+----------------------+--------+-------------+
| Database   | Table                | In_use | Name_locked |
+------------+----------------------+--------+-------------+
| my_service | my_table             |      1 |           0 |
+------------+----------------------+--------+-------------+
1 row in set (0.00 sec)

mysql> show full processlist;
...
| 1594199 | my_service_ddl | 10.137.166.91:35626  | my_service         | Query       |    0 | Sending data                                                  | select  /* gh-ost `my_service`.`my_table` iteration:108 */
						`id`
					from
						`my_service`.`my_table`
					where ((`id` > _binary'1131891')) and ((`id` < _binary'310834793') or ((`id` = _binary'310834793')))
					order by
						`id` asc
					limit 1
					offset 9999 |
```

It appears that when [`--exact-rowcount`](https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#exact-rowcount) is enabled, ghost grabs another lock on the original table and starts an addition query to calculate a more accurate progress ETA. By default, row counting happens in parallel to row copying. 

```
mysql> show open tables where in_use>0;
+------------+----------------------+--------+-------------+
| Database   | Table                | In_use | Name_locked |
+------------+----------------------+--------+-------------+
| my_service | my_table             |      2 |           0 |
+------------+----------------------+--------+-------------+
1 row in set (0.00 sec)

mysql> show full processlist;
...
| 1594046 | my_service_ddl | 10.137.166.91:57666  | my_service         | Query       |   71 | Sending data                  | select /* gh-ost */ count(*) as rows from `my_service`.`my_table`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
...
```

### Problem
After all the rows are copied to the ghost table, gh-ost attempts to cut-over the original table. 

Usually, the row counting query finishes before all the rows are copied to the ghost table. At the time of cut-over, because both queries have finished, there is no lock on the origin table. Hence the schema change is successful. 

However, we repeatedly found that on a large table (> 400GB), row counting query takes longer than row copying. At the time of cut-over, the row counting query is still running and holding a lock on the original table. 

This causes the schema change to fail with this error: "Error 1205: Lock wait timeout exceeded; try restarting transaction", which was described in https://github.com/github/gh-ost/issues/773

### Proposals
If you run into this issue, try disabling `--exact-rowcount`.

Improvements that we could make
* Document this bug; add a warning.
* Teach gh-ost to terminate the row counting query before cutting over the original table.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

The --exact-rowcount option is buggy on large tables #830

Context

Problem

Proposals

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

The --exact-rowcount option is buggy on large tables #830

Description

Context

Problem

Proposals

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions