Skip to content

Define default collation for TiDB tables #22271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 5, 2024
Merged

Define default collation for TiDB tables #22271

merged 7 commits into from
Jun 5, 2024

Conversation

mneudert
Copy link
Member

@mneudert mneudert commented May 31, 2024

Description:

TiDB uses utf8mb4_general_bin as the default collation when creating a table. Compared to the MySQL default of utf8mb4_general_ci the sorting of some database results will be different.

This PR changes the table/database creation code to be more consistent across engines and actions:

  1. Always define ROW_FORMAT=dynamic

    Previously, the row format was only defined for tables created during a migration.

    For InnoDB, all tables currently should default to ROW_FORMAT=DYNAMIC, adding it to all tables by default should (as far as I know) not be an issue.

    Note: TiDB currently ignores the ROW_FORMAT and always uses COMPACT (docs).

  2. Set a default collation for TiDB tables

    When TiDB is configured, and the (default) utf8mb4 charset is used, all created tables will be created with a matching COLLATE=utf8mb4_0900_ai_ci.

  3. Set a default collation for TiDB databases

    Creating a database also sets the default collation if TiDB is configured.

Refs DEV-18061

Review

@mneudert mneudert self-assigned this May 31, 2024
@mneudert mneudert added this to the 5.2.0 milestone Jun 3, 2024
@mneudert mneudert added c: Platform For Matomo platform changes that aren't impacting any of our APIs but improve the core itself. c: Tests & QA For issues related to automated tests or making it easier to QA & test issues. Needs Review PRs that need a code review Technical debt Issues the will help to reduce technical debt labels Jun 3, 2024
@mneudert mneudert requested a review from a team June 3, 2024 18:04
@mneudert mneudert marked this pull request as ready for review June 3, 2024 18:04
@mneudert mneudert removed their assignment Jun 3, 2024
Copy link
Member

@sgiehl sgiehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment for an improvement. But we could also change that as soon as it is needed though. Otherwise the PR looks good 👍

{
return $this->getDbSettings()->getEngine();
}

protected function getTableRowFormat(): string
{
return $this->getDbSettings()->getRowFormat();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe fully move the getRowFormat away from DbSettings into the Schema? That way we could also simply leave it away for TiDb in general. Or if ever needed use another one for TiDb if they start adding something around that.

@michalkleiner
Copy link
Contributor

Left a comment for an improvement. But we could also change that as soon as it is needed though. Otherwise the PR looks good 👍

We can address this separately if needed, I'll merge this now so that we can run rebase and another set of tests for TiDB.

@michalkleiner michalkleiner merged commit 315d29c into 5.x-dev Jun 5, 2024
@michalkleiner michalkleiner deleted the dev-18061 branch June 5, 2024 00:29
mneudert added a commit that referenced this pull request Sep 11, 2024
* Define consistent db settings access
* Define reusable table creation options
* Create all tables with explicit row format
* Define default collation for TiDB tables
* Use table create options from schema in migrations
* Create TiDB databases with default collation
* Update OmniFixture dump to include ROW_FORMAT
sgiehl pushed a commit that referenced this pull request Sep 13, 2024
* Define default collation for TiDB tables (#22271)

* Add configuration for database connection collation (#22564)

* Backport test changes from #22356
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Platform For Matomo platform changes that aren't impacting any of our APIs but improve the core itself. c: Tests & QA For issues related to automated tests or making it easier to QA & test issues. Needs Review PRs that need a code review Technical debt Issues the will help to reduce technical debt
Development

Successfully merging this pull request may close these issues.

3 participants