Skip to content

Conversation

locker
Copy link
Member

@locker locker commented May 25, 2023

Description

Executing DDL operations with an old (not upgraded) system schema is dangerous and might result in unexpected breakages. So we decided to explicitly forbid all DDL operations with an old system schema until box.schema.upgrade() is called. Note, one can still call box.schema functions with an old schema provided they do nothing, for example, if an object is created with the if_not_exists flag and the object with same id already exists:

box.schema.create_space('test', {if_not_exists = true})

Otherwise an attempt to create a space with an old schema will raise an error like shown below:

tarantool> box.schema.space.create('test')
---
- error: Your schema version is 1.6.8 while Tarantool
    3.0.0-entrypoint-262-g3eaba1cef686 requires a more recent
    schema version. Please, consider using box.schema.upgrade().
...

Implementation details

We disable DDL by patching the existing on_replace_dd_system_space trigger callback installed for each system space so that now it raises an error in case the current schema version is less than the most recent one known to this build. Since to perform a schema upgrade we need to execute DDL, we suppress the error for the fiber that is currently running a schema upgrade. To achieve that, the upgrade script calls box_schema_upgrade_begin and box_schema_upgrade_end before starting and after completing a schema upgrade. The functions keep track of the fiber that is currently running a schema upgrade so that we can allow all DDL operations for it. We also allow DDL during recovery so that we can replay DDL statements written to the WAL.

Since there may be a bug in the box.schema.upgrade implementation, we export box.internal.run_schema_upgrade, which runs the given function as a schema upgrade script (allowing DDL). The user may use this function to recover after a schema upgrade failure.

Note about the tests

A test server instance started by luatest grants permissions to the guest user so that luatest can execute commands on it. It means that if a test uses a generated snap file committed to the repository for a test server instance, it will fail because granting permissions is a DDL operation. To prevent this, we have to regenerate snap files so that they contain all required permissions. This works because a test server instance grants permissions with the if_not_exists flag.

The problem is that it isn't easy to regenerate the snap files for the following tests because there's no generator script:

  • test/box-luatest/gh_6794_recover_nonmatching_xlogs_test.lua
  • test/box-luatest/gh_7974_force_recovery_bugs_test.lua

So we temporarily disable these tests and file tickets to fix them, see #8701 #8702.

Other notes:

  • We drop test/box-luatest/upgrade/2.9.1 and make the test using it use test/box-luatest/upgrade/2.10.0 instead. We do this because 2.9.1 was never released and the earliest Tarantool version using the 2.9.1 schema version is 2.10.0. This shouldn't affect the test anyhow.
  • We drop the part of the user_auth_history_last_modified_upgrade test that checks that creating users/roles with an old schema works fine because this is forbidden now.
  • We wrap the code that creates a space with an old schema in the downgrade test in box.internal.run_schema_upgrade. Even though it's unsupported now, we still need to check that space creation works after a downgrade.

Closes #7149

@locker locker requested a review from a team as a code owner May 25, 2023 12:50
@coveralls
Copy link

Coverage Status

Coverage: 85.763% (-0.02%) from 85.788% when pulling 5f7fa26 on locker:gh-7149-forbid-ddl-until-box-schema-upgrade into 9f9142d
on tarantool:master
.

@locker locker requested a review from alyapunov May 25, 2023 13:43
Copy link
Contributor

@alyapunov alyapunov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,
Nit: upgrde -> upgrade in the commit message.

@alyapunov alyapunov assigned locker and unassigned alyapunov May 26, 2023
** Implementation details **

We disable DDL by patching the existing on_replace_dd_system_space
trigger callback installed for each system space so that now it raises
an error in case the current schema version is less than the most
recent one known to this build. Since to perform a schema upgrade
we need to execute DDL, we suppress the error for the fiber that is
currently running a schema upgrade. To achieve that, the upgrade script
calls box_schema_upgrade_begin and box_schema_upgrade_end before
starting and after completing a schema upgrade. The functions keep track
of the fiber that is currently running a schema upgrade so that we can
allow all DDL operations for it. We also allow DDL during recovery so
that we can replay DDL statements written to the WAL.

Since there may be a bug in the `box.schema.upgrade` implementation,
we export `box.internal.run_schema_upgrade`, which runs the given
function as a schema upgrade script (allowing DDL). The user may use
this function to recover after a schema upgrade failure.

** Note about the tests **

A test server instance started by luatest grants permissions to the
guest user so that luatest can execute commands on it. It means that if
a test uses a generated snap file committed to the repository for a test
server instance, it will fail because granting permissions is a DDL
operation. To prevent this, we have to regenerate snap files so that
they contain all required permissions. This works because a test server
instance grants permissions with the `if_not_exists` flag.

The problem is that it isn't easy to regenerate the snap files for the
following tests because there's no generator script:
 - `test/box-luatest/gh_6794_recover_nonmatching_xlogs_test.lua`
 - `test/box-luatest/gh_7974_force_recovery_bugs_test.lua`

So we temporarily disable these tests and file tickets to fix them.

Other notes:
 - We drop `test/box-luatest/upgrade/2.9.1` and make the test using it
   use `test/box-luatest/upgrade/2.10.0` instead. We do this because
   2.9.1 was never released and the earliest Tarantool version using the
   2.9.1 schema version is 2.10.0. This shouldn't affect the test
   anyhow.
 - We drop the part of the `user_auth_history_last_modified_upgrade`
   test that checks that creating users/roles with an old schema works
   fine because this is forbidden now.
 - We wrap the code that creates a space with an old schema in the
   downgrade test in `box.internal.run_schema_upgrade`. Even though it's
   unsupported now, we still need to check that space creation works
   after a downgrade.

Closes tarantool#7149

@TarantoolBot document
Title: Document that DDL is disabled with an old system schema

Executing DDL operations with an old (not upgraded) system schema is
dangerous and might result in unexpected breakages. So we decided to
explicitly forbid all DDL operations with an old system schema until
`box.schema.upgrade()` is called.  Note, one can still call `box.schema`
functions with an old schema provided they do nothing, for example, if
an object is created with the `if_not_exists` flag and the object with
same id already exists:

```lua
box.schema.create_space('test', {if_not_exists = true})
```

Otherwise an attempt to create a space with an old schema will raise
an error like shown below:

```yaml
tarantool> box.schema.space.create('test')
---
- error: Your schema version is 1.6.8 while Tarantool
    3.0.0-entrypoint-262-g3eaba1cef686 requires a more recent
    schema version. Please, consider using box.schema.upgrade().
...
```
@locker locker force-pushed the gh-7149-forbid-ddl-until-box-schema-upgrade branch from 5f7fa26 to 8163305 Compare May 26, 2023 10:34
@locker locker added the full-ci Enables all tests for a pull request label May 26, 2023
@locker locker merged commit 97c2c9a into tarantool:master May 26, 2023
@locker locker deleted the gh-7149-forbid-ddl-until-box-schema-upgrade branch May 26, 2023 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full-ci Enables all tests for a pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Forbid DDL before box.schema.upgrade is called
5 participants