Skip to content

Initial database migration is not atomic in 3.9.1 and can become wedged if interrupted #2074

@hfinucane

Description

@hfinucane

Bug Report

  • Concourse version: 3.9.1
  • Deployment type (BOSH/Docker/binary): binary

This appears to be a regression of #722. As a refresher, my Chef deployment of Concourse is also a torture test of initial migration safety, because it runs 'concourse web', and then pretty rapidly restarts that process. I'm seeing this in my 'integration tests' on fresh systems with a fresh db. On the Concourse side, I get

failed to migrate database: Dirty database version 1510262030. Fix and force version.

With verbose postgres logs, the problem seems like a straightforward repeat of #722- there are two separate transactions, one to update the schema_migrations table, one to apply the schema migrations, and if the second transaction is interrupted, the database is in an indeterminate state.

LOG:  execute <unnamed>: SELECT pg_try_advisory_lock($1)
DETAIL:  parameters: $1 = '6'
LOG:  statement: SELECT version FROM migration_version
ERROR:  relation "migration_version" does not exist at character 21
STATEMENT:  SELECT version FROM migration_version
LOG:  statement: SELECT CURRENT_DATABASE()
LOG:  execute <unnamed>: SELECT COUNT(1) FROM information_schema.tables WHERE table_name = $1 AND table_schema = (SELECT current_schema()) LIMIT 1
DETAIL:  parameters: $1 = 'schema_migrations'
LOG:  statement: CREATE TABLE "schema_migrations" (version bigint not null primary key, dirty boolean not null)
LOG:  execute <unnamed>: SELECT pg_try_advisory_lock($1)
DETAIL:  parameters: $1 = '2019668375'
LOG:  statement: SELECT version, dirty FROM "schema_migrations" LIMIT 1
LOG:  statement: BEGIN
LOG:  statement: TRUNCATE "schema_migrations"
LOG:  execute <unnamed>: INSERT INTO "schema_migrations" (version, dirty) VALUES ($1, $2)
DETAIL:  parameters: $1 = '1510262030', $2 = 't'
LOG:  statement: COMMIT
LOG:  statement: BEGIN;

    CREATE TYPE build_status AS ENUM (
          'pending',
        'started',
        'aborted',

... (omit the concourse schema)

          ALTER TABLE ONLY workers
              ADD CONSTRAINT workers_team_id_fkey FOREIGN KEY (team_id) REFERENCES teams(id) ON DELETE CASCADE;


        COMMIT;

LOG:  could not receive data from client: Connection reset by peer
LOG:  could not receive data from client: Connection reset by peer
LOG:  execute <unnamed>: SELECT pg_try_advisory_lock($1)
DETAIL:  parameters: $1 = '6'
LOG:  statement: SELECT version FROM migration_version
ERROR:  relation "migration_version" does not exist at character 21
STATEMENT:  SELECT version FROM migration_version
LOG:  statement: SELECT CURRENT_DATABASE()
LOG:  execute <unnamed>: SELECT COUNT(1) FROM information_schema.tables WHERE table_name = $1 AND table_schema = (SELECT current_schema()) LIMIT 1
DETAIL:  parameters: $1 = 'schema_migrations'
LOG:  execute <unnamed>: SELECT pg_try_advisory_lock($1)
DETAIL:  parameters: $1 = '2019668375'
LOG:  statement: SELECT version, dirty FROM "schema_migrations" LIMIT 1
LOG:  execute <unnamed>: SELECT pg_advisory_unlock($1)
DETAIL:  parameters: $1 = '2019668375'
LOG:  execute <unnamed>: SELECT pg_advisory_unlock($1)
DETAIL:  parameters: $1 = '6'
LOG:  could not receive data from client: Connection reset by peer

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions