Partitions are not bootstrapped after purge operation #35650

@saig0

Description

I'm running my process tests with CPT on a local Camunda 8 Run (8.8.0-alpha6) configured as a remote runtime.

This works fine with a fresh Camunda 8 Run distribution. However, after I restart the distribution, my process tests fail when deleting the data between the test cases.

This is a major issue for running process tests with CPT on a remote runtime.


I can reproduce the behavior without CPT by purging the data of the Camunda 8 Run distribution via HTTP request and checking the partitions in the topology. After a restart, the next purge leaves the broker without any partitions in the topology.


Workaround
When I delete the data directory in the camunda-zeebe-* directory, the partitions are bootstrapped successfully after the purge, and the process tests run again.
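
As a minimal sketch of this workaround, the following Python helper locates and removes the data directory. The `c8run_root` parameter, the function names, and the `camunda-zeebe-*/data` glob are assumptions based on the directory names mentioned above. It also assumes the distribution is stopped before deletion, which the report does not state explicitly.

```python
import shutil
from pathlib import Path


def find_data_dirs(c8run_root):
    """Return all data directories below camunda-zeebe-* folders (assumed layout)."""
    return sorted(
        path for path in Path(c8run_root).glob("camunda-zeebe-*/data") if path.is_dir()
    )


def delete_data_dirs(c8run_root, dry_run=True):
    """List the data directories and, unless dry_run is set, delete them."""
    targets = find_data_dirs(c8run_root)
    if not dry_run:
        for target in targets:
            # Removes all persisted broker data in this directory.
            shutil.rmtree(target)
    return targets
```

Calling `delete_data_dirs(".")` from the Camunda 8 Run directory only lists what would be removed; pass `dry_run=False` to actually delete.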

Steps to reproduce

  1. Check the topology (GET http://localhost:8080/v2/topology)
  2. Purge the data and remember the change ID (POST http://localhost:9600/actuator/cluster/purge)
  3. Check the topology and verify that the change ID is successfully applied (GET http://localhost:8080/v2/topology)
  4. Restart the distribution
  5. Check the topology (GET http://localhost:8080/v2/topology)
  6. Purge the data and remember the change ID (POST http://localhost:9600/actuator/cluster/purge)
  7. Check the topology and verify that the change ID is successfully applied (GET http://localhost:8080/v2/topology)

Then, verify that the broker doesn't have any partitions in the topology.
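
The steps above can be sketched as a small Python script using only the standard library. The two URLs are the ones from the steps. The function names, the polling loop, and the assumption that the purge endpoint accepts a POST with an empty body are illustrative and not confirmed by the report. The restart in step 4 still has to be done manually between two calls of `purge_and_check()`.

```python
import json
import time
import urllib.request

TOPOLOGY_URL = "http://localhost:8080/v2/topology"
PURGE_URL = "http://localhost:9600/actuator/cluster/purge"


def get_topology(url=TOPOLOGY_URL):
    """Fetch the cluster topology as a dict."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def purge(url=PURGE_URL):
    """Trigger a purge and return the change ID from the response."""
    # Assumption: an empty request body is sufficient.
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["changeId"]


def change_applied(topology, change_id):
    """True if the topology reports change_id as the last completed change."""
    # The topology reports the ID as a string, the purge response as a number.
    return str(topology.get("lastCompletedChangeId")) == str(change_id)


def brokers_without_partitions(topology):
    """Return the node IDs of brokers whose partition list is empty."""
    return [
        broker["nodeId"]
        for broker in topology.get("brokers", [])
        if not broker.get("partitions")
    ]


def purge_and_check(attempts=30, poll_seconds=1.0):
    """Purge, wait until the change is applied, then report empty brokers."""
    change_id = purge()
    for _ in range(attempts):
        topology = get_topology()
        if change_applied(topology, change_id):
            return brokers_without_partitions(topology)
        time.sleep(poll_seconds)
    raise TimeoutError(f"change {change_id} was not applied in time")
```

In the behavior described in this report, `purge_and_check()` would return an empty list on the first run and `[0]` after the restart.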

Current behavior

First run

Topology before purge:

{
  "brokers": [
    {
      "nodeId": 0,
      "host": "127.0.1.1",
      "port": 26501,
      "partitions": [
        {
          "partitionId": 1,
          "role": "leader",
          "health": "healthy"
        }
      ],
      "version": "8.8.0-alpha6"
    }
  ],
  "clusterSize": 1,
  "partitionsCount": 1,
  "replicationFactor": 1,
  "gatewayVersion": "8.8.0-alpha6",
  "lastCompletedChangeId": "-1"
}

Purge response:

{
  "changeId": 2,
  "currentTopology": [
    {
      "id": 0,
      "state": "ACTIVE",
      "version": 0,
      "lastUpdatedAt": "0001-01-01T00:00:00.000+0000",
      "partitions": [
        {
          "id": 1,
          "state": "ACTIVE",
          "priority": 1,
          "config": {
            "exporting": {
              "exporters": [
                {
                  "id": "elasticsearch",
                  "state": "ENABLED"
                },
                {
                  "id": "camundaExporter",
                  "state": "ENABLED"
                }
              ]
            }
          }
        }
      ]
    }
  ],
  "plannedChanges": [
    {
      "operation": "PARTITION_LEAVE",
      "brokerId": 0,
      "partitionId": 1,
      "brokers": []
    },
    {
      "operation": "DELETE_HISTORY",
      "brokers": []
    },
    {
      "operation": "PARTITION_BOOTSTRAP",
      "brokerId": 0,
      "partitionId": 1,
      "priority": 1,
      "brokers": []
    }
  ],
  "expectedTopology": [
    {
      "id": 0,
      "state": "ACTIVE",
      "version": 4,
      "lastUpdatedAt": "2025-07-21T09:08:31.991+0000",
      "partitions": [
        {
          "id": 1,
          "state": "ACTIVE",
          "priority": 1,
          "config": {
            "exporting": {
              "exporters": [
                {
                  "id": "elasticsearch",
                  "state": "ENABLED"
                },
                {
                  "id": "camundaExporter",
                  "state": "ENABLED"
                }
              ]
            }
          }
        }
      ]
    }
  ]
}

Topology after purge:

{
  "brokers": [
    {
      "nodeId": 0,
      "host": "127.0.1.1",
      "port": 26501,
      "partitions": [
        {
          "partitionId": 1,
          "role": "leader",
          "health": "healthy"
        }
      ],
      "version": "8.8.0-alpha6"
    }
  ],
  "clusterSize": 1,
  "partitionsCount": 1,
  "replicationFactor": 1,
  "gatewayVersion": "8.8.0-alpha6",
  "lastCompletedChangeId": "2"
}

Second run

Topology before purge:

{
  "brokers": [
    {
      "nodeId": 0,
      "host": "127.0.1.1",
      "port": 26501,
      "partitions": [
        {
          "partitionId": 1,
          "role": "leader",
          "health": "healthy"
        }
      ],
      "version": "8.8.0-alpha6"
    }
  ],
  "clusterSize": 1,
  "partitionsCount": 1,
  "replicationFactor": 1,
  "gatewayVersion": "8.8.0-alpha6",
  "lastCompletedChangeId": "2"
}

Purge response:

{
  "changeId": 4,
  "currentTopology": [
    {
      "id": 0,
      "state": "ACTIVE",
      "version": 4,
      "lastUpdatedAt": "2025-07-21T09:08:32.707+0000",
      "partitions": [
        {
          "id": 1,
          "state": "ACTIVE",
          "priority": 1,
          "config": {
            "exporting": {
              "exporters": [
                {
                  "id": "elasticsearch",
                  "state": "ENABLED"
                },
                {
                  "id": "camundaExporter",
                  "state": "ENABLED"
                }
              ]
            }
          }
        }
      ]
    }
  ],
  "plannedChanges": [
    {
      "operation": "PARTITION_LEAVE",
      "brokerId": 0,
      "partitionId": 1,
      "brokers": []
    },
    {
      "operation": "DELETE_HISTORY",
      "brokers": []
    },
    {
      "operation": "PARTITION_BOOTSTRAP",
      "brokerId": 0,
      "partitionId": 1,
      "priority": 1,
      "brokers": []
    }
  ],
  "expectedTopology": [
    {
      "id": 0,
      "state": "ACTIVE",
      "version": 8,
      "lastUpdatedAt": "2025-07-21T09:10:54.843+0000",
      "partitions": [
        {
          "id": 1,
          "state": "ACTIVE",
          "priority": 1,
          "config": {
            "exporting": {
              "exporters": [
                {
                  "id": "elasticsearch",
                  "state": "ENABLED"
                },
                {
                  "id": "camundaExporter",
                  "state": "ENABLED"
                }
              ]
            }
          }
        }
      ]
    }
  ]
}

Topology after purge:

{
  "brokers": [
    {
      "nodeId": 0,
      "host": "127.0.1.1",
      "port": 26501,
      "partitions": [],
      "version": "8.8.0-alpha6"
    }
  ],
  "clusterSize": 1,
  "partitionsCount": 1,
  "replicationFactor": 1,
  "gatewayVersion": "8.8.0-alpha6",
  "lastCompletedChangeId": "4"
}
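
The mismatch in the second run can be expressed as a check: the purge response's `expectedTopology` lists partition 1 as `ACTIVE` on broker 0, while the topology after the purge reports an empty `partitions` list. The following Python sketch compares the two payloads. The helper name `missing_partitions` is hypothetical, and the field names are taken from the JSON shown above.

```python
def missing_partitions(purge_response, topology):
    """Map broker ID to the expected ACTIVE partitions absent from the topology."""
    # Partitions each broker actually reports in /v2/topology.
    actual = {
        broker["nodeId"]: {p["partitionId"] for p in broker.get("partitions", [])}
        for broker in topology.get("brokers", [])
    }
    missing = {}
    for broker in purge_response.get("expectedTopology", []):
        expected = {
            p["id"] for p in broker.get("partitions", []) if p.get("state") == "ACTIVE"
        }
        gap = expected - actual.get(broker["id"], set())
        if gap:
            missing[broker["id"]] = sorted(gap)
    return missing
```

An empty result means every partition the purge planned to bootstrap is present in the topology.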

Expected behavior

After a restart of the distribution, the broker bootstraps the partitions successfully after a purge operation.

Environment

Self-Managed (SM)

Version

  • camunda-process-test-spring:8.8.0-alpha6
  • Camunda 8 Run: 8.8.0-alpha6

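Root cause

Not yet confirmed. The issue is related to the purge operation.

The data persisted from the previous run seems to interfere with the purge: the problem only appears after a restart, and it disappears once the data directory is deleted (see Workaround).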

Solution ideas

No response

Dev -> QA handover

  • Resources:
  • Versions to validate:
  • Release version (in which version this feature will be released):

Links

Camunda 8 Run log file (second run): camunda.log
