TOB-K8S-007: Log rotation is not atomic  #81132

@cji

This issue was reported in the Kubernetes Security Audit Report

Description
Kubelets use a log to store metadata about the container system, such as readiness status. As is normal for logging, kubelets rotate these logs under certain conditions:

// rotateLatestLog rotates latest log without compression, so that container can still write
// and fluentd can finish reading.
func (c *containerLogManager) rotateLatestLog(id, log string) error {
    timestamp := c.clock.Now().Format(timestampFormat)
    rotated := fmt.Sprintf("%s.%s", log, timestamp)
    if err := os.Rename(log, rotated); err != nil {
        return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
    }
    if err := c.runtimeService.ReopenContainerLog(id); err != nil {
        // Rename the rotated log back, so that we can try rotating it again
        // next round.
        // If kubelet gets restarted at this point, we'll lose original log.
        if renameErr := os.Rename(rotated, log); renameErr != nil {
            // This shouldn't happen.
            // Report an error if this happens, because we will lose original
            // log.
            klog.Errorf("Failed to rename rotated log %q back to %q: %v, reopen container log error: %v", rotated, log, renameErr, err)
        }
        return fmt.Errorf("failed to reopen container log %q: %v", id, err)
    }
    return nil
}

Figure 22.1: One of the log rotation mechanisms within kubelet

However, if the kubelet is restarted during rotation (after the log has been renamed but before the container log has been successfully reopened or the rename undone), the logs and their contents could be lost. The impact on the end user ranges from missing threat-hunting data to errors that are harder to discover.

Exploit Scenario
Alice is running a Kubernetes cluster for her organization. Eve has a position sufficient to watch the logs, and understands when log rotation will occur. Eve then faults a kubelet while rotation is in progress, ensuring that the logs are removed.

Recommendation
Short term, move to a copy-then-rename approach. This will ensure that logs aren’t lost from simple rename mishaps, and that at worst they are named under a transient name.

Long term, shift away from log rotation and move towards persistent logs regardless of location. This would mean that logs would be written to in linear order, and a new log would be created whenever rotation is required.

Anything else we need to know?:

See #81146 for current status of all issues created from these findings.

The vendor gave this issue an ID of TOB-K8S-007 and it was finding 24 of the report.

The vendor considers this issue Low Severity.

To view the original finding, begin on page 65 of the Kubernetes Security Review Report

Environment:

  • Kubernetes version: 1.13.4

Labels: area/security, help wanted, kind/bug, lifecycle/frozen, sig/node, triage/accepted, wg/security-audit
