Agent fails to cancel kubernetes jobs  #7739

@BitTheByte

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

The agent fails with a ConfigException while cancelling running Kubernetes jobs. When the agent itself runs inside a Kubernetes job, no ~/.kube/config file or other cluster context information is present, so the active cluster name cannot be looked up.

Reproduction

-

Error

01:52:27.718 | INFO    | prefect.agent - Found 1 flow runs awaiting cancellation.
01:52:28.155 | INFO    | prefect.agent - Killing kubernetes-job in-cluster-config:ingenious-oxpecker-r9vr4 for flow run '1d44c745-6d8d-4b3a-a338-6d89117c7da4'...
01:52:28.160 | ERROR   | prefect.agent - Encountered exception while killing infrastructure for flow run '1d44c745-6d8d-4b3a-a338-6d89117c7da4'. Flow run may not be cancelled.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/prefect/agent.py", line 285, in cancel_run
    await infrastructure.kill(flow_run.infrastructure_pid)
  File "/usr/local/lib/python3.10/dist-packages/prefect/infrastructure/kubernetes.py", line 282, in kill
    current_cluster = self._get_active_cluster_name()
  File "/usr/local/lib/python3.10/dist-packages/prefect/infrastructure/kubernetes.py", line 347, in _get_active_cluster_name
    _, active_context = kubernetes.config.list_kube_config_contexts()
  File "/usr/local/lib/python3.10/dist-packages/kubernetes/config/kube_config.py", line 789, in list_kube_config_contexts
    loader = _get_kube_config_loader(filename=config_file)
  File "/usr/local/lib/python3.10/dist-packages/kubernetes/config/kube_config.py", line 770, in _get_kube_config_loader
    raise ConfigException(
kubernetes.config.config_exception.ConfigException: Invalid kube-config file. No configuration found.
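
The traceback above shows the lookup of the active kubeconfig context raising when no kubeconfig exists. A minimal sketch of one possible workaround follows; it is not Prefect's actual fix. It assumes that falling back to a fixed label is acceptable when no kubeconfig context can be listed. The function name `get_active_cluster_name`, the constant `IN_CLUSTER_LABEL`, and the injected `list_contexts` / `config_error` parameters are hypothetical; the label value is taken from the prefix of the infrastructure PID in the log.

```python
from typing import Any, Callable, Dict, List, Tuple, Type

# Hypothetical fallback label, chosen to match the prefix of the
# infrastructure PID in the log ("in-cluster-config:ingenious-oxpecker-r9vr4").
IN_CLUSTER_LABEL = "in-cluster-config"

# Shape returned by a context lister: (all_contexts, active_context).
ContextLister = Callable[[], Tuple[List[Dict[str, Any]], Dict[str, Any]]]


def get_active_cluster_name(
    list_contexts: ContextLister,
    config_error: Type[BaseException] = Exception,
) -> str:
    """Return the active cluster name, or a fallback label if no kubeconfig exists.

    `list_contexts` and `config_error` are injected so the sketch runs without
    the kubernetes client installed.
    """
    try:
        _, active_context = list_contexts()
    except config_error:
        # No kubeconfig available (e.g. running inside a cluster):
        # use the fallback label instead of letting the exception propagate.
        return IN_CLUSTER_LABEL
    return active_context["context"]["cluster"]
```

With the kubernetes client installed, the call could look like `get_active_cluster_name(kubernetes.config.list_kube_config_contexts, kubernetes.config.ConfigException)`. Whether the fallback label should match the one recorded when the job was created is an open design question this sketch does not settle.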

Versions

Version:             2.7.0
API version:         0.8.3
Python version:      3.10.8
Git commit:          c2833339
Built:               Thu, Dec 1, 2022 4:03 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         hosted

Additional context

No response

Labels: bug