
Resolve Stuck Namespaces in Kubernetes: A Step-by-Step Tutorial


When you delete a Kubernetes namespace, it can occasionally get stuck in the Terminating state. This usually happens because resources inside the namespace have not been cleaned up, often because of finalizers. This tutorial walks through resolving the issue step by step, using a namespace named example-ns as the example.

Available Video

A video accompanies this blog post:

https://devoriales.com/video/929047976/namespace-stucked

What are "Finalizers" in Kubernetes

In Kubernetes, finalizers are mechanisms that prevent resources from being immediately deleted, allowing for clean-up operations to be performed. They are specified in the metadata.finalizers field of an object's manifest. When an object with finalizers is marked for deletion, it enters a "terminating" state, but the actual deletion is blocked until all finalizers are removed. This ensures that dependent or related operations, such as releasing external resources, can be completed safely before the object is fully deleted from the system.
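To see this state on a concrete object, you can print its deletion timestamp next to its finalizers. The following is a minimal sketch; the function name show_deletion_state is made up for illustration, and it assumes kubectl is already configured for your cluster. A non-empty timestamp together with a non-empty finalizer list means the object is waiting on finalization.

```shell
# Print the deletion timestamp and the finalizers of one object, one per line.
# An empty first line means the object has not been marked for deletion.
# Usage: show_deletion_state <resource-type> <resource-name> <namespace>
show_deletion_state() {
  kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}
```

For example: show_deletion_state persistentvolumeclaim <resource-name> example-ns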

A Step-by-Step Tutorial

First, let's confirm that our namespace, example-ns, is indeed stuck in the Terminating state:

kubectl get namespace example-ns

Output example:

NAME        STATUS        AGE
example-ns  Terminating   48d

Before you begin, describe the namespace to see why it is stuck in the Terminating state:

kubectl describe namespace <namespace-name>

For instance, I got the following output for my stuck namespace:

Output:

Status:       Terminating
Conditions:
  Type                                         Status  LastTransitionTime               Reason                Message
  ----                                         ------  ------------------               ------                -------
  NamespaceDeletionDiscoveryFailure            True    Fri, 29 Mar 2024 09:28:54 +0100  DiscoveryFailed       Discovery failed for some groups, 1 failing: unable to retrieve the complete list of server APIs: spdx.softwarecomposition.kubescape.io/v1beta1: stale GroupVersion discovery: spdx.softwarecomposition.kubescape.io/v1beta1
  NamespaceDeletionGroupVersionParsingFailure  False   Fri, 29 Mar 2024 09:28:55 +0100  ParsedGroupVersions   All legacy kube types successfully parsed
  NamespaceDeletionContentFailure              False   Fri, 29 Mar 2024 09:28:55 +0100  ContentDeleted        All content successfully deleted, may be waiting on finalization
  NamespaceContentRemaining                    True    Fri, 29 Mar 2024 09:28:55 +0100  SomeResourcesRemain   Some resources are remaining: persistentvolumeclaims. has 1 resource instances, pods. has 1 resource instances
  NamespaceFinalizersRemaining                 True    Fri, 29 Mar 2024 09:28:55 +0100  SomeFinalizersRemain  Some content in the namespace has finalizers remaining: kubernetes.io/pvc-protection in 1 resource instances
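In a table like this, only the conditions whose Status is True point at something that is blocking deletion. A small filter can pull those out. This is a sketch that assumes the column layout shown above (condition type first, status second); the function name true_conditions is hypothetical.

```shell
# Read `kubectl describe namespace` output on stdin and print the type of
# every condition whose Status column is "True".
true_conditions() {
  awk '$2 == "True" { print $1 }'
}
```

Usage: kubectl describe namespace example-ns | true_conditions

For the output above, this prints NamespaceDeletionDiscoveryFailure, NamespaceContentRemaining, and NamespaceFinalizersRemaining, one per line.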

Step 1: Identify and Attempt to Remove Stuck Resources

  1. List All Resources in example-ns

    Begin by identifying all resources that might be preventing the namespace from being deleted.

❗You might expect kubectl get all -n example-ns to list every resource in the namespace, but it does not. It only returns a fixed set of common resources such as pods, services, and deployments. It does not return ConfigMaps, Secrets, PersistentVolumeClaims, or custom resources defined by CRDs, for instance.

So we need a command that lists every namespace-scoped resource type:

kubectl get $(kubectl api-resources --verbs=list --namespaced -o name | awk NF | paste -sd "," -) -n example-ns --ignore-not-found

The output of this command may be pretty massive.
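If the combined output is hard to read, one option is to query each resource type separately so that every line is labelled with its kind. The sketch below is one way to do that; the function name list_namespace_resources is made up for illustration, and it assumes kubectl is configured for the cluster.

```shell
# List every object in a namespace, one resource type at a time.
# Usage: list_namespace_resources <namespace>
list_namespace_resources() {
  ns="$1"
  kubectl api-resources --verbs=list --namespaced -o name |
    while read -r resource; do
      # Skip blank lines in the resource list.
      [ -n "$resource" ] || continue
      # --show-kind prefixes each object name with its resource type.
      kubectl get "$resource" -n "$ns" --ignore-not-found --show-kind --no-headers
    done
}
```

Usage: list_namespace_resources example-ns

This makes one API call per resource type, so it is slower than the single command above, but an error for one type is printed next to that type instead of being mixed into one large result.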

Attempt Graceful Resource Deletion

Next, try deleting the resources you identified as potentially blocking the namespace:

kubectl delete <resource-type> <resource-name> -n example-ns

If the resource still isn't deleted, a finalizer may be blocking it.

Check whether the resource has any finalizers:

kubectl get <resource-type> <resource-name> -n example-ns -o json | jq '.metadata.finalizers'

Step 2: Forcefully Remove Finalizers

Remove finalizers with caution. Clearing them skips the clean-up they were guarding, which can leave orphaned resources behind:

kubectl patch <resource-type> <resource-name> -n example-ns -p '{"metadata":{"finalizers":[]}}' --type=merge

❗Alternatively, open the resource with kubectl edit and delete the finalizers entries by hand.
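To reduce the risk of stripping finalizers from an object that is still in use, the patch can be guarded so that it only runs on objects already marked for deletion. This is a minimal sketch under that assumption; the function name clear_finalizers_if_terminating is hypothetical, and the patch itself is the same one shown above.

```shell
# Clear finalizers only when the object already has a deletion timestamp.
# Usage: clear_finalizers_if_terminating <resource-type> <resource-name> <namespace>
clear_finalizers_if_terminating() {
  deleted_at=$(kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{.metadata.deletionTimestamp}')
  if [ -z "$deleted_at" ]; then
    # No deletion timestamp: the object is live, so leave its finalizers alone.
    echo "skip: $1/$2 is not marked for deletion" >&2
    return 1
  fi
  kubectl patch "$1" "$2" -n "$3" --type=merge \
    -p '{"metadata":{"finalizers":[]}}'
}
```

Usage: clear_finalizers_if_terminating <resource-type> <resource-name> example-ns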

Step 3: Forcibly Remove the Namespace

If example-ns is still stuck, interact with the Kubernetes API directly.

  1. Start a Kubernetes API Server Proxy:

    kubectl proxy &
    
  2. Send a Request to Remove Namespace Finalizers

    Save the namespace definition to a file, clear its finalizers, and send the file back to the API:

    kubectl get namespace example-ns -o json > example-ns.json
    

    Edit the file and remove the entries from the finalizers list under spec, leaving it empty:

    "finalizers": []
    

    Assuming the modified file is named example-ns.json, use curl to send it through the proxy to the namespace's finalize endpoint:

    curl -X PUT http://127.0.0.1:8001/api/v1/namespaces/example-ns/finalize -H "Content-Type: application/json" --data-binary "@example-ns.json"

  3. Finally, verify that example-ns has been deleted.

    kubectl get namespace example-ns
    

🚀 If the command reports that the namespace was not found (a NotFound error), it has been successfully removed.

❗Don't forget to kill the proxy process that we started earlier.
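As an alternative to the proxy and curl steps above, kubectl can send the edited file to the finalize endpoint itself. The following is a minimal sketch, assuming example-ns.json has already been edited as described; the function name finalize_namespace is made up for illustration.

```shell
# Send an edited namespace manifest to the namespace's finalize subresource.
# `kubectl replace --raw` issues a PUT with the current kubeconfig credentials,
# so no local proxy is needed.
# Usage: finalize_namespace <namespace> <edited-json-file>
finalize_namespace() {
  kubectl replace --raw "/api/v1/namespaces/$1/finalize" -f "$2"
}
```

Usage: finalize_namespace example-ns example-ns.json

With this variant there is no background proxy process to clean up afterwards.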

Summary

This tutorial walked you through diagnosing and resolving a namespace stuck in the Terminating state, using example-ns as an illustrative example. It started with identifying and deleting leftover resources, moved on to removing finalizers manually, and ended with calling the Kubernetes API directly to force-delete the namespace.
