TTL Controller for Finished Resources
Kubernetes v1.21 [beta]
The TTL controller provides a TTL (time to live) mechanism to limit the lifetime of resource objects that have finished execution. The TTL controller currently handles only Jobs, and may be expanded to handle other resources that will finish execution, such as Pods and custom resources.
This feature is currently beta and enabled by default, and can be disabled via the
TTLAfterFinished feature gate in both kube-apiserver and kube-controller-manager.
The TTL controller only supports Jobs for now. A cluster operator can use this feature to clean
up finished Jobs (either Complete or Failed) automatically by specifying the
.spec.ttlSecondsAfterFinished field of a Job, as in the example below.
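For instance, a minimal Job manifest along the following lines sets a TTL of 100 seconds; the Job name, image, and command are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl              # illustrative Job name
spec:
  ttlSecondsAfterFinished: 100   # delete the Job 100 seconds after it finishes
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```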
The TTL controller will assume that a resource is eligible to be cleaned up
TTL seconds after the resource has finished, in other words, when the TTL has expired. When the
TTL controller cleans up a resource, it will delete it in a cascading fashion, that is, it will delete
its dependent objects together with it. Note that when the resource is deleted,
its lifecycle guarantees, such as finalizers, will be honored.
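For illustration, the Pods created by a Job carry an ownerReference back to that Job, which is what marks them as dependents removed by the cascading delete. A fragment of such a Pod's metadata might look like the following; the Job name and UID are placeholders:

```yaml
# Fragment of a Pod created by a Job. The ownerReference marks the Pod as a
# dependent of the Job, so it is deleted together with the Job.
metadata:
  ownerReferences:
  - apiVersion: batch/v1
    kind: Job
    name: pi-with-ttl                          # placeholder Job name
    uid: 00000000-0000-0000-0000-000000000000  # placeholder UID
    controller: true
    blockOwnerDeletion: true
```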
The TTL seconds can be set at any time. Here are some examples of setting the
.spec.ttlSecondsAfterFinished field of a Job:
- Specify this field in the resource manifest, so that a Job can be cleaned up automatically some time after it finishes.
- Set this field on existing, already finished resources, to adopt this new feature (see the patch sketch after this list).
- Use a mutating admission webhook to set this field dynamically at resource creation time. Cluster administrators can use this to enforce a TTL policy for finished resources.
- Use a mutating admission webhook to set this field dynamically after the resource has finished, and choose different TTL values based on resource status, labels, etc.
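As a sketch of the second approach, the field can be added to an already finished Job with a patch, for example `kubectl patch job <job-name> -p '{"spec":{"ttlSecondsAfterFinished":3600}}'` (the Job name and TTL value are illustrative). The patch merges the following into the Job's spec:

```yaml
# Illustrative fields merged into an existing, finished Job's spec;
# a TTL of 3600 seconds cleans the Job up one hour after it finished.
spec:
  ttlSecondsAfterFinished: 3600
```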
Updating TTL Seconds
Note that the TTL period, e.g. the
.spec.ttlSecondsAfterFinished field of Jobs,
can be modified after the resource is created or has finished. However, once a
Job becomes eligible to be deleted (when the TTL has expired), the system does not
guarantee that the Job will be kept, even if an update to extend the TTL
returns a successful API response.
Because the TTL controller uses timestamps stored in the Kubernetes resources to determine whether the TTL has expired or not, this feature is sensitive to time skew in the cluster, which may cause the TTL controller to clean up resource objects at the wrong time.
In Kubernetes, it's required to run NTP on all nodes (see #6159) to avoid time skew. Clocks aren't always correct, but the difference should be very small. Please be aware of this risk when setting a non-zero TTL.