Pod Security Standards
Security settings for Pods are typically applied by using security contexts. Security Contexts allow for the definition of privilege and access controls on a per-Pod basis.
The enforcement and policy-based definition of cluster requirements for security contexts has previously been achieved using Pod Security Policy. A Pod Security Policy is a cluster-level resource that controls security-sensitive aspects of the Pod specification.
However, numerous means of policy enforcement have arisen that augment or replace the use of PodSecurityPolicy. The intent of this page is to detail recommended Pod security profiles, decoupled from any specific instantiation.
There is an immediate need for base policy definitions to broadly cover the security spectrum. These should range from highly restricted to highly flexible:
- Privileged - Unrestricted policy, providing the widest possible level of permissions. This policy allows for known privilege escalations.
- Baseline - Minimally restrictive policy while preventing known privilege escalations. Allows the default (minimally specified) Pod configuration.
- Restricted - Heavily restricted policy, following current Pod hardening best practices.
The Privileged policy is purposely open and entirely unrestricted. This type of policy is typically aimed at system- and infrastructure-level workloads managed by privileged, trusted users.
The Privileged policy is defined by an absence of restrictions. For allow-by-default enforcement mechanisms (such as Gatekeeper), the Privileged profile may be an absence of applied constraints rather than an instantiated policy. In contrast, for a deny-by-default mechanism (such as Pod Security Policy), the Privileged policy should enable all controls (that is, disable all restrictions).
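The contrast between the two kinds of mechanism can be sketched in a few lines. This is only an illustration of the idea, not how Gatekeeper or Pod Security Policy are implemented; the function names, the `uses_host_network` constraint, and the dict-shaped Pod are all hypothetical.

```python
def admit_allow_by_default(pod, constraints):
    """Admit the Pod unless some constraint rejects it."""
    return not any(rejects(pod) for rejects in constraints)


def admit_deny_by_default(pod, policies):
    """Admit the Pod only if some policy explicitly permits it."""
    return any(permits(pod) for permits in policies)


def uses_host_network(pod):
    # Hypothetical constraint: rejects Pods that share the host network.
    return bool(pod.get("spec", {}).get("hostNetwork"))


def permit_everything(pod):
    # Hypothetical policy with every control enabled.
    return True


pod = {"spec": {"hostNetwork": True, "containers": [{"name": "agent"}]}}

print(admit_allow_by_default(pod, []))                   # True
print(admit_allow_by_default(pod, [uses_host_network]))  # False
print(admit_deny_by_default(pod, []))                    # False
print(admit_deny_by_default(pod, [permit_everything]))   # True
```

Under these assumptions, the Privileged profile is the empty constraint list in the first model and an explicit permit-everything policy in the second.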
The Baseline policy is aimed at ease of adoption for common containerized workloads while preventing known privilege escalations. This policy is targeted at application operators and developers of non-critical applications. The following controls should be enforced:
Sharing the host namespaces must be disallowed.
Allowed Values: false
Privileged Pods disable most security mechanisms and must be disallowed.
Allowed Values: false, undefined/nil
Adding NET_RAW or capabilities beyond the default set must be disallowed.
Allowed Values: empty (or restricted to a known list)
HostPath volumes must be forbidden.
Allowed Values: undefined/nil
HostPorts should be disallowed, or at minimum restricted to a known list.
Allowed Values: 0, undefined (or restricted to a known list)
On supported hosts, the 'runtime/default' AppArmor profile is applied by default. The baseline policy should prevent overriding or disabling the default AppArmor profile, or restrict overrides to an allowed set of profiles.
Allowed Values: 'runtime/default', undefined
Setting the SELinux type is restricted, and setting a custom SELinux user or role option is forbidden.
Allowed Values: undefined/empty
/proc Mount Type
The default /proc masks are set up to reduce attack surface, and should be required.
Allowed Values: undefined/nil, 'Default'
Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed "safe" subset.
A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node.
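As a rough sketch, several of the Baseline controls above can be expressed as checks over a Pod manifest loaded into a Python dict. The field names (`hostNetwork`, `securityContext.privileged`, `ports[*].hostPort`, and so on) come from the Kubernetes Pod API rather than from this text, and the function name and `allowed_sysctls` parameter are hypothetical. The AppArmor and SELinux controls are left out for brevity, and the "restricted to a known list" variants are not modeled.

```python
def baseline_violations(pod, allowed_sysctls=frozenset()):
    """List the ways a Pod manifest (as a dict) breaks the Baseline controls."""
    spec = pod.get("spec") or {}
    found = []

    # Host namespaces: each flag must be false or unset.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field):
            found.append(f"spec.{field} is set")

    # HostPath volumes: the hostPath key must be absent or nil.
    for volume in spec.get("volumes") or []:
        if volume.get("hostPath") is not None:
            found.append(f"volume {volume.get('name')} uses hostPath")

    # Sysctls: only names from the caller-supplied safe set are accepted.
    pod_sc = spec.get("securityContext") or {}
    for sysctl in pod_sc.get("sysctls") or []:
        if sysctl.get("name") not in allowed_sysctls:
            found.append(f"sysctl {sysctl.get('name')} is not in the safe set")

    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        name = container.get("name")
        sc = container.get("securityContext") or {}
        if sc.get("privileged"):
            found.append(f"container {name} is privileged")
        if (sc.get("capabilities") or {}).get("add"):
            found.append(f"container {name} adds capabilities")
        if sc.get("procMount") not in (None, "Default"):
            found.append(f"container {name} changes the /proc mount type")
        for port in container.get("ports") or []:
            if port.get("hostPort"):  # 0 and unset are both acceptable
                found.append(f"container {name} binds hostPort {port['hostPort']}")
    return found


default_pod = {"spec": {"containers": [{"name": "app", "image": "nginx"}]}}
risky_pod = {
    "spec": {
        "hostNetwork": True,
        "volumes": [{"name": "logs", "hostPath": {"path": "/var/log"}}],
        "containers": [
            {
                "name": "app",
                "securityContext": {"privileged": True},
                "ports": [{"containerPort": 80, "hostPort": 8080}],
            }
        ],
    }
}

print(baseline_violations(default_pod))  # prints []
for problem in baseline_violations(risky_pod):
    print(problem)
```

The minimally specified Pod passes, which matches the intent that Baseline allows the default Pod configuration, while the second Pod is flagged once for each control it breaks.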
The Restricted policy is aimed at enforcing current Pod hardening best practices, at the expense of some compatibility. It is targeted at operators and developers of security-critical applications, as well as lower-trust users. The following controls should be enforced:
Everything from the baseline profile.
In addition to restricting HostPath volumes, the restricted profile limits usage of non-ephemeral volume types to those defined through PersistentVolumes.
Allowed Values: undefined/nil
Privilege escalation (such as via set-user-ID or set-group-ID file mode) should not be allowed.
Allowed Values: false
Running as Non-root
Containers must be required to run as non-root users.
Allowed Values: true
Non-root groups (optional)
Containers should be forbidden from running with a root primary or supplementary GID.
Allowed Values: non-zero, or undefined/nil (except for `*.runAsGroup`)
The RuntimeDefault seccomp profile must be required, or specific additional profiles may be allowed.
Allowed Values: 'runtime/default', undefined/nil
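The additional Restricted controls can be sketched the same way, as checks over a Pod manifest held in a Python dict. The field names (`allowPrivilegeEscalation`, `runAsNonRoot`, `runAsGroup`, `supplementalGroups`, `fsGroup`) come from the Kubernetes Pod API rather than from this text, and the function name and `forbid_root_groups` flag are hypothetical. The sketch assumes the Baseline controls are enforced separately, and it omits the volume type and seccomp controls.

```python
def restricted_violations(pod, forbid_root_groups=True):
    """List breaches of the extra Restricted controls in a Pod manifest dict."""
    spec = pod.get("spec") or {}
    pod_sc = spec.get("securityContext") or {}
    found = []

    # Optional control: GID 0 is rejected wherever the Pod names a group.
    if forbid_root_groups:
        groups = [pod_sc.get("runAsGroup"), pod_sc.get("fsGroup")]
        groups += pod_sc.get("supplementalGroups") or []
        if 0 in groups:
            found.append("pod uses the root group (GID 0)")

    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        name = container.get("name")
        sc = container.get("securityContext") or {}

        # Privilege escalation must be explicitly switched off.
        if sc.get("allowPrivilegeEscalation") is not False:
            found.append(f"container {name} may escalate privileges")

        # Non-root: a container-level value overrides the Pod-level one.
        non_root = sc.get("runAsNonRoot")
        if non_root is None:
            non_root = pod_sc.get("runAsNonRoot")
        if non_root is not True:
            found.append(f"container {name} is not required to run as non-root")

        if forbid_root_groups and sc.get("runAsGroup") == 0:
            found.append(f"container {name} runs with the root group (GID 0)")
    return found


hardened_pod = {
    "spec": {
        "securityContext": {"runAsNonRoot": True, "runAsGroup": 1000},
        "containers": [
            {"name": "app", "securityContext": {"allowPrivilegeEscalation": False}}
        ],
    }
}
default_pod = {"spec": {"containers": [{"name": "app"}]}}

print(restricted_violations(hardened_pod))  # prints []
for problem in restricted_violations(default_pod):
    print(problem)
```

Unlike Baseline, a minimally specified Pod fails here, because the Restricted controls require values to be set explicitly rather than merely left at their defaults.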
Decoupling policy definition from policy instantiation allows for a common understanding and consistent language of policies across clusters, independent of the underlying enforcement mechanism.
As mechanisms mature, they will be defined below on a per-policy basis. The methods of enforcement of individual policies are not defined here.
Why isn't there a profile between privileged and baseline?
The three profiles defined here have a clear linear progression from most secure (restricted) to least secure (privileged), and cover a broad set of workloads. Privileges required above the baseline policy are typically very application-specific, so we do not offer a standard profile in this niche. This is not to say that the privileged profile should always be used in this case, but that policies in this space need to be defined on a case-by-case basis.
SIG Auth may reconsider this position in the future, should a clear need for other profiles arise.
What's the difference between a security policy and a security context?
Security Contexts configure Pods and Containers at runtime. Security contexts are defined as part of the Pod and container specifications in the Pod manifest, and represent parameters to the container runtime.
Security policies are control plane mechanisms to enforce specific settings in the Security Context, as well as other parameters outside the Security Context. As of February 2020, the current native solution for enforcing these security policies is Pod Security Policy, a mechanism for centrally enforcing security policy on Pods across a cluster. Other alternatives for enforcing security policy are being developed in the Kubernetes ecosystem, such as OPA Gatekeeper.
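A small sketch may make the distinction concrete: the security context is data carried inside the Pod manifest, while a policy is a rule evaluated against that data before the Pod is admitted. The manifest is shown as a Python dict, the image name is a placeholder, and `policy_requires_non_root` is a hypothetical stand-in for whatever enforcement mechanism a cluster uses.

```python
# A security context is part of the Pod manifest itself.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example"},
    "spec": {
        # Pod-level security context
        "securityContext": {"runAsNonRoot": True},
        "containers": [
            {
                "name": "app",
                "image": "registry.example/app:1.0",  # placeholder image
                # Container-level security context
                "securityContext": {"allowPrivilegeEscalation": False},
            }
        ],
    },
}


def policy_requires_non_root(manifest):
    """Hypothetical policy: the Pod-level context must demand a non-root user."""
    context = manifest["spec"].get("securityContext") or {}
    return context.get("runAsNonRoot") is True


print(policy_requires_non_root(pod))  # prints True
```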
What profiles should I apply to my Windows Pods?
Windows in Kubernetes has some limitations and differentiators from standard Linux-based workloads. Specifically, the Pod SecurityContext fields have no effect on Windows. As such, no standardized Pod Security profiles currently exist.
What about sandboxed Pods?
There is currently no API standard that controls whether a Pod is considered sandboxed. Sandboxed Pods may be identified by the use of a sandboxed runtime (such as gVisor or Kata Containers), but there is no standard definition of what a sandboxed runtime is.
The protections necessary for sandboxed workloads can differ from others. For example, the need to restrict privileged permissions is lessened when the workload is isolated from the underlying kernel. This allows for workloads requiring heightened permissions to still be isolated.
Additionally, the protection of sandboxed workloads is highly dependent on the method of sandboxing. As such, no single policy is recommended for all sandboxed workloads.