If you are a cluster operator looking to expand your grasp of Kubernetes, this page and its linked topics extend the information provided on the foundational cluster operator page. From this page you can get information on key Kubernetes tasks needed to manage a complete production cluster.
Work with ingress, networking, storage, and workloads
Introductions to Kubernetes typically discuss simple stateless applications. As you move into more complex development, testing, and production environments, you need to consider more complex cases:
Communication: Ingress and Networking
Storage: Volumes and PersistentVolumes
Workloads: how Pods work with scheduling, priority, and disruptions
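As one illustration of planning for disruptions, the following minimal sketch builds a PodDisruptionBudget manifest as a Python dictionary and prints it as JSON, a format that `kubectl apply -f` accepts in addition to YAML. It assumes a cluster that serves the `policy/v1` API. The name `web-pdb`, the label `app: web`, and the `minAvailable` value are hypothetical placeholders, not values taken from this page.

```python
import json


def pod_disruption_budget(name, app_label, min_available):
    """Return a PodDisruptionBudget manifest (policy/v1) as a dict.

    All argument values are illustrative; choose ones that match your workload.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Keep at least this many matching Pods running during
            # voluntary disruptions such as node drains.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


if __name__ == "__main__":
    # Hypothetical example values.
    print(json.dumps(pod_disruption_budget("web-pdb", "web", 2), indent=2))
```

The printed JSON can be saved to a file and reviewed before it is applied to a cluster.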
Implement security best practices
Securing your cluster includes work beyond the scope of Kubernetes itself.
Within Kubernetes itself, you configure access control to the API server.
You also configure authorization. That is, you determine not just how users and services authenticate to the API server, or whether they have access, but also what resources they have access to. Role-based access control (RBAC) is the recommended mechanism for controlling authorization to Kubernetes resources. Other authorization modes are available for more specific use cases.
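To make the RBAC idea concrete, here is a minimal sketch that builds a namespaced Role granting read-only access to Pods, plus a RoleBinding that grants the Role to a single user. The manifests are plain Python dictionaries printed as JSON. The namespace `dev`, the role name `pod-reader`, and the user `jane` are hypothetical, and the rule set is only an example of the shape of an RBAC rule.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def read_only_pod_role(namespace, name):
    """Return a Role that allows reading Pods in one namespace."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": name},
        "rules": [
            {
                "apiGroups": [""],  # "" refers to the core API group
                "resources": ["pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def bind_role_to_user(namespace, role_name, user):
    """Return a RoleBinding that grants the named Role to one user."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": namespace, "name": f"{role_name}-{user}"},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": role_name, "apiGroup": RBAC_GROUP},
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for manifest in (
        read_only_pod_role("dev", "pod-reader"),
        bind_role_to_user("dev", "pod-reader", "jane"),
    ):
        print(json.dumps(manifest, indent=2))
```

Keeping the Role and the RoleBinding separate mirrors how RBAC works: the Role describes what is allowed, and the binding decides who receives that permission.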
You should create Secrets to hold sensitive data such as passwords, tokens, or keys. Be aware, however, that there are limitations to the protections that a Secret can provide. See the Risks section of the Secrets documentation.
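One limitation is easy to demonstrate: values in a Secret's `data` field are base64-encoded, and base64 is an encoding rather than encryption. The sketch below builds an `Opaque` Secret manifest as a Python dictionary and then decodes the value again. The Secret name `db-credentials` and the value `example-only` are hypothetical.

```python
import base64
import json


def opaque_secret(name, values):
    """Build an Opaque Secret manifest; `data` values are base64 strings."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


if __name__ == "__main__":
    # Hypothetical name and value, for illustration only.
    secret = opaque_secret("db-credentials", {"password": "example-only"})
    print(json.dumps(secret, indent=2))
    # Anyone who can read the manifest can reverse the encoding.
    print(base64.b64decode(secret["data"]["password"]).decode("utf-8"))
```

Because the encoding is reversible, treat manifests that contain Secrets as sensitive, and restrict who can read Secret objects through the API.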
Implement custom logging and monitoring
Monitoring the health and state of your cluster is important. Collecting metrics, logging, and providing access to that information are common needs. Kubernetes provides some basic logging structure, and you may want to use additional tools to help aggregate and analyze log data.
Start with the basics on Kubernetes logging to understand how containers handle logging and which patterns are common. Cluster operators often want to add a component that gathers and aggregates those logs.
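A common pattern is for the application in a container to write its logs to standard output and standard error, where the container runtime captures them and `kubectl logs` can display them. The following sketch, which uses only the Python standard library, writes one JSON object per line to stdout so that an aggregator can parse the records later. The logger name `example-app` and the choice of fields are hypothetical.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def build_logger(name="example-app", stream=None):
    """Return a logger that writes JSON lines to stdout by default."""
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers set up earlier
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


if __name__ == "__main__":
    build_logger().info("service started")
```

Writing to stdout rather than to a file inside the container leaves log collection and rotation to the node and to whatever aggregation tooling the cluster operator adds.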
As with log aggregation, many clusters use additional software to capture metrics and display them. There is an overview of tools at Tools for Monitoring Compute, Storage, and Network Resources. Kubernetes also supports a core metrics pipeline, which the Horizontal Pod Autoscaler can use together with custom metrics.
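To show how a metric value turns into a scaling decision, here is a simplified sketch of the ratio-based rule described in the Horizontal Pod Autoscaler documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function and parameter names are hypothetical. The sketch leaves out the real controller's minimum and maximum replica bounds, its stabilization behavior, and its handling of missing metrics, and the default `tolerance` of 0.1 is an assumption that your cluster's configuration may override.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Estimate the replica count from a metric value and its target.

    Simplified: no min/max bounds, no stabilization, no missing-metric logic.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical numbers: 4 replicas, metric at twice its target.
    print(desired_replicas(4, 200, 100))
```

Rounding up means that the estimate favors slightly more capacity over slightly less when the ratio does not produce a whole number of replicas.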
Prometheus, another CNCF project, is a common choice for capturing and temporarily collecting metrics. There are several options for installing Prometheus, including the stable/prometheus Helm chart. CoreOS also provides the Prometheus Operator and kube-prometheus, which adds Grafana dashboards and common configurations.
A common configuration on Minikube and some Kubernetes clusters uses Heapster along with InfluxDB and Grafana. There is a walkthrough of how to install this configuration in your cluster. As of Kubernetes 1.9, the sig-instrumentation team is shifting away from an all-inclusive monitoring pattern built on Heapster, as described in Prometheus vs. Heapster vs. Kubernetes Metrics APIs.
Hosted data analytics services such as Datadog also offer Kubernetes integration.