Kubernetes Blog

Kompose: a tool to go from Docker-compose to Kubernetes

November 22 2016

Editor’s note: Today’s post is by Sebastien Goasguen, Founder of Skippbox, showing a new tool to move from ‘docker-compose’ to Kubernetes.

At Skippbox, we developed kompose, a tool to automatically transform your Docker Compose application into Kubernetes manifests, allowing you to start a Compose application on a Kubernetes cluster with a single kompose up command. We’re extremely happy to have donated kompose to the Kubernetes Incubator. So here’s a quick introduction to it and some of the motivating factors that got us to develop it.

Docker is terrific for developers. It allows everyone to get started quickly with an application that has been packaged in a Docker image and is available on a Docker registry. To build a multi-container application, Docker has developed Docker-compose (aka Compose). Compose takes a YAML-based manifest of your multi-container application and starts all the required containers with a single command, docker-compose up. However, Compose only works locally or with a Docker Swarm cluster.

But what if you wanted to use something other than Swarm? Like Kubernetes, of course.

The Compose format is not a standard for defining distributed applications. Hence, you are left re-writing your application manifests for your container orchestrator of choice.

We see kompose as a terrific way to expose Kubernetes principles to Docker users as well as to easily migrate from Docker Swarm to Kubernetes to operate your applications in production.

Over the summer, Kompose has found a new gear with help from Tomas Kral and Suraj Deshmukh from Red Hat, and Janet Kuo from Google. Together with our own lead kompose developer, Nguyen An-Tu, they are making kompose even more exciting. We proposed Kompose to the Kubernetes Incubator within SIG-apps and received approval from the general Kubernetes community; you can now find kompose in the Kubernetes Incubator.

Kompose now supports the Docker-compose v2 format; persistent volume claims have been added recently, as well as multiple containers per pod. It can also be used to target OpenShift deployments by specifying a provider other than the default Kubernetes. Kompose is also now available in Fedora packages and we look forward to seeing it in CentOS distributions in the coming weeks.

kompose is a single Golang binary that you build or install from the release on GitHub. Let’s skip the build instructions and dive straight into an example.

Let’s take it for a spin!

Guestbook application with Docker

The Guestbook application has become the canonical example for Kubernetes. In Docker-compose format, the guestbook can be started with this minimal file:

version: "2"



services:

  redis-master:

    image: gcr.io/google_containers/redis:e2e

    ports:

      - "6379"

  redis-slave:

    image: gcr.io/google_samples/gb-redisslave:v1

    ports:

      - "6379"

    environment:

      - GET_HOSTS_FROM=dns

  frontend:

    image: gcr.io/google-samples/gb-frontend:v4

    ports:

      - "80:80"

    environment:

      - GET_HOSTS_FROM=dns

It consists of three services: a redis-master node, a set of redis-slave instances that can be scaled and find the redis-master via its DNS name, and a PHP frontend that exposes itself on port 80. The resulting application allows you to leave short messages, which are stored in the Redis cluster.

To get it started with docker-compose on a vanilla Docker host do:

$ docker-compose -f docker-guestbook.yml up -d

Creating network "examples_default" with the default driver

Creating examples_redis-slave_1

Creating examples_frontend_1

Creating examples_redis-master_1

So far so good, this is plain Docker usage. Now let’s see how to get this on Kubernetes without having to re-write anything.

Guestbook with ‘kompose’

Kompose currently has three main commands: up, down and convert. Here, for simplicity, we will show a single usage to bring up the Guestbook application.

Similarly to docker-compose, we can use the kompose up command pointing to the Docker-compose file representing the Guestbook application. Like so:

$ kompose -f ./examples/docker-guestbook.yml up

We are going to create Kubernetes deployment and service for your dockerized application.

If you need more kind of controllers, use 'kompose convert' and 'kubectl create -f' instead.



INFO[0000] Successfully created service: redis-master

INFO[0000] Successfully created service: redis-slave

INFO[0000] Successfully created service: frontend

INFO[0000] Successfully created deployment: redis-master

INFO[0000] Successfully created deployment: redis-slave

INFO[0000] Successfully created deployment: frontend



Application has been deployed to Kubernetes. You can run 'kubectl get deployment,svc' for details.

kompose automatically converted the Docker-compose file into Kubernetes objects. By default, it created one deployment and one service per Compose service. In addition, it automatically detected your current Kubernetes endpoint and created the resources on it. A set of flags can be used to generate Replication Controllers, Replica Sets or Daemon Sets instead of Deployments.
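
If you would rather inspect or version the generated manifests instead of deploying them directly, the convert command writes them to disk. A rough sketch (output file names and formats vary by kompose version):

$ kompose convert -f ./examples/docker-guestbook.yml
# writes one service and one deployment manifest per Compose service to the current directory
$ kubectl create -f .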

And that’s it! Nothing else to do, the conversion happened automatically.
Now, if you already know Kubernetes a bit, you’re familiar with the kubectl client, and you can check what was created on your cluster.

$ kubectl get pods,svc,deployments

NAME                             READY        STATUS        RESTARTS     AGE

frontend-3780173733-0ayyx        1/1          Running       0            1m

redis-master-3028862641-8miqn    1/1          Running       0            1m

redis-slave-3788432149-t3ejp     1/1          Running       0            1m

NAME                             CLUSTER-IP   EXTERNAL-IP   PORT(S)      AGE

frontend                         10.0.0.34    <none>        80/TCP       1m

redis-master                     10.0.0.219   <none>        6379/TCP     1m

redis-slave                      10.0.0.84    <none>        6379/TCP     1m

NAME                             DESIRED      CURRENT       UP-TO-DATE   AVAILABLE   AGE

frontend                         1            1             1            1           1m

redis-master                     1            1             1            1           1m

redis-slave                      1            1             1            1           1m

Indeed, you see the three services, the three deployments and the resulting three pods. To check the application quickly, access the frontend service locally and enjoy the Guestbook application, but this time started from a Docker-compose file.
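
For example, with the pod name from the listing above (your pod name will differ), a quick way to reach the frontend from your workstation is a port-forward:

$ kubectl port-forward frontend-3780173733-0ayyx 8080:80
# then browse to http://localhost:8080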

Hopefully this gave you a quick tour of kompose and got you excited. There are more exciting features, like creating different types of resources, creating Helm charts, and even using the experimental Docker bundle format as input. Check out Lachlan Evenson’s blog on using a Docker bundle with Kubernetes. For an overall demo, see our talk from KubeCon.

Head over to the Kubernetes Incubator and check out kompose; it will help you move easily from your Docker-compose applications to Kubernetes clusters in production.

Kubernetes Containers Logging and Monitoring with Sematext

November 18 2016

Editor’s note: Today’s post is by Stefan Thies, Developer Evangelist, at Sematext, showing key Kubernetes metrics and log elements to help you troubleshoot and tune Docker and Kubernetes.

Managing microservices in containers is typically done with Cluster Managers and Orchestration tools. Each container platform has a slightly different set of options to deploy containers or schedule tasks on each cluster node. Because we do container monitoring and logging at Sematext, part of our job is to share our knowledge of these tools, especially as it pertains to container observability and devops. Today we’ll show a tutorial for Container Monitoring and Log Collection on Kubernetes.

Dynamic Deployments Require Dynamic Monitoring

The high level of automation for the container and microservice lifecycle makes the monitoring of Kubernetes more challenging than in more traditional, more static deployments. Any static setup to monitor specific application containers would not work because Kubernetes makes its own decisions according to the defined deployment rules. It is not only the deployed microservices that need to be monitored. It is equally important to watch metrics and logs for the Kubernetes core services themselves, such as the Kubernetes Master running etcd, controller-manager, scheduler and apiserver, and the Kubernetes Workers (formerly known as minions) running kubelet and the proxy service. Having a centralized place to keep an eye on all these services, their metrics and logs helps one spot problems in the cluster infrastructure. Kubernetes core services could be installed on bare metal, in virtual machines or as containers using Docker. Deploying Kubernetes core services in containers could be helpful with deployment and monitoring operations - tools for container monitoring would cover both core services and application containers. So how does one monitor such a complex and dynamic environment?

Agent for Kubernetes Metrics and Logs

There are a number of open source Docker monitoring and logging projects one can cobble together to build a monitoring and log collection system (or systems). The advantage is that the code is all free. The downside is that this takes time - both initially when setting it up and later when maintaining it. That’s why we built Sematext Docker Agent - a modern, Docker-aware metrics, events, and log collection agent. It runs as a tiny container on every Docker host and collects logs, metrics and events for all cluster nodes and all containers. It discovers all containers (one pod might contain multiple containers), including containers for Kubernetes core services, if core services are deployed in Docker containers. Let’s see how to deploy this agent.

Deploying Agent to all Kubernetes Nodes

Kubernetes provides DaemonSets, which ensure pods are added to nodes as nodes are added to the cluster. We can use this to easily deploy the Sematext Agent to each cluster node!

Configure Sematext Docker Agent for Kubernetes

Let’s assume you’ve created an SPM app for your Kubernetes metrics and events, and a Logsene app for your Kubernetes logs, each of which comes with its own token. The Sematext Docker Agent README lists all configuration options (e.g. filters for specific pods/images/containers), but we’ll keep it simple here.

  • Grab the latest sematext-agent-daemonset.yml (raw plain-text) template (also shown below)
  • Save it somewhere on disk
  • Replace the SPM_TOKEN and LOGSENE_TOKEN placeholders with your SPM and Logsene App tokens (see the sed sketch after the template)
apiVersion: extensions/v1beta1  
kind: DaemonSet  
metadata:  
  name: sematext-agent  
spec:  
  template:  
    metadata:  
      labels:  
        app: sematext-agent  
    spec:  
      selector: {}  
      dnsPolicy: "ClusterFirst"  
      restartPolicy: "Always"  
      containers:  
      - name: sematext-agent  
        image: sematext/sematext-agent-docker:latest  
        imagePullPolicy: "Always"  
        env:  
        - name: SPM_TOKEN  
          value: "REPLACE THIS WITH YOUR SPM TOKEN"  
        - name: LOGSENE_TOKEN  
          value: "REPLACE THIS WITH YOUR LOGSENE TOKEN"  
        - name: KUBERNETES  
          value: "1"  
        volumeMounts:  
          - mountPath: /var/run/docker.sock  
            name: docker-sock  
          - mountPath: /etc/localtime  
            name: localtime  
      volumes:  
        - name: docker-sock  
          hostPath:  
            path: /var/run/docker.sock  
        - name: localtime  
          hostPath:  
            path: /etc/localtime
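
One way to do the token substitution from the last bullet above, as a shell sketch (GNU sed syntax; the replacement values are placeholders for your own tokens):

$ sed -i 's/REPLACE THIS WITH YOUR SPM TOKEN/your-spm-app-token/' sematext-agent-daemonset.yml
$ sed -i 's/REPLACE THIS WITH YOUR LOGSENE TOKEN/your-logsene-app-token/' sematext-agent-daemonset.yml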

Run Agent as DaemonSet

Activate Sematext Agent Docker with kubectl:

> kubectl create -f sematext-agent-daemonset.yml

daemonset "sematext-agent-daemonset" created

Now let’s check if the agent got deployed to all nodes:

> kubectl get pods

NAME                   READY     STATUS              RESTARTS   AGE

sematext-agent-nh4ez   0/1       ContainerCreating   0          6s

sematext-agent-s47vz   0/1       ImageNotReady       0          6s

The status “ImageNotReady” or “ContainerCreating” might be visible for a short time because Kubernetes must first download the image for sematext/sematext-agent-docker. The setting imagePullPolicy: “Always” specified in sematext-agent-daemonset.yml makes sure that the Sematext Agent gets updated automatically using the image from Docker Hub.

If we check again we’ll see Sematext Docker Agent got deployed to (all) cluster nodes:

> kubectl get pods -l app=sematext-agent

NAME                   READY     STATUS    RESTARTS   AGE

sematext-agent-nh4ez   1/1       Running   0          8s

sematext-agent-s47vz   1/1       Running   0          8s

Less than a minute after the deployment you should see your Kubernetes metrics and logs! Below are screenshots of various out of the box reports and explanations of various metrics’ meanings.

Interpretation of Kubernetes Metrics

The metrics from all Kubernetes nodes are collected in a single SPM App, which aggregates metrics on several levels:

  • Cluster - metrics aggregated over all nodes displayed in SPM overview
  • Host / node level - metrics aggregated per node
  • Docker Image level - metrics aggregated by image name, e.g. all nginx webserver containers
  • Docker Container level - metrics aggregated for a single container
Host and Container Metrics from the Kubernetes Cluster

Each detailed chart has filter options for Node, Docker Image, and Docker Container. As Kubernetes uses the pod name in the name of the Docker containers, a search by pod name in the Docker Container filter makes it easy to select all containers for a specific pod.

Let’s have a look at a few Kubernetes (and Docker) key metrics provided by SPM.

Host Metrics such as CPU, Memory and Disk space usage. Docker images and containers consume more disk space than regular processes installed on a host. For example, an application image might include a Linux operating system and might have a size of 150-700 MB depending on the size of the base image and the tools installed in the container. Data containers consume disk space on the host as well. In our experience, watching disk space and using cleanup tools is essential for the continuous operation of Docker hosts.

Container count - represents the number of running containers per host

Container Counters per Kubernetes Node over time

Container Memory and Memory Fail Counters. These metrics are important to watch and very important when tuning applications. Memory limits should fit the footprint of the deployed pod (application) to avoid situations where Kubernetes uses default limits (e.g. defined for a namespace), which could lead to OOM kills of containers. Memory fail counters reflect the number of failed memory allocations in a container, and in the case of an OOM kill a Docker event is triggered. This event is then displayed in SPM because the Sematext Docker Agent collects all Docker events. The best practice is to tune memory settings in a few iterations (a pod spec sketch follows below):

  • Monitor memory usage of the application container
  • Set memory limits according to the observations
  • Continue monitoring of memory, memory fail counters, and Out-Of-Memory events. If OOM events happen, the container memory limits may need to be increased, or debugging is required to find the reason for the high memory consumption.
Container memory usage, limits and fail counters

Container CPU usage and throttled CPU time. CPU usage can be limited by CPU shares; unlike memory, CPU is not a hard limit. Containers may use more CPU as long as the resource is available, but in situations where other containers need the CPU, the limits apply and the CPU gets throttled down to the limit.
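
To make the memory and CPU tuning concrete, requests and limits are set per container in the pod spec. A minimal sketch, with illustrative names and values only:

apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: app
    image: nginx:1.11
    resources:
      requests:
        memory: "128Mi"   # scheduling hint based on observed usage
        cpu: "250m"
      limits:
        memory: "256Mi"   # hard cap; exceeding it can trigger an OOM kill
        cpu: "500m"       # CPU beyond this gets throttled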

There are more docker metrics to watch, like disk I/O throughput, network throughput and network errors for containers, but let’s continue by looking at Kubernetes Logs next.

Understand Kubernetes Logs

Kubernetes containers’ logs are not much different from Docker container logs. However, Kubernetes users need to view logs for the deployed pods. That’s why it is very useful to have Kubernetes-specific information available for log search, such as:

  • Kubernetes namespace
  • Kubernetes pod name
  • Kubernetes container name
  • Docker image name
  • Kubernetes UID

Sematext Docker Agent extracts this information from the Docker container names and tags all logs with the information mentioned above. Having this data extracted into individual fields makes it very easy to watch logs of deployed pods, build reports from logs, quickly narrow down to problematic pods while troubleshooting, and so on! If Kubernetes core components (such as kubelet, proxy, api server) are deployed via Docker, the Sematext Docker Agent will collect Kubernetes core components’ logs as well.

All logs from Kubernetes containers in Logsene

There are many other useful features Logsene and Sematext Docker Agent give you out of the box, such as:

  • Automatic format detection and parsing of logs

    • Sematext Docker Agent includes patterns to recognize and parse many log formats
  • Custom pattern definitions for specific images and application types
  • Automatic Geo-IP enrichment for container logs
  • Filtering logs e.g. to exclude noisy services
  • Masking of sensitive data in specific log fields (phone numbers, payment information, authentication tokens)
  • Alerts and scheduled reports based on logs
  • Analytics for structured logs e.g. in Kibana or Grafana

Most of those topics are described in our Docker Log Management post and are relevant for Kubernetes log management as well. If you want to learn more about Docker monitoring, read more on our blog.

–Stefan Thies, Developer Evangelist, at Sematext

Visualize Kubelet Performance with Node Dashboard

November 17 2016

In Kubernetes 1.4, we introduced a new node performance analysis tool, called the node performance dashboard, to visualize and explore the behavior of the Kubelet in much richer detail. This new feature makes it easy for Kubelet developers to understand and improve code performance, and it lets cluster maintainers set configuration according to provided Service Level Objectives (SLOs).

Background

A Kubernetes cluster is made up of both master and worker nodes. The master node manages the cluster’s state, and the worker nodes do the actual work of running and managing pods. To do so, on each worker node, a binary, called Kubelet, watches for any changes in pod configuration, and takes corresponding actions to make sure that containers run successfully. High performance of the Kubelet, such as low latency to converge with new pod configuration and efficient housekeeping with low resource usage, is essential for the entire Kubernetes cluster. To measure this performance, Kubernetes uses end-to-end (e2e) tests to continuously monitor benchmark changes of latest builds with new features.

Kubernetes SLOs are defined by the following benchmarks:

* API responsiveness: 99% of all API calls return in less than 1s.
* Pod startup time: 99% of pods and their containers (with pre-pulled images) start within 5s.

Prior to the 1.4 release, we’d only measured and defined these at the cluster level, opening up the risk that other factors could influence the results. Beyond these, we also want more performance-related SLOs, such as the maximum number of pods for a specific machine type, allowing maximum utilization of your cluster. In order to do the measurement correctly, we want to introduce a set of tests isolated to just a node’s performance. In addition, we aim to collect more fine-grained resource usage and operation tracing data of the Kubelet from the new tests.

Data Collection

The node-specific density and resource usage tests were added to the e2e-node test set in 1.4. Resource usage is measured by a standalone cAdvisor pod for a flexible monitoring interval (compared with the Kubelet-integrated cAdvisor). The performance data, such as latency and resource usage percentiles, are recorded in persistent test result logs. The tests also record time series data such as creation time and running time of pods, as well as real-time resource usage. Tracing data of Kubelet operations is recorded in its log, stored together with the test results.

Node Performance Dashboard

Since Kubernetes 1.4, we are continuously building the newest Kubelet code and running node performance tests. The data is collected by our new performance dashboard available at node-perf-dash.k8s.io. Figure 1 gives a preview of the dashboard. You can start to explore it by selecting a test, either using the drop-down list of short test names (region (a)) or by choosing test options one by one (region (b)). The test details show up in region (c) containing the full test name from Ginkgo (the Go test framework used by Kubernetes). Then select a node type (image and machine) in region (d).

Figure 1. Select a test to display in node performance dashboard.

The “BUILDS” page exhibits the performance data across different builds (Figure 2). The plots include pod startup latency, pod creation throughput, and CPU/memory usage of Kubelet and runtime (currently Docker). In this way it’s easy to monitor the performance change over time as new features are checked in.

Figure 2. Performance data across different builds.

Compare Different Node Configurations

It’s always interesting to compare the performance of different configurations, such as comparing the startup latency of different machine types or different numbers of pods, or comparing the resource usage of hosting different numbers of pods. The dashboard provides a convenient way to do this. Just click the “Compare it” button at the upper right corner of the test selection menu (region (e) in Figure 1). The selected tests will be added to a comparison list in the “COMPARISON” page, as shown in Figure 3. Data across a series of builds are aggregated to a single value to facilitate comparison and are displayed in bar charts.

Figure 3. Compare different test configurations.

Time Series and Tracing: Diving Into Performance Data

Pod startup latency is an important metric for the Kubelet, especially when creating a large number of pods per node. Using the dashboard you can see the change in latency, for example, when creating 105 pods, as shown in Figure 4. When you see the highly variable lines, you might expect the variance to be due to different builds. However, since these tests were run against the same Kubernetes code, we can conclude the variance is due to performance fluctuation. The variance is close to 40s when we compare the 99% latency of builds #162 and #173, which is very large. To drill into the source of the fluctuation, let’s check out the “TIME SERIES” page.

Figure 4. Pod startup latency when creating 105 pods.

Looking specifically at build #162, we are able to see the tracing data plotted in the pod creation latency chart (Figure 5). Each curve is an accumulated histogram of the number of pod operations that have already arrived at a certain tracing probe. The timestamps of the tracing probes are either collected from the performance tests or obtained by parsing the Kubelet log. Currently we collect the following tracing data:

  • “create” (in test): the test creates pods through API client;
  • “running” (in test): the test watches that pods are running from API server;
  • “pod_config_change”: pod config change detected by Kubelet SyncLoop;
  • “runtime_manager”: runtime manager starts to create containers;
  • “infra_container_start”: the infra container of a pod starts;
  • “container_start”: the container of a pod starts;
  • “pod_running”: a pod is running;
  • “pod_status_running”: status manager updates status for a running pod.

The time series chart illustrates that it is taking a long time for the status manager to update pod status (the data for “running” is not shown since it overlaps with “pod_status_running”). We figured out that this latency is introduced by the queries-per-second (QPS) limit of the Kubelet towards the API server (the default is 5). After becoming aware of this, we found in additional tests that by increasing the QPS limit, the “running” curve gradually converges with “pod_running”, resulting in much lower latency. Therefore the previous e2e test pod startup results reflect the combined latency of the Kubelet and of uploading status; the performance of the Kubelet alone is thus under-estimated.

Figure 5. Time series page using data from build #162.
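
For reference, this client-side throttling is controlled by Kubelet flags; raising the limits looks roughly like the line below (the defaults in this era were --kube-api-qps=5 and --kube-api-burst=10; other flags are omitted):

kubelet --kube-api-qps=15 --kube-api-burst=30 ...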

Further, by comparing the time series data of build #162 (Figure 5) and build #173 (Figure 6), we find that the pod startup latency fluctuation actually happens while pod statuses are being updated. Build #162 has several straggler “pod_status_running” events with a long latency tail. This provides useful ideas for future optimization.

Figure 6. Pod startup latency of build #173.

In the future we plan to use events in Kubernetes, which have a fixed log format, to collect tracing data more conveniently. Instead of extracting existing log entries, you will then be able to insert your own tracing probes inside the Kubelet and obtain the break-down latency of each segment.

You can check the latency between any two probes across different builds in the “TRACING” page, as shown in Figure 7. For example, by selecting “pod_config_change” as the start probe and “pod_status_running” as the end probe, it gives the latency variance of the Kubelet over continuous builds without the status updating overhead. With this feature, developers are able to monitor the performance change of a specific part of the code inside the Kubelet.

Figure 7. Plotting latency between any two probes.

Future Work

The node performance dashboard is a brand new feature. It is still an alpha version under active development. We will keep optimizing the data collection and visualization, providing more tests, metrics and tools to developers and cluster maintainers.

Please join our community and help us build the future of Kubernetes! If you’re particularly interested in nodes or performance testing, participate by chatting with us in our Slack channel or join our meeting, held every Tuesday at 10 AM PT on this SIG-Node Hangout.

–Zhou Fang, Software Engineering Intern, Google

CNCF Partners With The Linux Foundation To Launch New Kubernetes Certification, Training and Managed Service Provider Program

November 08 2016

Today the CNCF is pleased to launch a new training, certification and Kubernetes Managed Service Provider (KMSP) program. 

The goal of the program is to ensure enterprises get the support they’re looking for to get up to speed and roll out new applications more quickly and more efficiently. The Linux Foundation, in partnership with CNCF, will develop and operate the Kubernetes training and certification.

Interested in this course? Sign up here to pre-register. The course, expected to be available in early 2017, is open now at the discounted price of $99 (regularly $199) for a limited time, and the certification program is expected to be available in the second quarter of 2017. 

The KMSP program is a pre-qualified tier of highly vetted service providers who have deep experience helping enterprises successfully adopt Kubernetes. The KMSP partners offer SLA-backed Kubernetes support, consulting, professional services and training for organizations embarking on their Kubernetes journey. In contrast to the Kubernetes Service Partners program outlined recently in this blog, to become a Kubernetes Managed Service Provider the following additional requirements must be met: three or more certified engineers, active contributions to Kubernetes, and a business model to support enterprise end users.

As part of the program, a new CNCF Certification Working Group is starting up now. The group will help define the program’s open source curriculum, which will be available under the Creative Commons By Attribution 4.0 International license for anyone to use. Any Kubernetes expert can join the working group via this link. Google has committed to assist, and many others, including Apprenda, Container Solutions, CoreOS, Deis and Samsung SDS, have expressed interest in participating in the Working Group.

To learn more about the new program and the first round of KMSP partners that we expect to grow weekly, check out today’s announcement here.

Modernizing the Skytap Cloud Micro-Service Architecture with Kubernetes

November 07 2016

Editor’s note: Today’s guest post is by the Tools and Infrastructure Engineering team at Skytap, a public cloud provider focused on empowering DevOps workflows, sharing their experience on adopting Kubernetes. 

Skytap is a global public cloud that provides our customers the ability to save and clone complex virtualized environments in any given state. Our customers include enterprise organizations running applications in a hybrid cloud, educational organizations providing virtual training labs, users who need easy-to-maintain development and test labs, and a variety of organizations with diverse DevOps workflows.

Some time ago, we started growing our business at an accelerated pace — our user base and our engineering organization continue to grow simultaneously. These are exciting, rewarding challenges! However, it’s difficult to scale applications and organizations smoothly, and we’re approaching the task carefully. When we first began looking at improvements to scale our toolset, it was very clear that traditional OS virtualization was not going to be an effective way to achieve our scaling goals. We found that the persistent nature of VMs encouraged engineers to build and maintain bespoke ‘pet’ VMs; this did not align well with our desire to build reusable runtime environments with a stable, predictable state. Fortuitously, growth in the Docker and Kubernetes communities has aligned with our growth, and the concurrent explosion in community engagement has (from our perspective) helped these tools mature.

In this article we’ll explore how Skytap uses Kubernetes as a key component in services that handle production workloads for the growing Skytap Cloud.

As we add engineers, we want to maintain our agility and continue enabling ownership of components throughout the software development lifecycle. This requires a lot of modularization and consistency in key aspects of our process. Previously, we drove reuse with systems-level packaging through our VM and environment templates, but as we scale, containers have become increasingly important as a packaging mechanism due to their comparatively lightweight and precise control of the runtime environment. 

In addition to this packaging flexibility, containers help us establish more efficient resource utilization, and they head off growing complexity arising from the natural inclination of teams to mix resources into large, highly-specialized VMs. For example, our operations team would install tools for monitoring health and resource utilization, a development team would deploy a service, and the security team might install traffic monitoring; combining all of that into a single VM greatly increases the test burden and often results in surprises—oops, you pulled in a new system-level Ruby gem!

Containerization of individual components in a service is pretty trivial with Docker. Getting started is easy, but as anyone who has built a distributed system with more than a handful of components knows, the real difficulties are deployment, scaling, availability, consistency, and communication between each unit in the cluster.

Let’s containerize! 

We’d begun to trade a lot of our heavily-loved pet VMs for, as the saying goes, cattle.

_____
/ Moo \
\---- /
       \   ^__^
        \  (oo)\_______
           (__)\       )\/\
               ||----w |
               ||     ||

The challenges of distributed systems aren’t simplified by creating a large herd of free-range containers, though. When we started using containers, we recognized the need for a container management framework. We evaluated Docker Swarm, Mesosphere, and Kubernetes, but we found that the Mesosphere usage model didn’t match our needs — we need the ability to manage discrete VMs; this doesn’t match the Mesosphere ‘distributed operating system’ model — and Docker Swarm was still not mature enough. So, we selected Kubernetes.  

Launching Kubernetes and building a new distributed service is relatively easy (inasmuch as this can be said for such a service: you can’t beat CAP theorem). However, we need to integrate container management with our existing platform and infrastructure. Some components of the platform are better served by VMs, and we need the ability to containerize services iteratively. 

We broke this integration problem down into four categories: 

  1. Service control and deployment
  2. Inter-service communication
  3. Infrastructure integration
  4. Engineering support and education

Service Control and Deployment

We use a custom extension of Capistrano (we call it ‘Skycap’) to deploy services and manage those services at runtime. It is important for us to manage both containerized and classic services through a single, well-established framework. We also need to isolate Skycap from the inevitable breaking changes inherent in an actively-developed tool like Kubernetes. 

To handle this, we use wrappers in our service control framework that isolate kubectl behind Skycap and handle issues like ignoring spurious log messages.

Deployment adds a layer of complexity for us. Docker images are a great way to package software, but historically, we’ve deployed from source, not packages. Our engineering team expects that making changes to source is sufficient to get their work released; devs don’t expect to handle additional packaging steps. Rather than rebuild our entire deployment and orchestration framework for the sake of containerization, we use a continuous integration pipeline for our containerized services. We automatically build a new Docker image for every commit to a project, and then we tag it with the Mercurial (Hg) changeset number of that commit. On the Skycap side, a deployment from a specific Hg revision will then pull the Docker images that are tagged with that same revision number. 
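
A simplified sketch of that CI tagging step (the registry and image names are hypothetical):

# build and push an image tagged with the current Mercurial changeset id
$ REV=$(hg id -i)
$ docker build -t registry.example.com/myservice:$REV .
$ docker push registry.example.com/myservice:$REV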

We reuse container images across multiple environments. This requires environment-specific configuration to be injected into each container instance. Until recently, we used similar source-based principles to inject these configuration values: each container would copy relevant configuration files from Hg by cURL-ing raw files from the repo at run time. Network availability and variability are a challenge best avoided, though, so we now load the configuration into Kubernetes’ ConfigMap feature. This not only simplifies our Docker images, but it also makes pod startup faster and more predictable (because containers don’t have to download files from Hg).   
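
A minimal sketch of that pattern (the names and keys are illustrative, not Skytap’s actual configuration):

apiVersion: v1
kind: ConfigMap
metadata:
  name: myservice-config
data:
  log_level: "info"
  mq_host: "mq.internal.example.com"
---
apiVersion: v1
kind: Pod
metadata:
  name: myservice
spec:
  containers:
  - name: myservice
    image: registry.example.com/myservice:latest
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:      # inject a single key as an environment variable
          name: myservice-config
          key: log_level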

Inter-service communication

Our services communicate using two primary methods. The first, message brokering, is typical for process-to-process communication within the Skytap platform. The second is through direct point-to-point TCP connections, which are typical for services that communicate with the outside world (such as web services). We’ll discuss the TCP method in the next section, as a component of infrastructure integration. 

Managing direct connections between pods in a way that services can understand is complicated. Additionally, our containerized services need to communicate with classic VM-based services. To mitigate this complexity, we primarily use our existing message queueing system. This helped us avoid writing a TCP-based service discovery and load balancing system for handling traffic between pods and non-Kubernetes services. 

This reduces our configuration load—services only need to know how to talk to the message queues, rather than to every other service they need to interact with. We have additional flexibility for things like managing the run-state of pods; messages buffer in the queue while nodes are restarting, and we avoid the overhead of re-configuring TCP endpoints each time a pod is added or removed from the cluster. Furthermore, the MQ model allows us to manage load balancing with a more accurate ‘pull’ based approach, in which recipients determine when they are ready to process a new message, instead of using heuristics like ‘least connections’ that simply count the number of open sockets to estimate load.  

Migrating MQ-enabled services to Kubernetes is relatively straightforward compared to migrating services that use the complex TCP-based direct or load balanced connections. Additionally, the isolation provided by the message broker means that the switchover from a classic service to a container-based service is essentially transparent to any other MQ-enabled service. 

Infrastructure Integration

As an infrastructure provider, we face some unique challenges in configuring Kubernetes for use with our platform. AWS & GCP provide out-of-box solutions that simplify Kubernetes provisioning but make assumptions about the underlying infrastructure that do not match our reality. Some organizations have purpose-built data centers. This option would have required us to abandon our existing load balancing infrastructure, our Puppet based provisioning system and the expertise we’d built up around these tools. We weren’t interested in abandoning the tools or our vested experience, so we needed a way to manage Kubernetes that could integrate with our world instead of rebuild it.

So, we use Puppet to provision and configure VMs that, in turn, run the Skytap Platform. We wrote custom deployment scripts to install Kubernetes on these, and we coordinate with our operations team to do capacity planning for Kube-master and Kube-node hosts. 

In the previous section, we mentioned point-to-point TCP-based communication. For customer-facing services, the pods need a way to interface with Skytap’s layer 3 network infrastructure. Examples at Skytap include our web applications and API over HTTPS, Remote Desktop over Web Sockets, FTP, TCP/UDP port forwarding services, full public IPs, etc. We need careful management of network ingress and egress for this external traffic, and have historically used F5 load balancers. The MQ infrastructure for internal services is inadequate for handling this workload because the protocols used by various clients (like web browsers) are very specific and TCP is the lowest common denominator.

To get our load balancers communicating with our Kubernetes pods, we run the kube-proxy on each node. Load balancers route to the node, and kube-proxy handles the final handoff to the appropriate pod.
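
One common way to give an external balancer a stable, per-node entry point is a NodePort service, which kube-proxy then forwards to a healthy pod. A hedged sketch with hypothetical names and ports, not Skytap’s actual configuration:

apiVersion: v1
kind: Service
metadata:
  name: web-frontend
spec:
  type: NodePort
  selector:
    app: web-frontend
  ports:
  - port: 443
    targetPort: 8443
    nodePort: 30443   # an external pool member would point at <node-ip>:30443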

We mustn’t forget that Kubernetes needs to route traffic between pods (for both TCP-based and MQ-based messaging). We use the Calico plugin for Kubernetes networking, with a specialized service to reconfigure the F5 when Kubernetes launches or reaps pods. Calico handles route advertisement with BGP, which eases integration with the F5.

F5s also need to have their load balancing pool reconfigured when pods enter or leave the cluster. The F5 appliance maintains a pool of load-balanced back-ends; ingress to a containerized service is directed through this pool to one of the nodes hosting a service pod. This is straightforward for static network configurations – but since we’re using Kubernetes to manage pod replication and availability, our networking situation becomes dynamic. To handle changes, we have a ‘load balancer’ pod that monitors the Kubernetes svc object for changes; if a pod is removed or added, the ‘load balancer’ pod will detect this change through the svc object, and then update the F5 configuration through the appliance’s web API. This way, Kubernetes transparently handles replication and failover/recovery, and the dynamic load balancer configuration lets this process remain invisible to the service or user who originated the request. Similarly, the combination of the Calico virtual network plus the F5 load balancer means that TCP connections should behave consistently for services that are running on both the traditional VM infrastructure, or that have been migrated to containers. 

With dynamic reconfiguration of the network, the replication mechanics of Kubernetes make horizontal scaling and (most) failover/recovery very straightforward. We haven’t yet reached the reactive scaling milestone, but we’ve laid the groundwork with the Kubernetes and Calico infrastructure, making one avenue to implement it straightforward:

  • Configure upper and lower bounds for service replication
  • Build a load analysis and scaling service (easy, right?)
  • If load patterns match the configured triggers in the scaling service (for example, request rate or volume above certain bounds), issue: kubectl scale --replicas=COUNT rc NAME

This would allow us fine-grained control of autoscaling at the platform level, instead of from the applications themselves - but we’ll also evaluate Horizontal Pod Autoscaling in Kubernetes, which may suit our needs without a custom service.
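
For comparison, the built-in autoscaler can be attached to a replication controller with a single command (the thresholds here are arbitrary):

$ kubectl autoscale rc NAME --min=2 --max=10 --cpu-percent=80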

Keep an eye on our GitHub account and the Skytap blog; as our solutions to problems like these mature, we hope to share what we’ve built with the open source community.

Engineering Support

A transition like our containerization project requires the engineers involved in maintaining and contributing to the platform to change their workflows and learn new methods for creating and troubleshooting services.

Because a variety of learning styles require a multi-faceted approach, we handle this in three ways: with documentation, with direct outreach to engineers (that is, brownbag sessions or coaching teams), and by offering easy-to-access, ad-hoc support.  

We continue to curate a collection of documents that provide guidance on transitioning classic services to Kubernetes, creating new services, and operating containerized services. Documentation isn’t for everyone, and sometimes it’s missing or incomplete despite our best efforts, so we also run an internal #kube-help Slack channel, where anyone can stop in for assistance or arrange a more in-depth face-to-face discussion.

We have one more powerful support tool: we automatically construct and test prod-like environments that include this Kubernetes infrastructure, which allows engineers a lot of freedom to experiment and work with Kubernetes hands-on. We explore the details of automated environment delivery in more detail in this post.

Final Thoughts

We’ve had great success with Kubernetes and containerization in general, but we’ve certainly found that integrating with an existing full-stack environment has presented many challenges. While not exactly plug-and-play from an enterprise lifecycle standpoint, the flexibility and configurability of Kubernetes still make it a very powerful tool for building our modularized service ecosystem.

We love application modernization challenges. The Skytap platform is well suited for these sorts of migration efforts – we run Skytap in Skytap, of course, which helped us tremendously in our Kubernetes integration project. If you’re planning modernization efforts of your own, connect with us, we’re happy to help.

–Shawn Falkner-Horine and Joe Burchett, Tools and Infrastructure Engineering, Skytap

Bringing Kubernetes Support to Azure Container Service

November 07 2016

Editor’s note: Today’s post is by Brendan Burns, Partner Architect, at Microsoft & Kubernetes co-founder talking about bringing Kubernetes to Azure Container Service.

With more than a thousand people coming to KubeCon in my hometown of Seattle, nearly three years after I helped start the Kubernetes project, it’s amazing and humbling to see what a small group of people and a radical idea have become after three years of hard work from a large and growing community. In July of 2014, scarcely a month after Kubernetes became publicly available, Microsoft announced its initial support for Azure. The release of Kubernetes 1.4 brought support for native Microsoft networking, load-balancer and disk integration.

Today, Microsoft announced the next step in Kubernetes on Azure: the introduction of Kubernetes as a supported orchestrator in Azure Container Service (ACS). It’s been really exciting for me to join the ACS team and help build this new addition. The integration of Kubernetes into ACS means that with a few clicks in the Azure portal, or by running a single command in the new python-based Azure command line tool, you will be able to create a fully functional Kubernetes cluster that is integrated with the rest of your Azure resources.
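
For instance, creating a cluster from the new CLI looks roughly like this (the resource names are placeholders, and the flags may have changed since the preview):

$ az acs create --orchestrator-type=kubernetes \
    --resource-group=myResourceGroup --name=myK8sCluster --generate-ssh-keys
$ az acs kubernetes get-credentials --resource-group=myResourceGroup --name=myK8sCluster
$ kubectl get nodes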

Kubernetes is available in public preview in Azure Container Service today. Community participation has always been an important part of the Kubernetes experience. Over the next few months, I hope you’ll join us and provide your feedback on the experience as we bring it to general availability.

In the spirit of community, we are also excited to announce a new open source project: ACS Engine. The goal of ACS Engine is to provide an open, community-driven location to develop and share best practices for orchestrating containers on Azure. All of our knowledge of running containers in Azure has been captured in that repository, and we look forward to improving and extending it as we move forward with the community. Going forward, the templates in ACS Engine will be the basis for clusters deployed via the ACS API, and thus community-driven improvements, features and more will have a natural path into the Azure Container Service. We’re excited to invite you to join us in improving ACS. Prior to the creation of ACS Engine, customers with unique requirements not supported by the ACS API needed to maintain variations on our templates. While these differences started small, they grew larger over time as the mainline template was improved and users also iterated on their own templates. These differences and this drift really impact users’ ability to collaborate, since their templates are all different. Without the ability to share and collaborate, it’s difficult to form a community, since every user is siloed in their own variant.

To solve this problem, the core of ACS Engine is a template processor, built in Go, that enables you to dynamically combine different pieces of configuration together to form a final template that can be used to build up your cluster. Thus, each user can mix and match the pieces to build the final container cluster that suits their needs. At the same time, each piece can be built and maintained collaboratively by the community. We’ve been beta testing this approach with some customers and the feedback we’ve gotten so far has been really positive.

Beyond services to help you run containers on Azure, I think it’s incredibly important to improve the experience of developing and deploying containerized applications to Kubernetes. To that end, I’ve been doing a bunch of work lately to build a Kubernetes extension for the really excellent, open source, Visual Studio Code. The Kubernetes extension enables you to quickly deploy JSON or YAML files you are editing onto a Kubernetes cluster. Additionally, it enables you to import existing Kubernetes objects into Code for easy editing. Finally, it enables synchronization between your running containers and the source code that you are developing for easy debugging of issues you are facing in production.

But really, a demo is worth a thousand words, so please have a look at this video:

Of course, like everything else in Kubernetes it’s released as open source, and I look forward to working on it further with the community. Thanks again, I look forward to seeing everyone at the OpenShift Gathering today, as well as at the Microsoft Azure booth during KubeCon tomorrow and Wednesday. Welcome to Seattle!

Tail Kubernetes with Stern

October 31 2016

Editor’s note: today’s post is by Antti Kupila, Software Engineer, at Wercker, about building a tool to tail multiple pods and containers on Kubernetes.

We love Kubernetes here at Wercker and build all our infrastructure on top of it. When deploying anything you need to have good visibility into what’s going on, and logs are a first view into the inner workings of your application. Good old tail -f has been around for a long time, and Kubernetes has this too, built right into kubectl.

I should say that tail is by no means the tool to use for debugging issues; instead you should feed the logs into a more persistent place, such as Elasticsearch. However, there’s still a place for tail when you need to quickly debug something or perhaps you don’t have persistent logging set up yet (such as when developing an app in Minikube).

Multiple Pods

Kubernetes has the concept of Replication Controllers which ensure that n pods are running at the same time. This allows rolling updates and redundancy. Considering they’re quite easy to set up there’s really no reason not to do so.

However, now there are multiple pods running and they all have a unique id. One issue here is that you’ll need to know the exact pod id (kubectl get pods), but that changes every time a pod is created, so you’ll need to do this every time. Another consideration is the fact that Kubernetes load balances the traffic, so you won’t know which pod the request ends up at. If you’re tailing pod A but the traffic ends up at pod B, you’ll miss what happened.

Let’s say we have a pod called service with 3 replicas. Here’s what that would look like:

$ kubectl get pods                         # get pods to find pod ids

$ kubectl log -f service-1786497219-2rbt1  # pod 1

$ kubectl log -f service-1786497219-8kfbp  # pod 2

$ kubectl log -f service-1786497219-lttxd  # pod 3

Multiple containers

We’re heavy users of gRPC for internal services and expose the gRPC endpoints over REST using gRPC Gateway. Typically we have the server and gateway living as two containers in the same pod (the same binary that sets the mode via a CLI flag). The gateway talks to the server in the same pod and both ports are exposed to Kubernetes. For internal services we can talk directly to the gRPC endpoint, while our website communicates using standard REST to the gateway.

This poses a problem though; not only do we now have multiple pods, but we also have multiple containers within each pod. When this is the case, the built-in logging of kubectl requires you to specify which container you want logs from.

If we have 3 replicas of a pod and 2 containers in the pod, you’ll need 6 kubectl log -f <pod id> <container id> invocations. We work with big monitors, but this quickly gets out of hand…

If our service pod has a server and gateway container we’d be looking at something like this:

$ kubectl get pods                                 # get pods to find pod ids

$ kubectl describe pod service-1786497219-2rbt1    # get containers in pod

$ kubectl log -f service-1786497219-2rbt1 server   # pod 1

$ kubectl log -f service-1786497219-2rbt1 gateway  # pod 1

$ kubectl log -f service-1786497219-8kfbp server   # pod 2

$ kubectl log -f service-1786497219-8kfbp gateway  # pod 2

$ kubectl log -f service-1786497219-lttxd server   # pod 3

$ kubectl log -f service-1786497219-lttxd gateway  # pod 3

Stern

To get around this we built Stern. It’s a super simple utility that allows you to specify both the pod id and the container id as regular expressions. Any match will be followed and the output is multiplexed together, prefixed with the pod and container id, and color-coded for human consumption (colors are stripped if piping to a file).

Here’s how the service example would look:

$ stern service

This will match any pod containing the word service and listen to all containers within it. If you only want to see traffic to the server container you could do stern --container server service and it’ll stream the logs of all the server containers from the 3 pods.

The output would look something like this:

$ stern service

+ service-1786497219-2rbt1 › server

+ service-1786497219-2rbt1 › gateway

+ service-1786497219-8kfbp › server

+ service-1786497219-8kfbp › gateway

+ service-1786497219-lttxd › server

+ service-1786497219-lttxd › gateway

+ service-1786497219-8kfbp server Log message from server

+ service-1786497219-2rbt1 gateway Log message from gateway

+ service-1786497219-8kfbp gateway Log message from gateway

+ service-1786497219-lttxd gateway Log message from gateway

+ service-1786497219-lttxd server Log message from server

+ service-1786497219-2rbt1 server Log message from server

In addition, if a pod is killed and recreated during a deployment Stern will stop listening to the old pod and automatically hook into the new one. There’s no more need to figure out what the id of that newly created pod is.

Configuration options

Stern was deliberately designed to be minimal so there’s not much to it. However, there are still a couple configuration options we can highlight here. They’re very similar to the ones built into kubectl so if you’re familiar with that you should feel right at home.

  • timestamps adds the timestamp to each line
  • since shows log entries since a certain time (for instance --since 15m)
  • kube-config allows you to specify another Kubernetes config. Defaults to ~/.kube/config
  • namespace allows you to limit the search to a certain namespace

Run stern --help for all options.

Examples

Tail the gateway container running inside of the envvars pod on staging

$ stern --context staging --container gateway envvars

Show auth activity from 15min ago with timestamps

$ stern -t --since 15m auth

Follow the development of some-new-feature in minikube

$ stern --context minikube some-new-feature

View pods from another namespace

$ stern --namespace kube-system kubernetes-dashboard

Get Stern

Stern is open source and available on GitHub, we’d love your contributions or ideas. If you don’t want to build from source you can also download a precompiled binary from GitHub releases.
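
If you have a working Go toolchain and GOPATH, building from source is roughly the following (check the project README for the canonical instructions):

$ go get -u github.com/wercker/stern
$ stern --help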
