Kubernetes Blog

Configuration management with Containers

April 04 2016

Editor’s note: this is our seventh post in a series of in-depth posts on what’s new in Kubernetes 1.2

A good practice when writing applications is to separate application code from configuration. We want to enable application authors to easily employ this pattern within Kubernetes. While the Secrets API allows separating information like credentials and keys from an application, no object existed in the past for ordinary, non-secret configuration. In Kubernetes 1.2, we’ve added a new API resource called ConfigMap to handle this type of configuration data.

The basics of ConfigMap

The ConfigMap API is simple conceptually. From a data perspective, the ConfigMap type is just a set of key-value pairs. Applications are configured in different ways, so we need to be flexible about how we let users store and consume configuration data. There are three ways to consume a ConfigMap in a pod:

  • Command line arguments
  • Environment variables
  • Files in a volume

These different methods lend themselves to different ways of modeling the data being consumed. To be as flexible as possible, we made ConfigMap hold both fine- and/or coarse-grained data. Further, because applications read configuration settings from both environment variables and files containing configuration data, we built ConfigMap to support either method of access. Let’s take a look at an example ConfigMap that contains both types of configuration:

apiVersion: v1

kind: ConfigMap

metadata:

  Name: example-configmap

data:

  # property-like keys

  game-properties-file-name: game.properties

  ui-properties-file-name: ui.properties

  # file-like keys

  game.properties: |

    enemies=aliens

    lives=3

    enemies.cheat=true

    enemies.cheat.level=noGoodRotten

    secret.code.passphrase=UUDDLRLRBABAS

    secret.code.allowed=true

    secret.code.lives=30

  ui.properties: |

    color.good=purple

    color.bad=yellow

    allow.textmode=true

    how.nice.to.look=fairlyNice

Users that have used Secrets will find it easy to begin using ConfigMap — they’re very similar. One major difference in these APIs is that Secret values are stored as byte arrays in order to support storing binaries like SSH keys. In JSON and YAML, byte arrays are serialized as base64 encoded strings. This means that it’s not easy to tell what the content of a Secret is from looking at the serialized form. Since ConfigMap is intended to hold only configuration information and not binaries, values are stored as strings, and thus are readable in the serialized form.

We want creating ConfigMaps to be as flexible as storing data in them. To create a ConfigMap object, we’ve added a kubectl command called kubectl create configmap that offers three different ways to specify key-value pairs:

  • Specify literal keys and value
  • Specify an individual file
  • Specify a directory to create keys for each file

These different options can be mixed, matched, and repeated within a single command:

    $ kubectl create configmap my-config \

    --from-literal=literal-key=literal-value \

    --from-file=ui.properties \
    --from=file=path/to/config/dir

Consuming ConfigMaps is simple and will also be familiar to users of Secrets. Here’s an example of a Deployment that uses the ConfigMap above to run an imaginary game server:

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

  name: configmap-example-deployment

  labels:

    name: configmap-example-deployment

spec:

  replicas: 1

  selector:

    matchLabels:

      name: configmap-example

  template:

    metadata:

      labels:

        name: configmap-example

    spec:

      containers:

      - name: game-container

        image: imaginarygame

        command: ["game-server", "--config-dir=/etc/game/cfg"]

        env:

        # consume the property-like keys in environment variables

        - name: GAME\_PROPERTIES\_NAME

          valueFrom:

            configMapKeyRef:

              name: example-configmap

              key: game-properties-file-name

        - name: UI\_PROPERTIES\_NAME

          valueFrom:

            configMapKeyRef:

              name: example-configmap

              key: ui-properties-file-name

        volumeMounts:

        - name: config-volume

          mountPath: /etc/game

      volumes:

      # consume the file-like keys of the configmap via volume plugin

      - name: config-volume

        configMap:

          name: example-configmap

          items:

          - key: ui.properties

            path: cfg/ui.properties

         - key: game.properties

           path: cfg/game.properties
      restartPolicy: Never

In the above example, the Deployment uses keys of the ConfigMap via two of the different mechanisms available. The property-like keys of the ConfigMap are used as environment variables to the single container in the Deployment template, and the file-like keys populate a volume. For more details, please see the ConfigMap docs.

We hope that these basic primitives are easy to use and look forward to seeing what people build with ConfigMaps. Thanks to the community members that provided feedback about this feature. Special thanks also to Tamer Tas who made a great contribution to the proposal and implementation of ConfigMap.

If you’re interested in Kubernetes and configuration, you’ll want to participate in:

And of course for more information about the project in general, go to www.kubernetes.io and follow us on Twitter @Kubernetesio.

Paul Morie, Senior Software Engineer, Red Hat

Using Deployment objects with Kubernetes 1.2

April 01 2016

Editor’s note: this is the seventh post in a series of in-depth posts on what’s new in Kubernetes 1.2

Kubernetes has made deploying and managing applications very straightforward, with most actions a single API or command line away, including rolling out new applications, canary testing and upgrading. So why would we need Deployments?

Deployment objects automate deploying and rolling updating applications. Compared with kubectl rolling-update, Deployment API is much faster, is declarative, is implemented server-side and has more features (for example, you can rollback to any previous revision even after the rolling update is done).

In today’s blogpost, we’ll cover how to use Deployments to:

  1. Deploy/rollout an application
  2. Update the application declaratively and progressively, without a service outage
  3. Rollback to a previous revision, if something’s wrong when you’re deploying/updating the application

Without further ado, let’s start playing around with Deployments!

Getting started

If you want to try this example, basically you’ll need 3 things:

  1. A running Kubernetes cluster : If you don’t already have one, check the Getting Started guides for a list of solutions on a range of platforms, from your laptop, to VMs on a cloud provider, to a rack of bare metal servers.
  2. Kubectl, the Kubernetes CLI : If you see a URL response after running kubectl cluster-info, you’re ready to go. Otherwise, follow the instructions to install and configure kubectl; or the instructions for hosted solutions if you have a Google Container Engine cluster.
  3. The configuration files for this demo. If you choose not to run this example yourself, that’s okay. Just watch this video to see what’s going on in each step.

Diving in

The configuration files contain a static website. First, we want to start serving its static content. From the root of the Kubernetes repository, run:

$ kubectl proxy --www=docs/user-guide/update-demo/local/ &  

Starting to serve on …

This runs a proxy on the default port 8001. You may now visit http://localhost:8001/static/ the demo website (and it should be a blank page for now). Now we want to run an app and show it on the website.

$ kubectl run update-demo   
--image=gcr.io/google\_containers/update-demo:nautilus --port=80 -l name=update-demo  

deployment “update-demo” created  

This deploys 1 replica of an app with the image “update-demo:nautilus” and you can see it visually on http://localhost:8001/static/.1

The card showing on the website represents a Kubernetes pod, with the pod’s name (ID), status, image, and labels.

Getting bigger

Now we want more copies of this app!
$ kubectl scale deployment/update-demo –replicas=4
deployment “update-demo” scaled

Updating your application

How about updating the app?

 $ kubectl edit deployment/update-demo  

 This opens up your default editor, and you can update the deployment on the fly. Find .spec.template.spec.containers[0].image and change nautilus to kitty. Save the file, and you’ll see:  

 deployment "update-demo" edited   

You’re now updating the image of this app from “update-demo:nautilus” to “update-demo:kitty”. Deployments allow you to update the app progressively, without a service outage.

After a while, you’ll find the update seems stuck. What happened?

Debugging your rollout

If you look closer, you’ll find that the pods with the new “kitty” tagged image stays pending. The Deployment automatically stops the rollout if it’s failing. Let’s look at one of the new pod to see what happened:

$ kubectl describe pod/update-demo-1326485872-a4key  

Looking at the events of this pod, you’ll notice that Kubernetes failed to pull the image because the “kitty” tag wasn’t found:

Failed to pull image “gcr.io/google_containers/update-demo:kitty”: Tag kitty not found in repository gcr.io/google_containers/update-demo

Rolling back

Ok, now we want to undo the changes and then take our time to figure out which image tag we should use.

$ kubectl rollout undo deployment/update-demo   
deployment "update-demo" rolled back  

Everything’s back to normal, phew!

To learn more about rollback, visit rolling back a Deployment.

Updating your application (for real)

After a while, we finally figure that the right image tag is “kitten”, instead of “kitty”. Now change .spec.template.spec.containers[0].image tag from “nautilus“ to “kitten“.

$ kubectl edit deployment/update-demo  
deployment "update-demo" edited  

Now you see there are 4 cute kittens on the demo website, which means we’ve updated the app successfully! If you want to know the magic behind this, look closer at the Deployment:

$ kubectl describe deployment/update-demo  

From the events section, you’ll find that the Deployment is managing another resource called Replica Set, each controls the number of replicas of a different pod template. The Deployment enables progressive rollout by scaling up and down Replica Sets of new and old pod templates.

Conclusion

Now, you’ve learned the basic use of Deployment objects:

  1. Deploy an app with a Deployment, using kubectl run
  2. Updating the app by updating the Deployment with kubectl edit
  3. Rolling back to a previously deployed app with kubectl rollout undo But there’s so much more in Deployment that this article didn’t cover! To discover more, continue reading Deployment’s introduction.

Note: In Kubernetes 1.2, Deployment (beta release) is now feature-complete and enabled by default. For those of you who have tried Deployment in Kubernetes 1.1, please delete all Deployment 1.1 resources (including the Replication Controllers and Pods they manage) before trying out Deployments in 1.2. This is necessary because we made some non-backward-compatible changes to the API.

If you’re interested in Kubernetes and configuration, you’ll want to participate in:

Janet Kuo, Software Engineer, Google

1 “kubectl run” outputs the type and name of the resource(s) it creates. In 1.2, it now creates a deployment resource. You can use that in subsequent commands, such as “kubectl get deployment “, or “kubectl expose deployment “. If you want to write a script to do that automatically, in a forward-compatible manner, use “-o name” flag with “kubectl run”, and it will generate short output “deployments/”, which can also be used on subsequent command lines. The “–generator” flag can be used with “kubectl run” to generate other types of resources, for example, set it to “run/v1” to create a Replication Controller, which was the default in 1.1 and 1.0, and to “run-pod/v1” to create a Pod, such as for –restart=Never pods.

Kubernetes 1.2 and simplifying advanced networking with Ingress

March 31 2016

Editor’s note: This is the sixth post in a series of in-depth posts on what’s new in Kubernetes 1.2.
Ingress is currently in beta and under active development.

In Kubernetes, Services and Pods have IPs only routable by the cluster network, by default. All traffic that ends up at an edge router is either dropped or forwarded elsewhere. In Kubernetes 1.2, we’ve made improvements to the Ingress object, to simplify allowing inbound connections to reach the cluster services. It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, offer name based virtual hosting and lots more.

Ingress controllers

Today, with containers or VMs, configuring a web server or load balancer is harder than it should be. Most web server configuration files are very similar. There are some applications that have weird little quirks that tend to throw a wrench in things, but for the most part, you can apply the same logic to them and achieve a desired result. In Kubernetes 1.2, the Ingress resource embodies this idea, and an Ingress controller is meant to handle all the quirks associated with a specific “class” of Ingress (be it a single instance of a load balancer, or a more complicated setup of frontends that provide GSLB, CDN, DDoS protection etc). An Ingress Controller is a daemon, deployed as a Kubernetes Pod, that watches the ApiServer’s /ingresses endpoint for updates to the Ingress resource. Its job is to satisfy requests for ingress.

Your Kubernetes cluster must have exactly one Ingress controller that supports TLS for the following example to work. If you’re on a cloud-provider, first check the “kube-system” namespace for an Ingress controller RC. If there isn’t one, you can deploy the nginx controller, or write your own in < 100 lines of code.

Please take a minute to look over the known limitations of existing controllers (gce, nginx).

TLS termination and HTTP load-balancing

Since the Ingress spans Services, it’s particularly suited for load balancing and centralized security configuration. If you’re familiar with the go programming language, Ingress is like net/http’s “Server” for your entire cluster. The following example shows you how to configure TLS termination. Load balancing is not optional when dealing with ingress traffic, so simply creating the object will configure a load balancer.

First create a test Service. We’ll run a simple echo server for this example so you know exactly what’s going on. The source is here.

$ kubectl run echoheaders   
--image=gcr.io/google\_containers/echoserver:1.3 --port=8080  
$ kubectl expose deployment echoheaders --target-port=8080   
--type=NodePort  

If you’re on a cloud-provider, make sure you can reach the Service from outside the cluster through its node port.

$ NODE_IP=$(kubectl get node `kubectl get po -l run=echoheaders 
--template ''` --template
'')
$ NODE_PORT=$(kubectl get svc echoheaders --template '')
$ curl $NODE_IP:$NODE_PORT

This is a sanity check that things are working as expected. If the last step hangs, you might need a firewall rule.

Now lets create our TLS secret:

$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout   

/tmp/tls.key -out /tmp/tls.crt -subj "/CN=echoheaders/O=echoheaders"

$ echo "  
apiVersion: v1  
kind: Secret  
metadata:
  name: tls  
data:  
  tls.crt: `base64 -w 0 /tmp/tls.crt`  
  tls.key: `base64 -w 0 /tmp/tls.key`  
" | kubectl create -f   

And the Ingress:

$ echo "

apiVersion: extensions/v1beta1

kind: Ingress

metadata:

  name: test

spec:

  tls:

  - secretName: tls
  backend:  
    serviceName: echoheaders  
    servicePort: 8080  
" | kubectl create -f -  

You should get a load balanced IP soon:

$ kubectl get ing   
NAME      RULE      BACKEND            ADDRESS         AGE  
test      -         echoheaders:8080   130.X.X.X     4m  

And if you wait till the Ingress controller marks your backends as healthy, you should see requests to that IP on :80 getting redirected to :443 and terminated using the given TLS certificates.

$ curl 130.X.X.X  
\<html\>  
\<head\>\<title\>301 Moved Permanently\</title\>\</head\>\<body bgcolor="white"\>\<center\>\<h1\>301 Moved Permanently\</h1\>\</center\>  
$ curl https://130.X.X.X -kCLIENT VALUES:client\_address=10.48.0.1command=GETreal path=/  


$ curl 130.X.X.X -Lk

CLIENT VALUES:client\_address=10.48.0.1command=GETreal path=/

Future work

You can read more about the Ingress API or controllers by following the links. The Ingress is still in beta, and we would love your input to grow it. You can contribute by writing controllers or evolving the API. All things related to the meaning of the word “ingress” are in scope, this includes DNS, different TLS modes, SNI, load balancing at layer 4, content caching, more algorithms, better health checks; the list goes on.

There are many ways to participate. If you’re particularly interested in Kubernetes and networking, you’ll be interested in:

And of course for more information about the project in general, go towww.kubernetes.io

Prashanth Balasubramanian, Software Engineer

Using Spark and Zeppelin to process big data on Kubernetes 1.2

March 30 2016

Editor’s note: this is the fifth post in a series of in-depth posts on what’s new in Kubernetes 1.2 

With big data usage growing exponentially, many Kubernetes customers have expressed interest in running Apache Spark on their Kubernetes clusters to take advantage of the portability and flexibility of containers. Fortunately, with Kubernetes 1.2, you can now have a platform that runs Spark and Zeppelin, and your other applications side-by-side.

Why Zeppelin? 

Apache Zeppelin is a web-based notebook that enables interactive data analytics. As one of its backends, Zeppelin connects to Spark. Zeppelin allows the user to interact with the Spark cluster in a simple way, without having to deal with a command-line interpreter or a Scala compiler.

Why Kubernetes? 

There are many ways to run Spark outside of Kubernetes:

  • You can run it using dedicated resources, in Standalone mode 
  • You can run it on a YARN cluster, co-resident with Hadoop and HDFS 
  • You can run it on a Mesos cluster alongside other Mesos applications 

So why would you run Spark on Kubernetes?

  • A single, unified interface to your cluster: Kubernetes can manage a broad range of workloads; no need to deal with YARN/HDFS for data processing and a separate container orchestrator for your other applications. 
  • Increased server utilization: share nodes between Spark and cloud-native applications. For example, you may have a streaming application running to feed a streaming Spark pipeline, or a nginx pod to serve web traffic — no need to statically partition nodes. 
  • Isolation between workloads: Kubernetes’ Quality of Service mechanism allows you to safely co-schedule batch workloads like Spark on the same nodes as latency-sensitive servers. 

Launch Spark 

For this demo, we’ll be using Google Container Engine (GKE), but this should work anywhere you have installed a Kubernetes cluster. First, create a Container Engine cluster with storage-full scopes. These Google Cloud Platform scopes will allow the cluster to write to a private Google Cloud Storage Bucket (we’ll get to why you need that later): 

$ gcloud container clusters create spark --scopes storage-full
--machine-type n1-standard-4

Note: We’re using n1-standard-4 (which are larger than the default node size) to demonstrate some features of Horizontal Pod Autoscaling. However, Spark works just fine on the default node size of n1-standard-1.

After the cluster’s created, you’re ready to launch Spark on Kubernetes using the config files in the Kubernetes GitHub repo:

$ git clone https://github.com/kubernetes/kubernetes.git
$ kubectl create -f kubernetes/examples/spark

‘kubernetes/examples/spark’ is a directory, so this command tells kubectl to create all of the Kubernetes objects defined in all of the YAML files in that directory. You don’t have to clone the entire repository, but it makes the steps of this demo just a little easier.

The pods (especially Apache Zeppelin) are somewhat large, so may take some time for Docker to pull the images. Once everything is running, you should see something similar to the following:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
spark-master-controller-v4v4y 1/1 Running 0 21h
spark-worker-controller-7phix 1/1 Running 0 21h
spark-worker-controller-hq9l9 1/1 Running 0 21h
spark-worker-controller-vwei5 1/1 Running 0 21h
zeppelin-controller-t1njl 1/1 Running 0 21h

You can see that Kubernetes is running one instance of Zeppelin, one Spark master and three Spark workers.

Set up the Secure Proxy to Zeppelin 

Next you’ll set up a secure proxy from your local machine to Zeppelin, so you can access the Zeppelin instance running in the cluster from your machine. (Note: You’ll need to change this command to the actual Zeppelin pod that was created on your cluster.)

$ kubectl port-forward zeppelin-controller-t1njl 8080:8080

This establishes a secure link to the Kubernetes cluster and pod (zeppelin-controller-t1njl) and then forwards the port in question (8080) to local port 8080, which will allow you to use Zeppelin safely.

Now that I have Zeppelin up and running, what do I do with it? 

For our example, we’re going to show you how to build a simple movie recommendation model. This is based on the code on the Spark website, modified slightly to make it interesting for Kubernetes. 

Now that the secure proxy is up, visit http://localhost:8080/. You should see an intro page like this:

Click “Import note,” give it an arbitrary name (e.g. “Movies”), and click “Add from URL.” For a URL, enter:

https://gist.githubusercontent.com/zmerlynn/875fed0f587d12b08ec9/raw/6
eac83e99caf712482a4937800b17bbd2e7b33c4/movies.json

Then click “Import Note.” This will give you a ready-made Zeppelin note for this demo. You should now have a “Movies” notebook (or whatever you named it). If you click that note, you should see a screen similar to this:

You can now click the Play button, near the top-right of the PySpark code block, and you’ll create a new, in-memory movie recommendation model! In the Spark application model, Zeppelin acts as a Spark Driver Program, interacting with the Spark cluster master to get its work done. In this case, the driver program that’s running in the Zeppelin pod fetches the data and sends it to the Spark master, which farms it out to the workers, which crunch out a movie recommendation model using the code from the driver. With a larger data set in Google Cloud Storage (GCS), it would be easy to pull the data from GCS as well. In the next section, we’ll talk about how to save your data to GCS.

Working with Google Cloud Storage (Optional) 

For this demo, we’ll be using Google Cloud Storage, which will let us store our model data beyond the life of a single pod. Spark for Kubernetes is built with the Google Cloud Storage connector built-in. As long as you can access your data from a virtual machine in the Google Container Engine project where your Kubernetes nodes are running, you can access your data with the GCS connector on the Spark image.

If you want, you can change the variables at the top of the note so that the example will actually save and restore a model for the movie recommendation engine — just point those variables at a GCS bucket that you have access to. If you want to create a GCS bucket, you can do something like this on the command line:

$ gsutil mb gs://my-spark-models

You’ll need to change this URI to something that is unique for you. This will create a bucket that you can use in the example above.

Note : Computing the model and saving it is much slower than computing the model and throwing it away. This is expected. However, if you plan to reuse a model, it’s faster to compute the model and save it and then restore it each time you want to use it, rather than throw away and recompute the model each time.

Using Horizontal Pod Autoscaling with Spark (Optional) 

Spark is somewhat elastic to workers coming and going, which means we have an opportunity: we can use use Kubernetes Horizontal Pod Autoscaling to scale-out the Spark worker pool automatically, setting a target CPU threshold for the workers and a minimum/maximum pool size. This obviates the need for having to configure the number of worker replicas manually.

Create the Autoscaler like this (note: if you didn’t change the machine type for the cluster, you probably want to limit the –max to something smaller): 

$ kubectl autoscale --min=1 --cpu-percent=80 --max=10 \
  rc/spark-worker-controller

To see the full effect of autoscaling, wait for the replication controller to settle back to one replica. Use ‘kubectl get rc’ and wait for the “replicas” column on spark-worker-controller to fall back to 1.

The workload we ran before ran too quickly to be terribly interesting for HPA. To change the workload to actually run long enough to see autoscaling become active, change the “rank = 100” line in the code to “rank = 200.” After you hit play, the Spark worker pool should rapidly increase to 20 pods. It will take up to 5 minutes after the job completes before the worker pool falls back down to one replica.

Conclusion

In this article, we showed you how to run Spark and Zeppelin on Kubernetes, as well as how to use Google Cloud Storage to store your Spark model and how to use Horizontal Pod Autoscaling to dynamically size your Spark worker pool.

This is the first in a series of articles we’ll be publishing on how to run big data frameworks on Kubernetes — so stay tuned!

Please join our community and help us build the future of Kubernetes! There are many ways to participate. If you’re particularly interested in Kubernetes and big data, you’ll be interested in:

 – Zach Loafman, Software Engineer, Google

Building highly available applications using Kubernetes new multi-zone clusters (a.k.a. 'Ubernetes Lite')

March 29 2016

Editor’s note: this is the third post in a series of in-depth posts on what’s new in Kubernetes 1.2

Introduction 

One of the most frequently-requested features for Kubernetes is the ability to run applications across multiple zones. And with good reason — developers need to deploy applications across multiple domains, to improve availability in thxe advent of a single zone outage.

Kubernetes 1.2, released two weeks ago, adds support for running a single cluster across multiple failure zones (GCP calls them simply “zones,” Amazon calls them “availability zones,” here we’ll refer to them as “zones”). This is the first step in a broader effort to allow federating multiple Kubernetes clusters together (sometimes referred to by the affectionate nickname “Ubernetes”). This initial version (referred to as “Ubernetes Lite”) offers improved application availability by spreading applications across multiple zones within a single cloud provider.

Multi-zone clusters are deliberately simple, and by design, very easy to use — no Kubernetes API changes were required, and no application changes either. You simply deploy your existing Kubernetes application into a new-style multi-zone cluster, and your application automatically becomes resilient to zone failures.

Now into some details . . .  

Ubernetes Lite works by leveraging the Kubernetes platform’s extensibility through labels. Today, when nodes are started, labels are added to every node in the system. With Ubernetes Lite, the system has been extended to also add information about the zone it’s being run in. With that, the scheduler can make intelligent decisions about placing application instances.

Specifically, the scheduler already spreads pods to minimize the impact of any single node failure. With Ubernetes Lite, via SelectorSpreadPriority, the scheduler will make a best-effort placement to spread across zones as well. We should note, if the zones in your cluster are heterogenous (e.g., different numbers of nodes or different types of nodes), you may not be able to achieve even spreading of your pods across zones. If desired, you can use homogenous zones (same number and types of nodes) to reduce the probability of unequal spreading.

This improved labeling also applies to storage. When persistent volumes are created, the PersistentVolumeLabel admission controller automatically adds zone labels to them. The scheduler (via the VolumeZonePredicate predicate) will then ensure that pods that claim a given volume are only placed into the same zone as that volume, as volumes cannot be attached across zones.

Walkthrough 

We’re now going to walk through setting up and using a multi-zone cluster on both Google Compute Engine (GCE) and Amazon EC2 using the default kube-up script that ships with Kubernetes. Though we highlight GCE and EC2, this functionality is available in any Kubernetes 1.2 deployment where you can make changes during cluster setup. This functionality will also be available in Google Container Engine (GKE) shortly.

Bringing up your cluster 

Creating a multi-zone deployment for Kubernetes is the same as for a single-zone cluster, but you’ll need to pass an environment variable ("MULTIZONE”) to tell the cluster to manage multiple zones. We’ll start by creating a multi-zone-aware cluster on GCE and/or EC2.

GCE:

curl -sS https://get.k8s.io | MULTIZONE=true KUBERNETES_PROVIDER=gce
KUBE_GCE_ZONE=us-central1-a NUM_NODES=3 bash

EC2:

curl -sS https://get.k8s.io | MULTIZONE=true KUBERNETES_PROVIDER=aws
KUBE_AWS_ZONE=us-west-2a NUM_NODES=3 bash

At the end of this command, you will have brought up a cluster that is ready to manage nodes running in multiple zones. You’ll also have brought up NUM_NODES nodes and the cluster’s control plane (i.e., the Kubernetes master), all in the zone specified by KUBE_{GCE,AWS}_ZONE. In a future iteration of Ubernetes Lite, we’ll support a HA control plane, where the master components are replicated across zones. Until then, the master will become unavailable if the zone where it is running fails. However, containers that are running in all zones will continue to run and be restarted by Kubelet if they fail, thus the application itself will tolerate such a zone failure.

Nodes are labeled 

To see the additional metadata added to the node, simply view all the labels for your cluster (the example here is on GCE):

$ kubectl get nodes --show-labels

NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 6m        
beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-master
kubernetes-minion-87j9 Ready 6m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 6m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 6m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-a12q

The scheduler will use the labels attached to each of the nodes (failure-domain.beta.kubernetes.io/region for the region, and failure-domain.beta.kubernetes.io/zone for the zone) in its scheduling decisions.

Add more nodes in a second zone 

Let’s add another set of nodes to the existing cluster, but running in a different zone (us-central1-b for GCE, us-west-2b for EC2). We run kube-up again, but by specifying KUBE_USE_EXISTING_MASTER=1 kube-up will not create a new master, but will reuse one that was previously created.

GCE:

KUBE_USE_EXISTING_MASTER=true MULTIZONE=true KUBERNETES_PROVIDER=gce
KUBE_GCE_ZONE=us-central1-b NUM_NODES=3 kubernetes/cluster/kube-up.sh

On EC2, we also need to specify the network CIDR for the additional subnet, along with the master internal IP address:

KUBE_USE_EXISTING_MASTER=true MULTIZONE=true KUBERNETES_PROVIDER=aws
KUBE_AWS_ZONE=us-west-2b NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.1.0/24
MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh

View the nodes again; 3 more nodes will have been launched and labelled (the example here is on GCE):

$ kubectl get nodes --show-labels

NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 16m       
beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-master
kubernetes-minion-281d Ready 2m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kub
ernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-87j9 Ready 16m       
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 16m       
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 17m       
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-a12q
kubernetes-minion-pp2f Ready 2m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kub
ernetes.io/hostname=kubernetes-minion-pp2f
kubernetes-minion-wf8i Ready 2m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kub
ernetes.io/hostname=kubernetes-minion-wf8i

Let’s add one more zone:

GCE:

KUBE_USE_EXISTING_MASTER=true MULTIZONE=true KUBERNETES_PROVIDER=gce
KUBE_GCE_ZONE=us-central1-f NUM_NODES=3 kubernetes/cluster/kube-up.sh

EC2:

KUBE_USE_EXISTING_MASTER=true MULTIZONE=true KUBERNETES_PROVIDER=aws
KUBE_AWS_ZONE=us-west-2c NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.2.0/24
MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh

Verify that you now have nodes in 3 zones:

kubectl get nodes --show-labels

Highly available apps, here we come.

Deploying a multi-zone application 

Create the guestbook-go example, which includes a ReplicationController of size 3, running a simple web app. Download all the files from here, and execute the following command (the command assumes you downloaded them to a directory named “guestbook-go”:

kubectl create -f guestbook-go/

You’re done! Your application is now spread across all 3 zones. Prove it to yourself with the following commands:

$ kubectl describe pod -l app=guestbook | grep Node
Node: kubernetes-minion-9vlv/10.240.0.5
Node: kubernetes-minion-281d/10.240.0.8
Node: kubernetes-minion-olsh/10.240.0.11

$ kubectl get node kubernetes-minion-9vlv kubernetes-minion-281d 
kubernetes-minion-olsh --show-labels
NAME STATUS AGE LABELS
kubernetes-minion-9vlv Ready 34m       
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kub
ernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-281d Ready 20m       
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kub
ernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-olsh Ready 3m        
beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.
io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-f,kub
ernetes.io/hostname=kubernetes-minion-olsh

Further, load-balancers automatically span all zones in a cluster; the guestbook-go example includes an example load-balanced service:

$ kubectl describe service guestbook | grep LoadBalancer.Ingress
LoadBalancer Ingress: 130.211.126.21

ip=130.211.126.21

$ curl -s http://${ip}:3000/env | grep HOSTNAME
  "HOSTNAME": "guestbook-44sep",

$ (for i in `seq 20`; do curl -s http://${ip}:3000/env | grep HOSTNAME; done)  
| sort | uniq
  "HOSTNAME": "guestbook-44sep",
  "HOSTNAME": "guestbook-hum5n",
  "HOSTNAME": "guestbook-ppm40",

The load balancer correctly targets all the pods, even though they’re in multiple zones.

Shutting down the cluster 

When you’re done, clean up:

GCE:

KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true 
KUBE_GCE_ZONE=us-central1-f kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true 
KUBE_GCE_ZONE=us-central1-b kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-a 
kubernetes/cluster/kube-down.sh

EC2:

KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2c 
kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2b 
kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2a 
kubernetes/cluster/kube-down.sh

Conclusion 

A core philosophy for Kubernetes is to abstract away the complexity of running highly available, distributed applications. As you can see here, other than a small amount of work at cluster spin-up time, all the complexity of launching application instances across multiple failure domains requires no additional work by application developers, as it should be. And we’re just getting started!

Please join our community and help us build the future of Kubernetes! There are many ways to participate. If you’re particularly interested in scalability, you’ll be interested in:

And of course for more information about the project in general, go to www.kubernetes.io

 – Quinton Hoole, Staff Software Engineer, Google, and Justin Santa Barbara

AppFormix: Helping Enterprises Operationalize Kubernetes

March 29 2016

Today’s guest post is written Sumeet Singh, founder and CEO of AppFormix, a cloud infrastructure performance optimization service helping enterprise operators streamline their cloud operations on any OpenStack or Kubernetes cloud.

If you run clouds for a living, you’re well aware that the tools we’ve used since the client/server era for monitoring, analytics and optimization just don’t cut it when applied to the agile, dynamic and rapidly changing world of modern cloud infrastructure.

And, if you’re an operator of enterprise clouds, you know that implementing containers and container cluster management is all about giving your application developers a more agile, responsive and efficient cloud infrastructure. Applications are being rewritten and new ones developed – not for legacy environments where relatively static workloads are the norm, but for dynamic, scalable cloud environments. The dynamic nature of cloud native applications coupled with the shift to continuous deployment means that the demands placed by the applications on the infrastructure are constantly changing.

This shift necessitates infrastructure transparency and real-time monitoring and analytics. Without these key pieces, neither applications nor their underlying plumbing can deliver the low-latency user experience end users have come to expect.
  
AppFormix Architectural Review
From an operational standpoint, it is necessary to understand how applications are consuming infrastructure resources in order to maximize ROI and guarantee SLAs. AppFormix software empowers operators and developers to monitor, visualize, and control how physical resources are utilized by cloud workloads. 

At the center of the software, the AppFormix Data Platform provides a distributed analysis engine that performs configurable, real-time evaluation of in-depth, high-resolution metrics. On each host, the resource-efficient AppFormix Agent collects and evaluates multi-layer metrics from the hardware, virtualization layer, and up to the application. Intelligent agents offer sub-second response times that make it possible to detect and solve problems before they start to impact applications and users. The raw data is associated with the elements that comprise a cloud-native environment: applications, virtual machines, containers, hosts. The AppFormix Agent then publishes metrics and events to a Data Manager that stores and forwards the data to Analytics modules. Events are based on predefined or dynamic conditions set by users or infrastructure operators to make sure that SLAs and policies are being met.

Figure 1: Roll-up summary view of the Kubernetes cluster. Operators and Users can define their SLA policies and AppFormix provides with a real-time view of the health of all elements in the Kubernetes cluster. 
Figure 2: Real-Time visualization of telemetry from a Kubernetes node provides a quick overview of resource utilization on the host as well as resources consumed by the pods and containers. The user defined Labels make is easy to capture namespaces, and other metadata.

Additional subsystems are the Policy Controller and Analytics. The Policy Controller manages policies for resource monitoring, analysis, and control. It also provides role-based access control. The Analytics modules analyze metrics and events produced by Data Platform, enabling correlation across multiple elements to provide higher-level information to operators and developers. The Analytics modules may also configure policies in Policy Controller in response to conditions in the infrastructure.

AppFormix organizes elements of cloud infrastructure around hosts and instances (either containers or virtual machines), and logical groups of such elements. AppFormix integrates with cloud platforms using Adapter modules that discover the physical and virtual elements in the environment and configure those elements into the Policy Controller.

Integrating AppFormix with Kubernetes
Enterprises often run many environments located on- or off-prem, as well as running different compute technologies (VMs, containers, bare metal). The analytics platform we’ve developed at AppFormix gives Kubernetes users a single pane of glass from which to monitor and manage container clusters in private and hybrid environments.

The AppFormix Kubernetes Adapter leverages the REST-based APIs of Kubernetes to discover nodes, pods, containers, services, and replication controllers. With the relational information about each element, Kubernetes Adapter is able to represent all of these elements in our system. A pod is a group of containers. A service and a replication controller are both different types of pod groups. In addition, using the watch endpoint, Kubernetes Adapter stays aware of changes to the environment.

DevOps in the Enterprise with AppFormix
With AppFormix, developers and operators can work collaboratively to optimize applications and infrastructure. Users can access a self-service IT experience that delivers visibility into CPU, memory, storage, and network consumption by each layer of the stack: physical hardware, platform, and application software. 

  • Real-time multi-layer performance metrics - In real-time, developers can view multi-layer metrics that show container resource consumption in context of the physical node on which it executes. With this context, developers can determine if application performance is limited by the physical infrastructure, due to contention or resource exhaustion, or by application design.  
  • Proactive resource control - AppFormix Health Analytics provides policy-based actions in response to conditions in the cluster. For example, when resource consumption exceeds threshold on a worker node, Health Analytics can remove the node from the scheduling pool by invoking Kubernetes REST APIs. This dynamic control is driven by real-time monitoring at each node.
  • Capacity planning - Kubernetes will schedule workloads, but operators need to understand how the resources are being utilized. What resources have the most demand? How is demand trending over time? Operators can generate reports that provide necessary data for capacity planning.

As you can see, we’re working hard to give Kubernetes users a useful, performant toolset for both OpenStack and Kubernetes environments that allows operators to deliver self-service IT to their application developers. We’re excited to be partner contributing to the Kubernetes ecosystem and community.

– Sumeet Singh, Founder and CEO, AppFormix

How container metadata changes your point of view

March 28 2016

Today’s guest post is brought to you by Apurva Davé, VP of Marketing at Sysdig, who’ll discuss using Kubernetes metadata & Sysdig to understand what’s going on in your Kubernetes cluster. 

Sure, metadata is a fancy word. It actually means “data that describes other data.” While that definition isn’t all that helpful, it turns out metadata itself is especially helpful in container environments. When you have any complex system, the availability of metadata helps you sort and process the variety of data coming out of that system, so that you can get to the heart of an issue with less headache.

In a Kubernetes environment, metadata can be a crucial tool for organizing and understanding the way containers are orchestrated across your many services, machines, availability zones or (in the future) multiple clouds. This metadata can also be consumed by other services running on top of your Kubernetes system and can help you manage your applications.

We’ll take a look at some examples of this below, but first…

###
A quick intro to Kubernetes metadata  Kubernetes metadata is abundant in the form of labels and annotations. Labels are designed to be identifying metadata for your infrastructure, whereas annotations are designed to be non-identifying. For both, they’re simply generic key:value pairs that look like this:

"labels": {
  "key1" : "value1",
  "key2" : "value2"
}

Labels are not designed to be unique; you can expect any number of objects in your environment to carry the same label, and you can expect that an object could have many labels.

What are some examples of labels you might use? Here are just a few. WARNING: Once you start, you might find more than a few ways to use this functionality!

  • Environment: Dev, Prod, Test, UAT 
  • Customer: Cust A, Cust B, Cust C 
  • Tier: Frontend, Backend 
  • App: Cache, Web, Database, Auth 

In addition to custom labels you might define, Kubernetes also automatically applies labels to your system with useful metadata. Default labels supply key identifying information about your entire Kubernetes hierarchy: Pods, Services, Replication Controllers,and Namespaces.

Putting your metadata to work 

Once you spend a little time with Kubernetes, you’ll see that labels have one particularly powerful application that makes them essential:

Kubernetes labels allows you to easily move between a “physical” view of your hosts and containers, and a “logical” view of your applications and micro-services. 

At its core, a platform like Kubernetes is designed to orchestrate the optimal use of underlying physical resources. This is a powerful way to consume private or public cloud resources very efficiently, and sometimes you need to visualize those physical resources. In reality, however, most of the time you care about the performance of the service first and foremost.

But in a Kubernetes world, achieving that high utilization means a service’s containers may be scattered all over the place! So how do you actually measure the service’s performance? That’s where the metadata comes in. With Kubernetes metadata, you can create a deep understanding of your service’s performance, regardless of where the underlying containers are physically located.

Paint me a picture 

Let’s look at a quick example to make this more concrete: monitoring your application. Let’s work with a small, 3 node deployment running on GKE. For visualizing the environment we’ll use Sysdig Cloud. Here’s a list of the the nodes — note the “gke” prepended to the name of each host. We see some basic performance details like CPU, memory and network.

Each of these hosts has a number of containers running on it. Drilling down on the hosts, we see the containers associated with each:

Simply scanning this list of containers on a single host, I don’t see much organization to the responsibilities of these objects. For example, some of these containers run Kubernetes services (like kube-ui) and we presume others have to do with the application running (like javaapp.x).

Now let’s use some of the metadata provided by Kubernetes to take an application-centric view of the system. Let’s start by creating a hierarchy of components based on labels, in this order:

Kubernetes namespace -> replication controller -> pod -> container

This aggregates containers at corresponding levels based on the above labels. In the app UI below, this aggregation and hierarchy are shown in the grey “grouping” bar above the data about our hosts. As you can see, we have a “prod” namespace with a group of services (replication controllers) below it. Each of those replication controllers can then consist of multiple pods, which are in turn made up of containers.

In addition to organizing containers via labels, this view also aggregates metrics across relevant containers, giving a singular view into the performance of a namespace or replication controller.

In other words, with this aggregated view based on metadata, you can now start by monitoring and troubleshooting services, and drill into hosts and containers only if needed. 

Let’s do one more thing with this environment — let’s use the metadata to create a visual representation of services and the topology of their communications. Here you see our containers organized by services, but also a map-like view that shows you how these services relate to each other.

The boxes represent services that are aggregates of containers (the number in the upper right of each box tells you how many containers), and the lines represent communications between services and their latencies.

This kind of view provides yet another logical, instead of physical, view of how these application components are working together. From here I can understand service performance, relationships and underlying resource consumption (CPU in this example).

Metadata: love it, use it 

This is a pretty quick tour of metadata, but I hope it inspires you to spend a little time thinking about the relevance to your own system and how you could leverage it. Here we built a pretty simple example — apps and services — but imagine collecting metadata across your apps, environments, software components and cloud providers. You could quickly assess performance differences across any slice of this infrastructure effectively, all while Kubernetes is efficiently scheduling resource usage.

Get started with metadata for visualizing these resources today, and in a followup post we’ll talk about the power of adaptive alerting based on metadata.

– Apurva Davé is a closet Kubernetes fanatic, loves data, and oh yeah is also the VP of Marketing at Sysdig.

@Kubernetesio View on Github #kubernetes-users Stack Overflow Download Kubernetes