Kubernetes Blog

The Bet on Kubernetes, a Red Hat Perspective

July 21 2016

Editor’s note: Today’s guest post is from a Kubernetes contributor Clayton Coleman, Architect on OpenShift at Red Hat, sharing their adoption of the project from its beginnings.

Two years ago, Red Hat made a big bet on Kubernetes. We bet on a simple idea: that an open source community is the best place to build the future of application orchestration, and that only an open source community could successfully integrate the diverse range of capabilities necessary to succeed. As a Red Hatter, that idea is not far-fetched - we’ve seen it successfully applied in many communities, but we’ve also seen it fail, especially when a broad reach is not supported by solid foundations. On the one year anniversary of Kubernetes 1.0, two years after the first open-source commit to the Kubernetes project, it’s worth asking the question:

Was Kubernetes the right bet?

The success of software is measured by the successes of its users - whether that software enables for them new opportunities or efficiencies. In that regard, Kubernetes has succeeded beyond our wildest dreams. We know of hundreds of real production deployments of Kubernetes, in the enterprise through Red Hat’s multi-tenant enabled OpenShift distribution, on Google Container Engine (GKE), in heavily customized versions run by some of the world’s largest software companies, and through the education, entertainment, startup, and do-it-yourself communities. Those deployers report improved time to delivery, standardized application lifecycles, improved resource utilization, and more resilient and robust applications. And that’s just from customers or contributors to the community - I would not be surprised if there were now thousands of installations of Kubernetes managing tens of thousands of real applications out in the wild.

I believe that reach to be a validation of the vision underlying Kubernetes: to build a platform for all applications by providing tools for each of the core patterns in distributed computing. Those patterns:

  • simple replicated web software
  • distributed load balancing and service discovery
  • immutable images run in containers
  • co-location of related software into pods
  • simplified consumption of network attached storage
  • flexible and powerful resource scheduling
  • running batch and scheduled jobs alongside service workloads
  • managing and maintaining clustered software like databases and message queues

Allow developers and operators to move to the next scale of abstraction, just like they have enabled Google and others in the tech ecosystem to scale to datacenter computers and beyond. From Kubernetes 1.0 to 1.3 we have continually improved the power and flexibility of the platform while ALSO improving performance, scalability, reliability, and usability. The explosion of integrations and tools that run on top of Kubernetes further validates core architectural decisions to be composable, to expose open and flexible APIs, and to deliberately limit the core platform and encourage extension.

Today Kubernetes has one of the largest and most vibrant communities in the open source ecosystem, with almost a thousand contributors, one of the highest human-generated commit rates of any single-repository project on GitHub, over a thousand projects based around Kubernetes, and correspondingly active Stack Overflow and Slack channels. Red Hat is proud to be part of this ecosystem as the largest contributor to Kubernetes after Google, and every day more companies and individuals join us. The idea of Kubernetes found fertile ground, and you, the community, provided the excitement and commitment that made it grow.

So, did we bet correctly? For all the reasons above, and hundreds more: Yes.

What’s next?

Happy as we are with the success of Kubernetes, this is no time to rest! While there are many more features and improvements we want to build into Kubernetes, I think there is a general consensus that we want to focus on the only long term goal that matters - a healthy, successful, and thriving technical community around Kubernetes. As John F. Kennedy probably said: 

> Ask not what your community can do for you, but what you can do for your community

In a recent post to the kubernetes-dev list, Brian Grant laid out a great set of near term goals - goals that help grow the community, refine how we execute, and enable future expansion. In each of the Kubernetes Special Interest Groups we are trying to build sustainable teams that can execute across companies and communities, and we are actively working to ensure each of these SIGs is able to contribute, coordinate, and deliver across a diverse range of interests under one vision for the project.

Of special interest to us is the story of extension - how the core of Kubernetes can become the beating heart of the datacenter operating system, and enable even more patterns for application management to build on top of Kubernetes, not just into it. Work done in the 1.2 and 1.3 releases around third party APIs, API discovery, flexible scheduler policy, external authorization and authentication (beyond those built into Kubernetes) is just the start. When someone has a need, we want them to easily find a solution, and we also want it to be easy for others to consume and contribute to that solution. Likewise, the best way to prove ideas is to prototype them against real needs and to iterate against real problems, which should be easy and natural.

By Kubernetes’ second birthday, I hope to reflect back on a long year of refinement, user success, and community participation. It has been a privilege and an honor to contribute to Kubernetes, and it still feels like we are just getting started. Thank you, and I hope you come along for the ride!

– Clayton Coleman, Contributor and Architect on Kubernetes and OpenShift at Red Hat. Follow him on Twitter and GitHub: @smarterclayton

Happy Birthday Kubernetes. Oh, the places you’ll go!

July 21 2016

Editor’s note, Today’s guest post is from an independent Kubernetes contributor, Justin Santa Barbara, sharing his reflection on growth of the project from inception to its future.

Dear K8s,

It’s hard to believe you’re only one - you’ve grown up so fast. On the occasion of your first birthday, I thought I would write a little note about why I was so excited when you were born, why I feel fortunate to be part of the group that is raising you, and why I’m eager to watch you continue to grow up!


You started with an excellent foundation - good declarative functionality, built around a solid API with a well defined schema and the machinery so that we could evolve going forwards. And sure enough, over your first year you grew so fast: autoscaling, HTTP load-balancing support (Ingress), support for persistent workloads including clustered databases (PetSets). You’ve made friends with more clouds (welcome Azure & OpenStack to the family), and even started to span zones and clusters (Federation). And these are just some of the most visible changes - there’s so much happening inside that brain of yours!

I think it’s wonderful you’ve remained so open in all that you do - you seem to write down everything on Github - for better or worse. I think we’ve all learned a lot about that on the way, like the perils of having engineers make scaling statements that are then weighed against claims made without quite the same framework of precision and rigor. But I’m proud that you chose not to lower your standards, but rose to the challenge and just ran faster instead - it might not be the most realistic approach, but it is the only way to move mountains!

And yet, somehow, you’ve managed to avoid a lot of the common dead-ends that other open source software has fallen into, particularly as those projects got bigger and the developers end up working on it more than they use it directly. How did you do that? There’s a probably-apocryphal story of an employee at IBM that makes a huge mistake, and is summoned to meet with the big boss, expecting to be fired, only to be told “We just spent several million dollars training you. Why would we want to fire you?”. Despite all the investment google is pouring into you (along with Redhat and others), I sometimes wonder if the mistakes we are avoiding could be worth even more. There is a very open development process, yet there’s also an “oracle” that will sometimes course-correct by telling us what happens two years down the road if we make a particular design decision. This is a parent you should probably listen to!

And so although you’re only a year old, you really have an old soul. I’m just one of the many people raising you, but it’s a wonderful learning experience for me to be able to work with the people that have built these incredible systems and have all this domain knowledge. Yet because we started from scratch (rather than taking the existing Borg code) we’re at the same level and can still have genuine discussions about how to raise you. Well, at least as close to the same level as we could ever be, but it’s to their credit that they are all far too nice ever to mention it!

If I would pick just two of the wise decisions those brilliant people made:

  • Labels & selectors give us declarative “pointers”, so we can say “why” we want things, rather than listing the things directly. It’s the secret to how you can scale to great heights; not by naming each step, but saying “a thousand more steps just like that first one”.
  • Controllers are state-synchronizers: we specify the goals, and your controllers will indefatigably work to bring the system to that state. They work through that strongly-typed API foundation, and are used throughout the code, so Kubernetes is more of a set of a hundred small programs than one big one. It’s not enough to scale to thousands of nodes technically; the project also has to scale to thousands of developers and features; and controllers help us get there.

And so on we will go! We’ll be replacing those controllers and building on more, and the API-foundation lets us build anything we can express in that way - with most things just a label or annotation away! But your thoughts will not be defined by language: with third party resources you can express anything you choose. Now we can build Kubernetes without building in Kubernetes, creating things that feel as much a part of Kubernetes as anything else. Many of the recent additions, like ingress, DNS integration, autoscaling and network policies were done or could be done in this way. Eventually it will be hard to imagine you before these things, but tomorrow’s standard functionality can start today, with no obstacles or gatekeeper, maybe even for an audience of one.

So I’m looking forward to seeing more and more growth happen further and further from the core of Kubernetes. We had to work our way through those phases; starting with things that needed to happen in the kernel of Kubernetes - like replacing replication controllers with deployments. Now we’re starting to build things that don’t require core changes. But we’re still still talking about infrastructure separately from applications. It’s what comes next that gets really interesting: when we start building applications that rely on the Kubernetes APIs. We’ve always had the Cassandra example that uses the Kubernetes API to self-assemble, but we haven’t really even started to explore this more widely yet. In the same way that the S3 APIs changed how we build things that remember, I think the k8s APIs are going to change how we build things that think.

So I’m looking forward to your second birthday: I can try to predict what you’ll look like then, but I know you’ll surpass even the most audacious things I can imagine. Oh, the places you’ll go!

– Justin Santa Barbara, Independent Kubernetes Contributor

Bringing End-to-End Kubernetes Testing to Azure (Part 2)

July 18 2016

Editor’s Note: Today’s guest post is Part II from a series by Travis Newhouse, Chief Architect at AppFormix, writing about their contributions to Kubernetes.

Historically, Kubernetes testing has been hosted by Google, running e2e tests on Google Compute Engine (GCE) and Google Container Engine (GKE). In fact, the gating checks for the submit-queue are a subset of tests executed on these test platforms. Federated testing aims to expand test coverage by enabling organizations to host test jobs for a variety of platforms and contribute test results to benefit the Kubernetes project. Members of the Kubernetes test team at Google and SIG-Testing have created a Kubernetes test history dashboard that publishes the results from all federated test jobs (including those hosted by Google).

In this blog post, we describe extending the e2e test jobs for Azure, and show how to contribute a federated test to the Kubernetes project.


After successfully implementing “development distro” scripts to automate deployment of Kubernetes on Azure, our next goal was to run e2e integration tests and share the results with the Kubernetes community.

We automated our workflow for executing e2e tests of Kubernetes on Azure by defining a nightly job in our private Jenkins server. Figure 2 shows the workflow that uses kube-up.sh to deploy Kubernetes on Ubuntu virtual machines running in Azure, then executes the e2e tests. On completion of the tests, the job uploads the test results and logs to a Google Cloud Storage directory, in a format that can be processed by the scripts that produce the test history dashboard. Our Jenkins job uses the hack/jenkins/e2e-runner.sh and hack/jenkins/upload-to-gcs.sh scripts to produce the results in the correct format.

Kubernetes on Azure - Flow Chart - New Page.png
Figure 2 - Nightly test job workflow


Throughout our work to create the Azure e2e test job, we have collaborated with members of SIG-Testing to find a way to publish the results to the Kubernetes community. The results of this collaboration are documentation and a streamlined process to contribute results from a federated test job. The steps to contribute e2e test results can be summarized in 4 steps.

  1. Create a Google Cloud Storage bucket in which to publish the results.
  2. Define an automated job to run the e2e tests. By setting a few environment variables, hack/jenkins/e2e-runner.sh deploys Kubernetes binaries and executes the tests.
  3. Upload the results using hack/jenkins/upload-to-gcs.sh.
  4. Incorporate the results into the test history dashboard by submitting a pull-request with modifications to a few files in kubernetes/test-infra.

The federated tests documentation describes these steps in more detail. The scripts to run e2e tests and upload results simplifies the work to contribute a new federated test job. The specific steps to set up an automated test job and an appropriate environment in which to deploy Kubernetes are left to the reader’s preferences. For organizations using Jenkins, the jenkins-job-builder configurations for GCE and GKE tests may provide helpful examples.


The e2e tests on Azure have been running for several weeks now. During this period, we have found two issues in Kubernetes. Weixu Zhuang immediately published fixes that have been merged into the Kubernetes master branch.

The first issue happened when we wanted to bring up the Kubernetes cluster using SaltStack on Azure using Ubuntu VMs. A commit (07d7cfd3) modified the OpenVPN certificate generation script to use a variable that was only initialized by scripts in the cluster/ubuntu. Strict checking on existence of parameters by the certificate generation script caused other platforms that use the script to fail (e.g. our changes to support Azure). We submitted a pull-request that fixed the issue by initializing the variable with a default value to make the certificate generation scripts more robust across all platform types.

The second pull-request cleaned up an unused import in the Daemonset unit test file. The import statement broke the unit tests with golang 1.4. Our nightly Jenkins job helped us find this error and we promptly pushed a fix for it.


The addition of a nightly e2e test job for Kubernetes on Azure has helped to define the process to contribute a federated test to the Kubernetes project. During the course of the work, we also saw the immediate benefit of expanding test coverage to more platforms when our Azure test job identified compatibility issues.

We want to thank Aaron Crickenberger, Erick Fejta, Joe Finney, and Ryan Hutchinson for their help to incorporate the results of our Azure e2e tests into the Kubernetes test history. If you’d like to get involved with testing to create a stable, high quality releases of Kubernetes, join us in the Kubernetes Testing SIG (sig-testing).

–Travis Newhouse, Chief Architect at AppFormix

Dashboard - Full Featured Web Interface for Kubernetes

July 15 2016

Editor’s note: this post is part of a series of in-depth articles on what’s new in Kubernetes 1.3

Kubernetes Dashboard is a project that aims to bring a general purpose monitoring and operational web interface to the Kubernetes world. Three months ago we released the first production ready version, and since then the dashboard has made massive improvements. In a single UI, you’re able to perform majority of possible interactions with your Kubernetes clusters without ever leaving your browser. This blog post breaks down new features introduced in the latest release and outlines the roadmap for the future. 

Full-Featured Dashboard

Thanks to a large number of contributions from the community and project members, we were able to deliver many new features for Kubernetes 1.3 release. We have been carefully listening to all the great feedback we have received from our users (see the summary infographics) and addressed the highest priority requests and pain points.

The Dashboard UI now handles all workload resources. This means that no matter what workload type you run, it is visible in the web interface and you can do operational changes on it. For example, you can modify your stateful MySQL installation with Pet Sets, do a rolling update of your web server with Deployments or install cluster monitoring with DaemonSets. 

In addition to viewing resources, you can create, edit, update, and delete them. This feature enables many use cases. For example, you can kill a failed Pod, do a rolling update on a Deployment, or just organize your resources. You can also export and import YAML configuration files of your cloud apps and store them in a version control system.

The release includes a beta view of cluster nodes for administration and operational use cases. The UI lists all nodes in the cluster to allow for overview analysis and quick screening for problematic nodes. The details view shows all information about the node and links to pods running on it.

There are also many smaller scope new features that the we shipped with the release, namely: support for namespaced resources, internationalization, performance improvements, and many bug fixes (find out more in the release notes). All these improvements result in a better and simpler user experience of the product.

Future Work

The team has ambitious plans for the future spanning across multiple use cases. We are also open to all feature requests, which you can post on our issue tracker.

Here is a list of our focus areas for the following months:

  • Handle more Kubernetes resources - To show all resources that a cluster user may potentially interact with. Once done, Dashboard can act as a complete replacement for CLI. 
  • Monitoring and troubleshooting - To add resource usage statistics/graphs to the objects shown in Dashboard. This focus area will allow for actionable debugging and troubleshooting of cloud applications.
  • Security, auth and logging in - Make Dashboard accessible from networks external to a Cluster and work with custom authentication systems.

Connect With Us

We would love to talk with you and hear your feedback!

– Piotr Bryk, Software Engineer, Google

Steering an Automation Platform at Wercker with Kubernetes

July 15 2016

Editor’s note: today’s guest post is by Andy Smith, the CTO of Wercker, sharing how Kubernetes helps them save time and speed up development.  

At Wercker we run millions of containers that execute our users’ CI/CD jobs. The vast majority of them are ephemeral and only last as long as builds, tests and deploys take to run, the rest are ephemeral, too – aren’t we all –, but tend to last a bit longer and run our infrastructure. As we are running many containers across many nodes, we were in need of a highly scalable scheduler that would make our lives easier, and as such, decided to implement Kubernetes.

Wercker is a container-centric automation platform that helps developers build, test and deploy their applications. We support any number of pipelines, ranging from building code, testing API-contracts between microservices, to pushing containers to registries, and deploying to schedulers. All of these pipeline jobs run inside Docker containers and each artifact can be a Docker container.

And of course we use Wercker to build Wercker, and deploy itself onto Kubernetes!


Because we are a platform for running multi-service cloud-native code we’ve made many design decisions around isolation. On the base level we use CoreOS and cloud-init to bootstrap a cluster of heterogeneous nodes which I have named Patricians, Peasants, as well as controller nodes that don’t have a cool name and are just called Controllers. Maybe we should switch to Constables.


Patrician nodes are where the bulk of our infrastructure runs. These nodes have appropriate network interfaces to communicate with our backend services as well as be routable by various load balancers. This is where our logging is aggregated and sent off to logging services, our many microservices for reporting and processing the results of job runs, and our many microservices for handling API calls.

On the other end of the spectrum are the Peasant nodes where the public jobs are run. Public jobs consist of worker pods reading from a job queue and dynamically generating new runner pods to handle execution of the job. The job itself is an incarnation of our open source CLI tool, the same one you can run on your laptop with Docker installed. These nodes have very limited access to the rest of the infrastructure and the containers the jobs themselves run in are even further isolated.

Controllers are controllers, I bet ours look exactly the same as yours.

Dynamic Pods

Our heaviest use of the Kubernetes API is definitely our system of creating dynamic pods to serve as the runtime environment for our actual job execution. After pulling job descriptions from the queue we define a new pod containing all the relevant environment for checking out code, managing a cache, executing a job and uploading artifacts. We launch the pod, monitor its progress, and destroy it when the job is done.


In order to provide a backend for HTTP API calls and allow self-registration of handlers we make use of the Ingress system in Kubernetes. It wasn’t the clearest thing to set up, but reading through enough of the nginx example eventually got us to a good spot where it is easy to connect services to the frontend.

Upcoming Features in 1.3

While we generally treat all of our pods and containers as ephemeral and expect rapid restarts on failures, we are looking forward to Pet Sets and Init Containers as ways to optimize some of our processes. We are also pleased with official support for Minikube coming along as it improves our local testing and development. 


Kubernetes saves us the non-trivial task of managing many, many containers across many nodes. It provides a robust API and tooling for introspecting these containers, and it includes much built in support for logging, metrics, monitoring and debugging. Service discovery and networking alone saves us so much time and speeds development immensely.

Cheers to you Kubernetes, keep up the good work :)

– Andy Smith, CTO, Wercker

Cross Cluster Services - Achieving Higher Availability for your Kubernetes Applications

July 14 2016

Editor’s note: this post is part of a series of in-depth articles on what’s new in Kubernetes 1.3

As Kubernetes users scale their production deployments we’ve heard a clear desire to deploy services across zone, region, cluster and cloud boundaries. Services that span clusters provide geographic distribution, enable hybrid and multi-cloud scenarios and improve the level of high availability beyond single cluster multi-zone deployments. Customers who want their services to span one or more (possibly remote) clusters, need them to be reachable in a consistent manner from both within and outside their clusters.

In Kubernetes 1.3, our goal was to minimize the friction points and reduce the management/operational overhead associated with deploying a service with geographic distribution to multiple clusters. This post explains how to do this.

Note: Though the examples used here leverage Google Container Engine (GKE) to provision Kubernetes clusters, they work anywhere you want to deploy Kubernetes.

Let’s get started. The first step is to create is to create Kubernetes clusters into 4 Google Cloud Platform (GCP) regions using GKE.

  • asia-east1-b
  • europe-west1-b
  • us-east1-b
  • us-central1-b

Let’s run the following commands to build the clusters:

gcloud container clusters create gce-asia-east1 \

  --scopes cloud-platform \

  --zone asia-east1-b

gcloud container clusters create gce-europe-west1 \

  --scopes cloud-platform \


gcloud container clusters create gce-us-east1 \

  --scopes cloud-platform \


gcloud container clusters create gce-us-central1 \

  --scopes cloud-platform \


Let’s verify the clusters are created:

gcloud container clusters list

NAME              ZONE            MASTER\_VERSION  MASTER\_IP       NUM\_NODES  STATUS  
gce-asia-east1    asia-east1-b    1.2.4           104.XXX.XXX.XXX 3          RUNNING  
gce-europe-west1  europe-west1-b  1.2.4           130.XXX.XX.XX   3          RUNNING  
gce-us-central1   us-central1-b   1.2.4           104.XXX.XXX.XX  3          RUNNING  
gce-us-east1      us-east1-b      1.2.4           104.XXX.XX.XXX  3          RUNNING

The next step is to bootstrap the clusters and deploy the federation control plane on one of the clusters that has been provisioned. If you’d like to follow along, refer to Kelsey Hightower’s tutorial which walks through the steps involved.

Federated Services

Federated Services are directed to the Federation API endpoint and specify the desired properties of your service.

Once created, the Federated Service automatically:

  • creates matching Kubernetes Services in every cluster underlying your cluster federation,
  • monitors the health of those service “shards” (and the clusters in which they reside), and
  • manages a set of DNS records in a public DNS provider (like Google Cloud DNS, or AWS Route 53), thus ensuring that clients of your federated service can seamlessly locate an appropriate healthy service endpoint at all times, even in the event of cluster, availability zone or regional outages.

Clients inside your federated Kubernetes clusters (i.e. Pods) will automatically find the local shard of the federated service in their cluster if it exists and is healthy, or the closest healthy shard in a different cluster if it does not.

Federations of Kubernetes Clusters can include clusters running in different cloud providers (e.g. GCP, AWS), and on-premise (e.g. on OpenStack). All you need to do is create your clusters in the appropriate cloud providers and/or locations, and register each cluster’s API endpoint and credentials with your Federation API Server.

In our example, we have clusters created in 4 regions along with a federated control plane API deployed in one of our clusters, that we’ll be using to provision our service. See diagram below for visual representation.

Creating a Federated Service

Let’s list out all the clusters in our federation:

kubectl --context=federation-cluster get clusters

NAME               STATUS    VERSION   AGE  
gce-asia-east1     Ready               1m  
gce-europe-west1   Ready               57s  
gce-us-central1    Ready               47s  
gce-us-east1       Ready               34s

Let’s create a federated service object:

kubectl --context=federation-cluster create -f services/nginx.yaml

The ‘–context=federation-cluster’ flag tells kubectl to submit the request to the Federation API endpoint, with the appropriate credentials. The federated service will automatically create and maintain matching Kubernetes services in all of the clusters underlying your federation.

You can verify this by checking in each of the underlying clusters, for example:

kubectl --context=gce-asia-east1a get svc nginx  
nginx   80/TCP    9m

The above assumes that you have a context named ‘gce-asia-east1a’ configured in your client for your cluster in that zone. The name and namespace of the underlying services will automatically match those of the federated service that you created above.

The status of your Federated Service will automatically reflect the real-time status of the underlying Kubernetes services, for example:

kubectl --context=federation-cluster describe services nginx  

Name:                   nginx  
Namespace:              default  
Labels:                 run=nginx  
Selector:               run=nginx  
Type:                   LoadBalancer  
LoadBalancer Ingress:   104.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XXX.XX  
Port:                   http    80/TCP  
Endpoints:              \<none\>  
Session Affinity:       None  
No events.

The ‘LoadBalancer Ingress’ addresses of your federated service corresponds with the ‘LoadBalancer Ingress’ addresses of all of the underlying Kubernetes services. For inter-cluster and inter-cloud-provider networking between service shards to work correctly, your services need to have an externally visible IP address. Service Type: Loadbalancer is typically used here.

Note also what we have not yet provisioned any backend Pods to receive the network traffic directed to these addresses (i.e. ‘Service Endpoints’), so the federated service does not yet consider these to be healthy service shards, and has accordingly not yet added their addresses to the DNS records for this federated service.

Adding Backend Pods

To render the underlying service shards healthy, we need to add backend Pods behind them. This is currently done directly against the API endpoints of the underlying clusters (although in future the Federation server will be able to do all this for you with a single command, to save you the trouble). For example, to create backend Pods in our underlying clusters:

for CLUSTER in asia-east1-a europe-west1-a us-east1-a us-central1-a  
kubectl --context=$CLUSTER run nginx --image=nginx:1.11.1-alpine --port=80  

Verifying Public DNS Records

Once the Pods have successfully started and begun listening for connections, Kubernetes in each cluster (via automatic health checks) will report them as healthy endpoints of the service in that cluster. The cluster federation will in turn consider each of these service ‘shards’ to be healthy, and place them in serving by automatically configuring corresponding public DNS records. You can use your preferred interface to your configured DNS provider to verify this. For example, if your Federation is configured to use Google Cloud DNS, and a managed DNS domain ‘example.com’:

$ gcloud dns managed-zones describe example-dot-com   

creationTime: '2016-06-26T18:18:39.229Z'  
description: Example domain for Kubernetes Cluster Federation  
dnsName: example.com.  
id: '3229332181334243121'  
kind: dns#managedZone  
name: example-dot-com  
- ns-cloud-a1.googledomains.com.  
- ns-cloud-a2.googledomains.com.  
- ns-cloud-a3.googledomains.com.  
- ns-cloud-a4.googledomains.com.  

$ gcloud dns record-sets list --zone example-dot-com  

NAME                                                                                                 TYPE      TTL     DATA  
example.com.                                                                                       NS     21600  ns-cloud-e1.googledomains.com., ns-cloud-e2.googledomains.com.  
example.com.                                                                                      SOA     21600 ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 1209600 300  
nginx.mynamespace.myfederation.svc.example.com.                            A     180     104.XXX.XXX.XXX, 130.XXX.XX.XXX, 104.XXX.XX.XXX, 104.XXX.XXX.XX  
nginx.mynamespace.myfederation.svc.us-central1-a.example.com.     A     180     104.XXX.XXX.XXX  
nginx.mynamespace.myfederation.svc.us-central1.example.com.         A    180     104.XXX.XXX.XXX, 104.XXX.XXX.XXX, 104.XXX.XXX.XXX  
nginx.mynamespace.myfederation.svc.asia-east1-a.example.com.       A    180     130.XXX.XX.XXX  
nginx.mynamespace.myfederation.svc.asia-east1.example.com.           A    180     130.XXX.XX.XXX, 130.XXX.XX.XXX  
nginx.mynamespace.myfederation.svc.europe-west1.example.com.  CNAME    180   nginx.mynamespace.myfederation.svc.example.com.  
... etc.

Note: If your Federation is configured to use AWS Route53, you can use one of the equivalent AWS tools, for example:

$aws route53 list-hosted-zones


$aws route53 list-resource-record-sets --hosted-zone-id Z3ECL0L9QLOVBX

Whatever DNS provider you use, any DNS query tool (for example ‘dig’ or ‘nslookup’) will of course also allow you to see the records created by the Federation for you.

Discovering a Federated Service from pods Inside your Federated Clusters

By default, Kubernetes clusters come preconfigured with a cluster-local DNS server (‘KubeDNS’), as well as an intelligently constructed DNS search path which together ensure that DNS queries like “myservice”, “myservice.mynamespace”, “bobsservice.othernamespace” etc issued by your software running inside Pods are automatically expanded and resolved correctly to the appropriate service IP of services running in the local cluster.

With the introduction of Federated Services and Cross-Cluster Service Discovery, this concept is extended to cover Kubernetes services running in any other cluster across your Cluster Federation, globally. To take advantage of this extended range, you use a slightly different DNS name (e.g. myservice.mynamespace.myfederation) to resolve federated services. Using a different DNS name also avoids having your existing applications accidentally traversing cross-zone or cross-region networks and you incurring perhaps unwanted network charges or latency, without you explicitly opting in to this behavior.

So, using our NGINX example service above, and the federated service DNS name form just described, let’s consider an example: A Pod in a cluster in the us-central1-a availability zone needs to contact our NGINX service. Rather than use the service’s traditional cluster-local DNS name (“nginx.mynamespace”, which is automatically expanded to”nginx.mynamespace.svc.cluster.local”) it can now use the service’s Federated DNS name, which is”nginx.mynamespace.myfederation”. This will be automatically expanded and resolved to the closest healthy shard of my NGINX service, wherever in the world that may be. If a healthy shard exists in the local cluster, that service’s cluster-local (typically 10.x.y.z) IP address will be returned (by the cluster-local KubeDNS). This is exactly equivalent to non-federated service resolution.

If the service does not exist in the local cluster (or it exists but has no healthy backend pods), the DNS query is automatically expanded to “nginx.mynamespace.myfederation.svc.us-central1-a.example.com”. Behind the scenes, this is finding the external IP of one of the shards closest to my availability zone. This expansion is performed automatically by KubeDNS, which returns the associated CNAME record. This results in a traversal of the hierarchy of DNS records in the above example, and ends up at one of the external IP’s of the Federated Service in the local us-central1 region.

It is also possible to target service shards in availability zones and regions other than the ones local to a Pod by specifying the appropriate DNS names explicitly, and not relying on automatic DNS expansion. For example, “nginx.mynamespace.myfederation.svc.europe-west1.example.com” will resolve to all of the currently healthy service shards in Europe, even if the Pod issuing the lookup is located in the U.S., and irrespective of whether or not there are healthy shards of the service in the U.S. This is useful for remote monitoring and other similar applications.

Discovering a Federated Service from Other Clients Outside your Federated Clusters

For external clients, automatic DNS expansion described is no longer possible. External clients need to specify one of the fully qualified DNS names of the federated service, be that a zonal, regional or global name. For convenience reasons, it is often a good idea to manually configure additional static CNAME records in your service, for example:

eu.nginx.acme.com        CNAME nginx.mynamespace.myfederation.svc.europe-west1.example.com.  
us.nginx.acme.com        CNAME nginx.mynamespace.myfederation.svc.us-central1.example.com.  
nginx.acme.com             CNAME nginx.mynamespace.myfederation.svc.example.com.

That way your clients can always use the short form on the left, and always be automatically routed to the closest healthy shard on their home continent. All of the required failover is handled for you automatically by Kubernetes Cluster Federation.

Handling Failures of Backend Pods and Whole Clusters

Standard Kubernetes service cluster-IP’s already ensure that non-responsive individual Pod endpoints are automatically taken out of service with low latency. The Kubernetes cluster federation system automatically monitors the health of clusters and the endpoints behind all of the shards of your federated service, taking shards in and out of service as required. Due to the latency inherent in DNS caching (the cache timeout, or TTL for federated service DNS records is configured to 3 minutes, by default, but can be adjusted), it may take up to that long for all clients to completely fail over to an alternative cluster in in the case of catastrophic failure. However, given the number of discrete IP addresses which can be returned for each regional service endpoint (see e.g. us-central1 above, which has three alternatives) many clients will fail over automatically to one of the alternative IP’s in less time than that given appropriate configuration.


We’d love to hear feedback on Kubernetes Cross Cluster Services. To join the community:

Please give Cross Cluster Services a try, and let us know how it goes!

– Quinton Hoole, Engineering Lead, Google and Allan Naim, Product Manager, Google

Citrix + Kubernetes = A Home Run

July 14 2016

Editor’s note: today’s guest post is by Mikko Disini, a Director of Product Management at Citrix Systems, sharing their collaboration experience on a Kubernetes integration. 

Technical collaboration is like sports. If you work together as a team, you can go down the homestretch and pull through for a win. That’s our experience with the Google Cloud Platform team.

Recently, we approached Google Cloud Platform (GCP) to collaborate on behalf of Citrix customers and the broader enterprise market looking to migrate workloads. This migration required including the NetScaler Docker load balancer, CPX, into Kubernetes nodes and resolving any issues with getting traffic into the CPX proxies.  

Why NetScaler and Kubernetes?

  1. Citrix customers want the same Layer 4 to Layer 7 capabilities from NetScaler that they have on-prem as they move to the cloud as they begin deploying their container and microservices architecture with Kubernetes 
  2. Kubernetes provides a proven infrastructure for running containers and VMs with automated workload delivery
  3. NetScaler CPX provides Layer 4 to Layer 7 services and highly efficient telemetry data to a logging and analytics platform, NetScaler Management and Analytics System

I wish all our experiences working together with a technical partner were as good as working with GCP. We had a list of issues to enable our use cases and were able to collaborate swiftly on a solution. To resolve these, GCP team offered in depth technical assistance, working with Citrix such that NetScaler CPX can spin up and take over as a client-side proxy running on each host. 

Next, NetScaler CPX needed to be inserted in the data path of GCP ingress load balancer so that NetScaler CPX can spread traffic to front end web servers. The NetScaler team made modifications so that NetScaler CPX listens to API server events and configures itself to create a VIP, IP table rules and server rules to take ingress traffic and load balance across front end applications. Google Cloud Platform team provided feedback and assistance to verify modifications made to overcome the technical hurdles. Done!

NetScaler CPX use case is supported in Kubernetes 1.3. Citrix customers and the broader enterprise market will have the opportunity to leverage NetScaler with Kubernetes, thereby lowering the friction to move workloads to the cloud. 

You can learn more about NetScaler CPX here.

 – Mikko Disini, Director of Product Management - NetScaler, Citrix Systems

@Kubernetesio View on Github #kubernetes-users Stack Overflow Download Kubernetes