Part Three – Creating Kubernetes Clusters
In the last two articles, we discussed the advantages of a microservice architecture for application development and how we decided to use Docker containers and the Kubernetes container orchestrator to implement it. In this article, we will talk more about the implementation specifics of these technologies.
As mentioned earlier, we had already been using Docker containers in our environment, albeit on a smaller scale. We had our own internal Docker registry to store container images, and our engineers were already familiar with dockerizing applications.
When it came to Kubernetes, we had to start from scratch. There were many approaches available in the industry for Kubernetes adoption. The Yahoo Small Business technical infrastructure was running a hybrid cloud model – we had half of our systems running in our own data centers and the other half running in the AWS cloud – which meant that we could create a Kubernetes cluster in either of these environments.
We first started with a proof-of-concept Kubernetes cluster in our data center environment, but we soon realized that the tools and ecosystem for creating a cluster from scratch in a data center had not matured enough for our needs at that time. We would also have had to depend heavily on our data center engineers and network engineers to make the many changes required to run a production-ready cluster there. Our Docker image registry was running in the AWS cloud, and integrating it with our data center environment was also causing difficulties.
This prompted us to turn our attention to the AWS cloud. Here, too, there were multiple ways of getting a Kubernetes cluster up and running. The two primary methods were an open-source tool called Kops and a managed service provided by AWS called EKS. We tried both options as a team and analyzed their pros and cons.
EKS (Elastic Kubernetes Service) is a managed Kubernetes offering from AWS that became generally available in mid-2018. When we used EKS to create a cluster, we found that it lacked many features we were interested in. We wanted our Kubernetes API endpoint to be private for security reasons, but there was no option to do that with EKS at the time. We were not even able to restrict the K8s API load balancer to a range of IP addresses. These features were on the EKS product roadmap, but no timeline was provided for their availability. Another drawback of EKS was that the Kubernetes API servers were managed entirely by AWS, which made any customization at the server level difficult.
Kops, which stands for Kubernetes Operations, is a tool created by the open-source community around 2016 that can be used to create Kubernetes clusters in AWS or Google Cloud. The project had very good community support and fairly good documentation. If we used Kops to create a cluster, the Kubernetes API servers and worker nodes would be fully managed by us, which would make it easier to customize the configuration. We would also have the option to easily make the K8s API servers either private or restricted. These considerations prompted us to use Kops for our cluster creation, and we decided to re-evaluate EKS if and when the features we were looking for became available.
Kops is written in the Go programming language and is used from the command line. The first step in creating a Kubernetes cluster with Kops is to generate a configuration file with all the desired settings in YAML format using the command “kops create cluster”. Once this is done, we can review and edit the configuration file for further customizations and perform the actual cluster creation with the “kops update cluster --yes” command. This configuration-file-based approach helps us stick to “declarative management” of our technical infrastructure. One can declare how the cluster should look in the cluster configuration file, get it reviewed by another engineer, commit the file to the source control repository (we use GitHub at YSB), and then create the cluster based on that configuration. In case we need to recreate the cluster or revert a recent change, everything is tracked and available in the source code repository.
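The workflow above can be sketched as follows. This is an illustrative transcript, not our exact commands – the cluster name, state-store bucket, and availability zones are placeholders:

```shell
# Kops keeps cluster state (including the generated configuration) in an S3 bucket
export KOPS_STATE_STORE=s3://example-kops-state-store

# Step 1: generate the cluster configuration with the desired settings
kops create cluster \
  --name=k8s-qa.example.com \
  --zones=us-east-1a,us-east-1b

# Step 2: review and edit the generated configuration for further customizations
kops edit cluster k8s-qa.example.com

# Step 3: perform the actual cluster creation
kops update cluster k8s-qa.example.com --yes
```

The edited configuration can also be exported to a file and committed to source control before running the update, which is what makes the declarative, review-then-apply workflow possible.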
When we created a cluster with Kops using the default settings, we realized that we would have to make a lot of customizations to bring it up to YSB production standards. The default cluster used a Linux distribution called Debian, whereas at YSB we use CentOS. Also, the default cluster was created in its own AWS VPC and subnets, but we wanted to use existing YSB VPCs and subnets. We also wanted our engineers to have individual user accounts in the Kubernetes cluster instead of reusing the default admin account. When we dug deeper into the Kops documentation and codebase, we found that all of these things were supported in Kops through configuration options.
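As a rough sketch of what such overrides look like in a Kops configuration (all IDs, names, and the AMI below are placeholders, not YSB's actual values – note that the OS image is set per instance group rather than on the cluster itself):

```yaml
# Cluster spec fragment: reuse an existing VPC and subnet instead of
# letting Kops create new ones
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: k8s-qa.example.com
spec:
  networkID: vpc-0123456789abcdef0     # existing VPC
  subnets:
  - id: subnet-0123456789abcdef0       # existing subnet
    name: us-east-1a
    type: Private
    zone: us-east-1a
  # (remaining cluster fields omitted)
---
# Instance group spec fragment: replace the default Debian image
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: k8s-qa.example.com
spec:
  image: ami-0abcdef1234567890         # a CentOS AMI instead of the default
  machineType: m5.large
  minSize: 3
  maxSize: 6
  role: Node
```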
Many of these changes were relatively easy, but some of them, particularly those related to security, required more work. Individual user authentication was at the top of the list, and we implemented it using an open-source project called aws-iam-authenticator. We also enabled RBAC (Role-Based Access Control) in the cluster so that different groups of engineers could have different levels of access. Another major change was implementing a tool called kube2iam to give individual Kubernetes pods granular, role-level access to AWS.
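To illustrate how these pieces fit together, here is a hedged example of the kind of RBAC binding and kube2iam annotation involved. The group, namespace, image, and IAM role names are hypothetical:

```yaml
# Grant a group of engineers (as mapped by aws-iam-authenticator) read-only
# access to one namespace using the built-in "view" ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: qa-engineers-view
  namespace: qa
subjects:
- kind: Group
  name: qa-engineers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
---
# With kube2iam, a pod requests a specific IAM role through an annotation,
# so each workload gets only the AWS permissions it needs
apiVersion: v1
kind: Pod
metadata:
  name: s3-reader
  annotations:
    iam.amazonaws.com/role: app-s3-read-only
spec:
  containers:
  - name: app
    image: example/app:latest
```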
Once we had iterated on and finalized the customizations, it was cluster creation time! Just creating a single cluster was not enough for the YSB environment, because we follow a three-environment model for our application development and deployment. We have a QA environment where we do the initial testing, a stage environment where we do further testing including performance tests, and the final production environment where the actual application runs and serves our end users. Based on this, we created one cluster for the QA environment and another for our stage environment. For the production environment, however, we wanted more reliability, since we take the uptime of our products and services very seriously at YSB. To ensure this, we created two production K8s clusters and divided the production traffic equally between them. Each of these clusters can handle the entire production load on its own if the need arises. This way, if there is an issue with one of the clusters, the production traffic can be served from the other. This also helps when we want to take down one cluster for maintenance.
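One common way to split traffic equally between two clusters is weighted DNS. As an illustrative sketch only (the hosted zone, domain, and load balancer targets below are placeholders, and this is not necessarily the exact mechanism used at YSB), Route 53 weighted records could balance requests 50/50:

```shell
# Create two weighted records for the same name; Route 53 answers queries
# in proportion to the weights, so traffic splits evenly across clusters
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0EXAMPLE \
  --change-batch '{
    "Changes": [
      {"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
        "SetIdentifier": "prod-cluster-1", "Weight": 50,
        "ResourceRecords": [{"Value": "elb-cluster-1.example.amazonaws.com"}]}},
      {"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
        "SetIdentifier": "prod-cluster-2", "Weight": 50,
        "ResourceRecords": [{"Value": "elb-cluster-2.example.amazonaws.com"}]}}
    ]}'
```

Setting one weight to 0 would drain a cluster for maintenance while the other serves all traffic.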
Once we have all of these environments set up, the next steps will be to implement encryption within the cluster, provide tooling for canary deployments, and much more. We will talk more about these steps in the upcoming articles. Read part four here.