Every Saga Has a Beginning
Check out the table of contents that lists all the articles in this series.
Welcome to A Practical Guide to Kubernetes, an article series about Kubernetes.
In this series, I plan to start from the very basics and build upon that foundation as we go.
I will assume nothing and build a knowledge base bit by bit that will help you whether you are just beginning to learn Kubernetes or are a seasoned Kubernetes expert. We do not call this site Zero to Hero for nothing, do we 😉?
We will start the series with an introduction to the basics of Kubernetes and what it is used for. We will give a high-level overview of the architecture of Kubernetes and its main components, as well as an explanation of its main use cases.
From there, we will move on to more specific topics, such as how to set up and configure a Kubernetes cluster, how to deploy applications to a Kubernetes cluster, and how to manage and monitor a Kubernetes cluster.
We will also include tutorials on common tasks in Kubernetes, such as scaling applications, rolling out updates, and managing the lifecycle of applications.
Theory, Practice? Why not both?
When it comes to learning, I believe in a balanced approach of theory and practice. Throughout the series, along with fundamental theoretical knowledge, you will find plenty of examples and hands-on exercises that let you try out what you have learned in a practical setting. This will help you better understand the concepts and build real experience with Kubernetes.
In general, the curriculum for this series will both cover the basics of Kubernetes and its components, as well as provide tutorials and exercises that allow you to gain hands-on experience with the platform.
As the series progresses, we will cover more advanced topics and techniques for using Kubernetes in real-world scenarios.
At least that’s the plan. To begin, let’s dive right in with some definitions.
What Is Kubernetes?
Kubernetes is an open-source Container Orchestration System. It provides a platform for managing containerized workloads and services, enabling you to deploy, scale, and manage applications more quickly and efficiently. In addition, it offers a range of features to help manage containerized applications, including automatic scaling, load balancing, self-healing, rolling updates, and more.
If you are new to Kubernetes, I am sure the above description created more questions than answers for you 🙂.
Even this seemingly innocent definition requires further clarification, so let’s dive deeper into the rabbit hole 🐇.
Here are some immediate questions that you might have in mind:
- What is a Container?
- What is an Application?
- What is a Containerized Application?
- What is a Workload?
- What is a Container Orchestration System?
- Why am I here?
- Can I hear colors?
- Why, oh why didn’t I take the blue pill?
Let us tackle these questions one at a time.
What Is a Container?
A container is a lightweight, standalone executable that “contains” everything needed to run an application, including the code, runtime, system tools, libraries, settings, and configuration.
Since the application is packaged as a standalone executable, it can now be deployed in a portable, consistent, and scalable manner.
Containers can be used to package and deploy any application, from web applications to databases to microservices.
Where Do These Containers Live?
A container requires a container runtime to execute. A container runtime is a software component that runs on a host machine, manages the container lifecycle, and provides an interface for running and interacting with containers.
The container runtime starts and stops containers, manages their file system and network interfaces, and isolates them from the host system and other containers.
When a container is started, the container runtime creates a new namespace for the container, which provides a separate view of the system’s resources, such as the file system, network interfaces, and process tree. The container runtime also starts the container’s main process.
The main process is the primary process running inside a container: it is started when the container starts, typically runs in the foreground, and provides the core functionality of the containerized application.
When the container is stopped, the container runtime shuts down the container’s main process and cleans up the container’s resources.
Container runtimes provide process-level isolation. This way, containers share the host machine’s kernel but have their own user space. This provides a high degree of isolation and security.
Thanks to the level of isolation provided by the container runtime, a container runs on a host machine but is isolated from other containers and from the host system: Each container has its own file system, network stack, and process space, so multiple containers can run on the same host without interfering with each other.
Is Docker a Container Runtime?
As in any profound technology, at certain boundaries, things get complicated. And the devil, as always, is in the details: Docker is a platform for building and managing containerized applications. So, at a very high level, it is a tool. And, as a tool, it uses containerd as its (default) container runtime.
So when one says, “Docker is the most widely used container runtime,” they might be correct in a sense, yet factually incomplete 🙂.
Does That Mean Kubernetes Is a Container Runtime?
Kubernetes is not a container runtime but rather a Container Orchestration System. However, Kubernetes uses a container runtime to manage the lifecycle of containers.
The runtime handles the low-level work of starting, stopping, and isolating individual containers, while Kubernetes provides the higher-level features for deploying, scaling, and managing containerized workloads.
What is a Containerized Workload?
A containerized workload is an application or service packaged and deployed to a container runtime.
A containerized workload is often referred to simply as a “workload.” A workload is typically defined by a Kubernetes Object, such as a Deployment or a Job, which specifies the desired state of the workload.
What is a Kubernetes Object?
A Kubernetes Object is a persistent entity representing a cluster’s desired state. Kubernetes Objects are used to define and manage the state of the Kubernetes system and the workloads running on it.
Here are a few examples of Kubernetes Objects:
- Pod: The smallest deployable unit in Kubernetes, representing a single running process instance.
- Deployment: A higher-level abstraction that manages groups of pods.
- Service: An abstract way to expose a set of Pods as a single network service, enabling stable access to them.
- ConfigMap: A configuration store that decouples configuration data from container images, allowing for dynamic configuration changes.
There are many other Kubernetes Objects; we will explore them all in detail as we continue this series.
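To make the first of these more concrete, here is a minimal sketch of what a Pod definition looks like in YAML. The object name and the nginx image are placeholder choices, not anything this series prescribes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod        # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.25  # any container image would work here
```

We will dissect each of these fields properly later in the series; for now, notice that the object simply declares what should run, not how to run it.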
What Do You Mean By “Packaged”?
Packaging an application refers to bundling all the files, libraries, and other resources needed to run the application into a single distributable archive. Packaging aims to make installing, distributing, and running the application on different systems and platforms easy.
Packaging an application in containerization involves building a container image that includes the application code, runtime, libraries, and other dependencies. The container image can then be deployed to any infrastructure platform that supports containers (such as Kubernetes), providing a portable, consistent, and scalable way to run the application.
What is a Container Image?
A container image is the result of the packaging process described in the previous section.
A container image is a “template” to initialize one or more container instances. The container image is the “cookie cutter,” whereas the container instances are the cookies. The container runtime deploys and runs identical applications based on a given image.
How Do Containers Provide Portability, Consistency, and Scalability?
Containers provide a way to package and deploy applications in a portable, consistent, and scalable manner by encapsulating the application and all of its dependencies into a single image.
Here are some ways that containers achieve that:
Containers can run on any infrastructure platform that supports container runtimes, including physical machines, virtual machines, and cloud instances.
This means you can create a containerized application once and run it anywhere without worrying about the underlying operating system or hardware.
At least, that’s what is supposed to happen “in theory.”
Containers provide a consistent runtime environment for applications, which means that the application will run the same way in the development, testing, and production environments.
This consistency helps to reduce the risk of errors and improve the reliability of the application.
Additionally, containers make it easier to manage complex applications with multiple dependencies by encapsulating them into the container image and ensuring they are always available when the application runs.
Containers can be easily scaled up or down to meet the application’s demands.
Because each container is isolated, you can run multiple containers of the same application to handle increased traffic or workload and then scale back down when the demand decreases. This makes managing large-scale applications with fluctuating demand easier and ensures that resources are used efficiently.
What is a Container Orchestration System?
As the number of containers and hosts (or nodes) in a deployment grows, it becomes increasingly difficult to manage the system manually. Container Orchestration Systems are designed to address the challenges of managing containerized applications at scale.
A Container Orchestration System is a tool or platform that automates containerized applications’ deployment, scaling, and management, providing load balancing, auto-scaling, rolling updates, and self-healing features.
Container Orchestration Systems automate many tasks in managing a containerized application, making it easier to maintain the system’s desired state and ensure it is always available and performing well.
The desired state describes the state that the system should be in, as defined by the user or administrator. Kubernetes uses the desired state to automatically manage and maintain the state of the system and the workloads running on it.
The desired state is defined using Kubernetes Objects, which describe the properties of the system and the workloads, such as the number of replicas, the container image to use, and more.
These objects are declaratively defined using YAML.
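For example, here is a sketch of a Deployment that declares a desired state of three replicas. The names, labels, and image below are placeholders chosen for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment        # hypothetical name
spec:
  replicas: 3                 # the desired state: three copies of the Pod
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # placeholder image
```

If a node dies and a replica disappears, Kubernetes notices that only two copies are running and starts a third one to restore the declared state.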
Can I Use JSON?
Yes, if you want to torture yourself, you can use JSON too.
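For the record, here is what a minimal Kubernetes Object looks like in JSON. This hypothetical ConfigMap (the name and data are made up) is equivalent to its YAML form, just noisier:

```json
{
  "apiVersion": "v1",
  "kind": "ConfigMap",
  "metadata": { "name": "app-config" },
  "data": { "LOG_LEVEL": "info" }
}
```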
Once you define Kubernetes Objects in YAML, they are stored in the Kubernetes API Server.
Kubernetes constantly monitors the actual state of the system and workloads and compares it to the desired state. If the actual state does not match the desired state, Kubernetes takes action to bring the system back into the desired state.
The desired state approach to system management is a crucial feature of Kubernetes, as it provides a declarative way to manage complex systems and workloads.
Rather than manually managing the system’s state, administrators can define (or declare) the desired state using Kubernetes Objects and let Kubernetes handle the details of ensuring the system stays in the desired state.
This declarative approach helps reduce the risk of errors, improve the system’s reliability, and make managing large, complex systems more straightforward.
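The reconciliation idea at the heart of this can be sketched in a few lines of Python. This is a toy model to illustrate the desired-state comparison, not actual Kubernetes code:

```python
def reconcile(desired: int, actual: int) -> str:
    """Toy reconciliation step: decide what action moves the
    actual replica count toward the desired replica count."""
    if actual < desired:
        return f"scale up by {desired - actual}"
    if actual > desired:
        return f"scale down by {actual - desired}"
    return "no action"


# Kubernetes runs checks like this continuously, in a loop:
print(reconcile(3, 1))  # scale up by 2
print(reconcile(2, 5))  # scale down by 3
print(reconcile(4, 4))  # no action
```

The real controllers are far more sophisticated, but the principle is the same: observe, compare against the declared state, act on the difference.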
What is Kubernetes API Server?
The Kubernetes API Server is the central management point for a Kubernetes cluster. It provides an API (Application Programming Interface) for interacting with the Kubernetes Objects that define the cluster’s desired state.
The Kubernetes API server is responsible for validating and processing requests to the API, storing the state of the system and the workloads in a distributed key-value store, and propagating changes to the system and workloads to the other components of the cluster.
By providing a standard API for interacting with the system, the API Server makes it easy to automate the management of the system and integrate it with other tools and platforms.
What is a Cluster?
A cluster is a group of computers, servers, or virtual machines that are connected and work together to manage and run containerized workloads.
A Kubernetes cluster includes a set of worker nodes, the machines that run the containerized workloads, and a control plane. The control plane components manage the cluster and the workloads.
The control plane manages the system’s state and the workloads running on it. The control plane includes several components, such as the Kubernetes API Server, etcd, kube-scheduler, kube-controller-manager, and more, which work together to ensure that the system stays in the desired state.
Kubernetes clusters can be deployed on a variety of infrastructure platforms, including on-premises data centers, public and private clouds, and hybrid environments.
The cluster architecture enables administrators to manage large numbers of containers and workloads across multiple nodes while providing a consistent and reliable interface for interacting with the system.
What Is a Node?
A Node is a worker machine that runs containerized workloads.
A Node can be a physical machine or a virtual machine, and it is typically connected to a network and has access to storage and other resources.
Each node in a Kubernetes cluster runs a container runtime, such as containerd, that manages the containers running on the node. The node also runs a set of Kubernetes components, including kubelet, kube-proxy, and others, that work together to manage the state of the node and the workloads running on it.
The kubelet is the primary node agent that communicates with the Kubernetes API Server and ensures that the containers on the node are running in the desired state.
The kube-proxy is responsible for routing network traffic to the containers on the node and managing the network policies and services defined in the Kubernetes Objects.
What Are Some Kinds of Containerized Workloads?
A containerized workload can be a web application, a microservice, a batch job, or any other type of application that can be run in a container.
Anything you can execute as a binary can qualify as a containerized workload.
How Do We Package Applications into a Container?
To package an application into a container, you need to create a Dockerfile, a text file containing instructions for building the container image.
During this series, you will have a chance to build and package images from scratch, and I will explain the process in detail when the time comes.
But as a high-level overview, here is how you can package an application into a container:
- In your Dockerfile, choose a base image to run your application.
- In your Dockerfile, install the dependencies that your application needs.
- In your Dockerfile, copy the necessary application files.
- In your Dockerfile, configure the container by setting things like environment variables and startup commands.
- Build the container image (using the docker build command, pointing at your Dockerfile).
- Push the created container image to an image registry.
- Reference the image in your Kubernetes Objects (such as Deployments).
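The steps above can be sketched as a hypothetical Dockerfile for a simple Python application; app.py and requirements.txt are placeholder file names, not part of any real project:

```dockerfile
# 1. Choose a base image
FROM python:3.12-slim

# 2. Install the dependencies the application needs
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 3. Copy the necessary application files
COPY app.py .

# 4. Configure the container: environment and startup command
ENV PORT=8080
CMD ["python", "app.py"]
```

You would then build and push it with something like docker build -t registry.example.com/my-app:1.0 . followed by docker push registry.example.com/my-app:1.0, where the registry host and tag are placeholders for your own.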
Once the container image is pushed to your image registry of choice, you can use a Container Orchestration System like Kubernetes to deploy and manage the containerized application.
What is an Image Registry?
An image registry is a service that provides a central location for storing and distributing container images. An image registry offers a way to store container images, which can then be accessed and downloaded by users and systems that need them.
There are several image registries available, both commercial and open-source. Docker Hub is one of the most popular image registries. Other popular image registries include Amazon Elastic Container Registry, Google Container Registry, and Azure Container Registry.
In addition to public image registries, you can also run your own private image registries (such as Harbor), which can be used to store and distribute proprietary images unavailable in public registries.
Image registries provide a way to share and distribute container images quickly and efficiently, which helps to accelerate the development and deployment of containerized applications.
Where to Go from Here?
As I said, this is the first part of a series of articles that will go deeper and deeper down the Kubernetes rabbit hole. The following list of items outlines what I have in mind; I will update this section and add links to upcoming articles whenever I add new content.
Along with these articles, the official Kubernetes documentation is also an excellent place to learn more about Kubernetes.
You can also check the index page to see what comes next in this series of articles.
References and Further Reading
This section lists the tools and technologies mentioned in this article, along with supplementary learning resources that we have not covered yet but are worth exploring.
Containers and Friends
- Container Runtime Interface
- Docker Swarm
- Apache Mesos
- Amazon ECS
- Docker Hub
- Amazon Elastic Container Registry
- Google Container Registry
- Harbor Image Registry