Kubernetes: Deployment & Services

Reeshabh Choudhary
Dec 27, 2023

👷‍♂️ Software Architecture Series — Part 14.

NOTE: The article below covers the conceptual details of Kubernetes operation. The exact commands to create and configure Deployments and Services are easy to find on the web and are omitted for brevity; short illustrative manifest snippets are included where they clarify a concept.

Introduction

Kubernetes is an open-source container orchestrator in which the pod is the smallest deployable unit of work. A pod can contain one or more containers, and applications ultimately run as containers inside pods. Managing pods by hand is tedious, however, and Kubernetes provides a self-healing mechanism that restores the desired state of an application (declared in a manifest YAML file) by redeploying failed containers or pods. Kubernetes continuously monitors the health of pods and containers, including via liveness and readiness probes that check whether a container is responding correctly and is ready to serve traffic. When a container fails or becomes unhealthy, Kubernetes triggers automatic actions to bring it back in line with the desired state, such as restarting the failed container or replacing the failed pod.
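As a minimal sketch, health checks are declared per container in the pod spec. The pod name, image, port, and endpoint paths below are illustrative assumptions, not values from this article:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod                  # hypothetical pod name
spec:
  containers:
    - name: web
      image: example/web:1.0     # illustrative image
      ports:
        - containerPort: 8080
      livenessProbe:             # container is restarted if this check fails
        httpGet:
          path: /healthz         # assumed health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:            # traffic is withheld until this check passes
        httpGet:
          path: /ready           # assumed readiness endpoint
          port: 8080
        periodSeconds: 5
```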

Hence, it makes sense to have higher-level abstractions above pods that the Kubernetes control plane can manage on our behalf. Kubernetes exposes APIs such as Deployment, StatefulSet, and DaemonSet, which manage these higher-level abstractions, collectively called workload objects. Among these, Deployment is the most common way to run an application on a Kubernetes cluster, and we shall explore it in depth below.

Deployment

In Kubernetes, a Deployment is a uniformly managed set of pod instances based on the same container image; one such instance of a pod is called a replica. To manage a replicated set of pods across the cluster, a Deployment uses a ReplicaSet, a cluster-wide pod manager that makes sure the necessary pods are running at all times. The ReplicaSet maintains the desired number of running pods via a reconciliation loop, a pattern fundamental to most design and implementation in Kubernetes: the loop runs continuously, observing the current state and matching it against the declared state. It is the key element behind the self-healing behavior of Kubernetes.
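A minimal Deployment manifest might look like the sketch below (names and image are illustrative). The Deployment creates a ReplicaSet, which in turn keeps three replicas of the pod template running:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment           # hypothetical name
spec:
  replicas: 3                    # desired state: three replicas
  selector:
    matchLabels:
      app: web                   # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0 # illustrative image
          ports:
            - containerPort: 8080
```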

Although a ReplicaSet is a discrete object in its own right, users are encouraged to interact with ReplicaSets through a Deployment. We can view a Deployment as a higher-level controller that manages both deployment transitions (e.g., rolling updates) and replication, via ReplicaSets. ReplicaSets managed by a Deployment cover a variety of use cases. Apart from maintaining the desired (declared) state of the cluster, a completely new state can be rolled out by changing the PodTemplateSpec of the Deployment in the manifest YAML file (which can itself be managed effectively in a version-controlled repository). In case of a failure, the Deployment can be rolled back to an earlier working state. Under heavy load, a Deployment can be scaled out using horizontal pod autoscaling based on metrics such as CPU usage. Multiple versions of the same application can even be served side by side during a RollingUpdate.
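How a Deployment transitions between states is tuned under spec.strategy. The fragment below would sit inside the Deployment manifest sketched earlier; the values are illustrative, not required defaults:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above the desired count during a rollout
      maxUnavailable: 1    # at most one pod below the desired count during a rollout
```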

A Deployment lifecycle is mainly divided into three parts:

1. Progressing Deployment: This is the state of a deployment when a new ReplicaSet is being created or a scale-up/down operation is in progress.

2. Completed Deployment: A deployment is considered complete when the cluster has been updated to the specified state: all new pods/replicas are available and the older ones have been pruned.

3. Failed Deployment: A deployment can fail for multiple reasons, such as insufficient quota, readiness probe failures, or insufficient permissions. A deadline parameter (progressDeadlineSeconds) can be set in the Deployment spec, which the Deployment controller treats as the threshold before marking a deployment as failed, as sketched below. In case of failure, a Deployment can be scaled up or down, or rolled back to an earlier working state.
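The failure threshold mentioned in point 3 is a single field in the Deployment spec; 600 seconds below is both an illustrative value and, as it happens, the Kubernetes default:

```yaml
spec:
  progressDeadlineSeconds: 600   # mark the rollout as failed if it makes no progress for 10 minutes
```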

Since Kubernetes accepts a declarative configuration of the cluster, a Deployment helps maintain that declared state. It can create and destroy pods dynamically based on their health and functioning, but the main aim is always to maintain the desired state consistently. Hence, the set of pods running an application at one moment may be completely different from the set running at another moment. Moreover, each pod in Kubernetes is assigned its own unique IP address.

Now consider a scenario: if an application is divided into a set of functionalities, and each of these functionalities runs as a set of pods under a Deployment, how do they interact with each other consistently, given that pods are created and destroyed dynamically and each new pod receives its own unique IP address?

Services

So far, we have established that each functionality/service of a microservice-based application is deployed to the Kubernetes cluster from a container image, and the Deployment creates a group of pods providing the desired functionality. In Kubernetes, the Service API is the fundamental abstraction for exposing groups of pods as network services within the cluster. It acts as a stable interface for accessing a set of pods (endpoints) that provide a particular functionality, and it defines the policies for making those pods accessible within the cluster. A group of pods collectively providing a service under a common label selector is called a Service object, and the pods under a Service are exposed via a single access point.
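A minimal Service manifest, with assumed names and ports, groups pods by label selector and puts them behind one stable endpoint:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend          # hypothetical service name
spec:
  selector:
    app: backend         # selects every pod carrying this label
  ports:
    - port: 80           # port the Service exposes
      targetPort: 8080   # port the selected pods listen on
```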

To understand this, let us revisit the earlier scenario. Say one functionality of the application is the frontend and another is the backend, and the two interact with each other. Say the backend is declared to comprise three replicas, and Kubernetes maintains this state at all costs. The set of pods backing the backend may change internally, but the frontend does not need to know about it. This is the level of abstraction a Service provides in Kubernetes: a consistent interface to the underlying pods that collectively offer a specific service or functionality.

Services enable other components within the cluster to discover and communicate with the pods they represent. They provide a stable DNS name and IP address that abstract the individual pod IPs, facilitating seamless communication. They also load balance across their associated pods, distributing incoming traffic among them to ensure efficient utilization and fault tolerance.

Sets of pods communicate with one another via service discovery. Kubernetes provides a Service controller to satisfy discovery and connectivity use cases such as Pod-to-Pod, LAN-to-Pod, and Internet-to-Pod.

Pod-to-Pod:

Pods communicating via ClusterIP

In Kubernetes, when a Service is created, it is allocated a unique IP address called the ClusterIP. This is a stable virtual IP address accessible only from within the Kubernetes cluster, and it serves as the entry point to the Service. Incoming traffic directed to this address is load balanced by the Kubernetes networking layer across all pods matching the Service's selector. Because the ClusterIP is not exposed outside the cluster's network, it is suited to internal communication among the components within the Kubernetes environment.

Kubernetes comes with a built-in DNS service that provides name resolution for Services within the cluster. Since the ClusterIPs assigned to Services are stable and virtual, associating them with DNS names gives Services a reliable address without concerns about DNS caching issues. As the cluster evolves and Services are created, updated, or scaled, the DNS service dynamically updates its records to reflect changes in Service IPs and endpoints.
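Concretely, a Service named backend in the default namespace resolves inside the cluster as backend.default.svc.cluster.local (or simply backend from within the same namespace). A client pod can therefore carry the address as plain configuration; the env variable name and images below are assumptions for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
    - name: frontend
      image: example/frontend:1.0   # illustrative image
      env:
        - name: BACKEND_URL          # hypothetical variable read by the app
          value: "http://backend.default.svc.cluster.local:80"
```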

The ClusterIP load balances traffic across all Service endpoints with the help of kube-proxy, which runs on every node of the cluster and watches the Kubernetes API server for changes related to Services: new Services, updates, and deletions. Upon detecting a change, kube-proxy programs a set of iptables rules in the kernel of its host node. These rules dynamically rewrite the destination IP addresses of incoming packets, ensuring that traffic directed at a Service's ClusterIP is properly routed to one of the available endpoints (pods) behind that Service.

NOTE: While the ClusterIP is usually assigned automatically, users have the option to specify one explicitly when creating the Service. Once set, however, the ClusterIP cannot be modified without deleting and recreating the Service object.
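Pinning the ClusterIP is a one-line addition to the Service spec shown earlier; the address below is purely illustrative and must fall within the cluster's service IP range:

```yaml
spec:
  type: ClusterIP
  clusterIP: 10.96.0.100   # illustrative address; must lie in the cluster's service CIDR
```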

LAN-to-Pod:

Hosts on the same LAN communicating with a Kubernetes Service using NodePort

There are scenarios where a service running within the Kubernetes cluster must be reached from outside, but the consumer sits on the same local area network (LAN). For such communication, the NodePort Service type is appropriate: the Service controller publishes a discrete port on every worker node and maps it to the exposed Service. External consumers on the LAN can then reach the service by targeting the IP address of any worker node together with the specified NodePort; traffic received on that port, on any node in the cluster, is routed internally to the Service and its associated pods. As with any external access method, it is crucial to apply security measures and access controls so that access is limited to authorized entities within the LAN.
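A NodePort Service differs from the earlier ClusterIP sketch only in its type and the optional node port; 30080 below is an illustrative value from the default 30000–32767 range:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-nodeport   # hypothetical name
spec:
  type: NodePort
  selector:
    app: backend
  ports:
    - port: 80             # ClusterIP port, still reachable inside the cluster
      targetPort: 8080     # container port on the selected pods
      nodePort: 30080      # opened on every node; omit to let Kubernetes pick one
```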

NodePort can be integrated with hardware or software load balancers to expose the service further; such load balancers distribute incoming traffic among the nodes in the cluster based on the NodePort configuration. NodePort is often combined with ClusterIP to provide both internal and external access to services: ClusterIP handles internal communication between services, while NodePort exposes them externally.

Internet-to-Pod:

Communicating with a microservice application in a Kubernetes cluster via the Internet

This is the most common scenario: a service exposed inside the Kubernetes cluster is made available to the Internet via HTTP APIs. This is done with the LoadBalancer Service type, which builds on NodePort by provisioning a cloud load balancer and allocating a public IP address to route traffic to services within the cluster. Kubernetes interacts with the cloud provider's infrastructure (e.g., Google Cloud Platform, AWS, Azure) to request the creation of a load balancer, and the provider assigns it a public IP address that serves as the entry point for incoming traffic from the Internet. The cloud load balancer, configured by Kubernetes, directs incoming traffic to the NodePorts associated with the LoadBalancer Service, and Kubernetes manages the mapping between the externally exposed NodePorts and the internal services within the cluster, ensuring that incoming traffic reaches the correct services.
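Requesting a cloud load balancer is a one-line change of the Service type; the name below is hypothetical, and the external IP is assigned by whichever cloud provider the cluster runs on:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-public   # hypothetical name
spec:
  type: LoadBalancer     # asks the cloud provider to provision an external load balancer
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
```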

The HTTP-based load-balancing system of Kubernetes is called Ingress, and it is based on the virtual hosting pattern (a mechanism for hosting many HTTP sites on a single IP address). Instead of editing a load balancer's configuration file, users define routing rules, SSL termination, and other traffic-handling aspects declaratively as Ingress resources within Kubernetes. In a dynamic environment where the set of virtual hosts or applications changes frequently, Ingress adapts by letting users modify and update routing rules without dealing with the underlying load balancer's configuration files directly. Kubernetes merges multiple Ingress objects into a single configuration for the underlying load balancer, providing a unified, manageable view of the routing rules for all services. Ingress then routes and load balances incoming HTTP/HTTPS requests to the appropriate services within the cluster according to the defined rules.
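A sketch of an Ingress that routes by host and path: the hostname, service names, and ingress class are assumptions, and an ingress controller must be installed in the cluster for these rules to take effect:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress               # hypothetical name
spec:
  ingressClassName: nginx         # assumes the NGINX ingress controller is installed
  rules:
    - host: app.example.com       # illustrative virtual host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend     # /api traffic goes to the backend Service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend    # everything else goes to the frontend Service
                port:
                  number: 80
```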

Written by Reeshabh Choudhary
Software Architect and Developer | Author: Objects, Data & AI.