Update: Google Stackdriver is now Google Cloud Logging and Google Cloud Monitoring. BindPlane will continue to integrate and support both of these products.
What is Amazon EKS?
Amazon Elastic Container Service for Kubernetes (Amazon EKS) is a managed service that lets you run Kubernetes on Amazon Web Services (AWS) without having to stand up or maintain your own Kubernetes control plane. This service is the AWS equivalent of Google Kubernetes Engine (GKE) and Microsoft Azure Kubernetes Service (AKS). This post walks through how and why monitoring Amazon EKS with Stackdriver will benefit your organization.
Understanding Kubernetes’ capabilities can be difficult at first glance. At a high level, Kubernetes consists of two major components: a cluster of ‘worker nodes’ that run your containers, and the ‘control plane’. The control plane manages when and where containers are started on your cluster and monitors their status in real time.
You may ask, “Why do we need EKS when we have an IT team to manage and monitor our Kubernetes deployment for us?” When using a “non-managed” Kubernetes deployment, an administrator must manage both the Kubernetes control plane and the cluster of worker nodes. This is difficult and time consuming for your IT team, and may even prove impossible depending on the scale of your infrastructure. With Amazon EKS, clusters of worker nodes are provisioned using the provided Amazon Machine Image (AMI) and AWS CloudFormation script. AWS handles the provisioning, scaling, and management of the Kubernetes control plane in a highly available and secure configuration. Using Amazon EKS to manage your Kubernetes deployment relieves the operational burden that a non-managed Kubernetes system would place on your IT team, freeing them up to handle more pressing matters.
Amazon EKS is almost always part of an application deployment on AWS that leverages multiple other AWS services. Multi-cloud architectures often combine EKS with other Kubernetes services, so it is essential to use a monitoring/observability tool that can view the entire AWS stack along with your other cloud provider(s) or data center. Fortunately, if you have applications running on Google Cloud Platform (GCP) in addition to AWS, Stackdriver allows you to unify visibility across AWS and GCP.
Stackdriver for Amazon EKS
Stackdriver has many capabilities that assist users in monitoring their environments. One feature that stands out when running Amazon EKS is the ability to monitor AWS-based environments. This feature has recently been enhanced to include extensive Amazon EKS support. To enable this support, activate the BindPlane for Stackdriver integration service and configure EKS monitoring within BindPlane.
Monitoring Amazon EKS via Logs (Beta)
Logs have become an invaluable tool for monitoring the health of your systems. Stackdriver (with BindPlane, currently in beta) provides logs that let you drill down into the specifics of an issue you may be experiencing with EKS and take some of the pain out of troubleshooting. Beyond notifying you that there is a problem in your system, each Stackdriver log entry carries a JSON payload with more in-depth information, such as the container, the ID, the severity level, the timestamp, and other details. This context helps you trace a notification or issue to its root cause and find a solution.
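To make the payload idea concrete, here is a minimal sketch of pulling those fields out of one entry. The sample entry is hypothetical: its field names loosely follow the Cloud Logging LogEntry layout (`insertId`, `severity`, `timestamp`, `resource.labels`, `jsonPayload`), and the exact labels that BindPlane attaches to EKS logs are an assumption here.

```python
import json

# Hypothetical log entry. The field names loosely follow the Cloud Logging
# LogEntry layout; the labels BindPlane actually emits for EKS may differ.
RAW_ENTRY = """
{
  "insertId": "abc123",
  "severity": "ERROR",
  "timestamp": "2019-03-01T12:00:00Z",
  "resource": {"labels": {"container_name": "checkout", "pod_name": "checkout-7d9f"}},
  "jsonPayload": {"message": "OOMKilled: container exceeded memory limit"}
}
"""


def summarize_entry(raw: str) -> dict:
    """Extract the fields most useful for troubleshooting from one entry."""
    entry = json.loads(raw)
    labels = entry.get("resource", {}).get("labels", {})
    return {
        "id": entry.get("insertId"),
        "severity": entry.get("severity", "DEFAULT"),
        "timestamp": entry.get("timestamp"),
        "container": labels.get("container_name"),
        "message": entry.get("jsonPayload", {}).get("message"),
    }


summary = summarize_entry(RAW_ENTRY)
print(summary)
```

Missing fields come back as `None` (or `DEFAULT` for severity) rather than raising, which is usually what you want when entries from different sources carry different labels.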
Stackdriver also provides a dropdown list to sort by severity level for each log so you can separate the small bugs from a catastrophic system meltdown.
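If you export entries and want to apply the same kind of severity cut in a script, a rough sketch might look like the following. The severity names are the standard Cloud Logging levels in ascending order; the sample entries and the `at_least` helper are hypothetical.

```python
# Cloud Logging severity levels, from least to most severe.
SEVERITY_ORDER = [
    "DEFAULT", "DEBUG", "INFO", "NOTICE", "WARNING",
    "ERROR", "CRITICAL", "ALERT", "EMERGENCY",
]
RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def at_least(entries, minimum):
    """Return the entries whose severity is at or above `minimum`."""
    floor = RANK[minimum]
    return [
        entry for entry in entries
        # Entries with a missing or unrecognized severity rank as DEFAULT.
        if RANK.get(entry.get("severity", "DEFAULT"), 0) >= floor
    ]


# Hypothetical sample entries.
entries = [
    {"severity": "INFO", "message": "pod scheduled"},
    {"severity": "WARNING", "message": "image pull slow"},
    {"severity": "CRITICAL", "message": "node not ready"},
]

serious = at_least(entries, "WARNING")
print([entry["message"] for entry in serious])
```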
With thousands of log entries generated every day, it can be overwhelming to watch for a specific problem you are expecting or would like to track. To make monitoring easier, BindPlane allows you to create log-based metrics that help you keep an eye out for the one bug that is causing you more headaches than the others.
These metrics aggregate and chart the log entries you want more insight into, showing when and how often they occur in your system. You can then compare these metrics with any other log-based metrics you create to better understand the issues you may be experiencing.
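Conceptually, a log-based metric of this kind is a count of matching entries per time bucket. The sketch below illustrates that idea only; it is not how BindPlane or Stackdriver implement it, and the sample entries and the `count_matches_per_hour` helper are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_matches_per_hour(entries, needle):
    """Count entries whose message contains `needle`, bucketed by hour."""
    counts = Counter()
    for entry in entries:
        if needle in entry["message"]:
            ts = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
            counts[ts.strftime("%Y-%m-%d %H:00")] += 1
    # Sort by bucket so the series reads in time order.
    return dict(sorted(counts.items()))


# Hypothetical sample entries.
entries = [
    {"timestamp": "2019-03-01T12:05:00Z", "message": "OOMKilled in checkout"},
    {"timestamp": "2019-03-01T12:40:00Z", "message": "OOMKilled in cart"},
    {"timestamp": "2019-03-01T13:10:00Z", "message": "OOMKilled in checkout"},
    {"timestamp": "2019-03-01T13:15:00Z", "message": "pod scheduled"},
]

oom_series = count_matches_per_hour(entries, "OOMKilled")
print(oom_series)
```

Running the same function with a different search string gives a second series that can be charted against the first.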
Log-based metrics also let you set a threshold that you don’t want a metric to exceed. If a metric crosses this threshold, an alert can notify you of the outlier. The alert includes the JSON payload, which gives you extra context on the error and on how to handle it. Using these alerts gives you a leg up in monitoring the health of your system and catching events before they cause too much damage.
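The threshold logic itself is simple, as the following sketch shows. It assumes a metric series already reduced to counts per time bucket; the bucket labels, the threshold value, the payload, and both helper functions are hypothetical, and in practice the alerting policy is configured in Stackdriver rather than in your own code.

```python
def find_breaches(series, threshold):
    """Return (bucket, value) pairs where the metric exceeds the threshold."""
    return [(bucket, value) for bucket, value in series.items() if value > threshold]


def format_alert(bucket, value, threshold, payload):
    """Build a short alert message that carries context from the payload."""
    message = payload.get("message", "no message")
    return f"ALERT {bucket}: {value} events exceeds threshold {threshold} ({message})"


# Hypothetical hourly counts of a tracked log event.
series = {"12:00": 2, "13:00": 9, "14:00": 4}
THRESHOLD = 5

# Hypothetical payload from one of the entries behind the breach.
payload = {"message": "OOMKilled: container exceeded memory limit"}

breaches = find_breaches(series, THRESHOLD)
alerts = [format_alert(bucket, value, THRESHOLD, payload) for bucket, value in breaches]
for alert in alerts:
    print(alert)
```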
Monitoring Amazon EKS with Metrics
When monitoring Amazon EKS, you will find that, like log-based metrics, the metrics dashboard provides valuable insight into the performance and health of your system. With the metrics feature, you can create multiple graphs to track your EKS key performance indicators (KPIs) and confirm that your system is healthy and running as it should. BindPlane supports 200+ metrics on the following EKS objects: Namespace, Cluster, Container, Deployment, Job, Node, Pod, and Volume. One important KPI to track for EKS is “Kubernetes Pod Status”, which shows you which pods are running, ready to run, or failed to run. If pods are failing at an unusually high rate, there may be an underlying problem with your system, such as a lack of memory or other resources required to run your Kubernetes pods.
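As a rough illustration of what an "unusually high" failure rate means in numbers, the sketch below tallies pods by phase and computes the failed share. The phase names are the standard Kubernetes pod phases; how BindPlane maps them onto the “Kubernetes Pod Status” metric is an assumption, and the pod list is hypothetical.

```python
from collections import Counter


def failure_rate(pods):
    """Return a per-phase tally and the fraction of pods in the Failed phase."""
    phases = Counter(pod["phase"] for pod in pods)
    total = sum(phases.values())
    rate = phases["Failed"] / total if total else 0.0
    return phases, rate


# Hypothetical snapshot of pods in one namespace.
pods = [
    {"name": "checkout-7d9f", "phase": "Running"},
    {"name": "cart-5c2a", "phase": "Running"},
    {"name": "search-9b1e", "phase": "Pending"},
    {"name": "report-3f6d", "phase": "Failed"},
]

phases, rate = failure_rate(pods)
print(dict(phases), f"{rate:.0%} failed")
```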
If you do experience this problem, don’t worry! Metrics let you create charts for other KPIs that may affect pod status, such as ‘CPU Usage’ and ‘Memory Usage’. These charts show how much of the system’s resources each node or pod is consuming. Just as with log-based metrics, you can also create an alert threshold to help you stay on top of these issues.
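When pods are failing, the first question is usually which workloads are consuming the most. The sketch below ranks pods by a usage figure; the pod names, the `cpu_millicores` and `memory_mib` keys, and the values are all hypothetical stand-ins for whatever your exported metrics contain.

```python
def top_consumers(usage, key, n=3):
    """Return the `n` pods with the highest value for the metric `key`."""
    return sorted(usage, key=lambda pod: pod[key], reverse=True)[:n]


# Hypothetical per-pod usage samples.
usage = [
    {"pod": "checkout-7d9f", "cpu_millicores": 250, "memory_mib": 512},
    {"pod": "cart-5c2a", "cpu_millicores": 900, "memory_mib": 256},
    {"pod": "search-9b1e", "cpu_millicores": 120, "memory_mib": 1800},
    {"pod": "report-3f6d", "cpu_millicores": 60, "memory_mib": 128},
]

by_memory = [pod["pod"] for pod in top_consumers(usage, "memory_mib", n=2)]
by_cpu = [pod["pod"] for pod in top_consumers(usage, "cpu_millicores", n=2)]
print("memory:", by_memory)
print("cpu:", by_cpu)
```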
Pulling it all Together
Increasingly, operations teams require a “single pane of glass” for managing resources wherever they exist, even within the data center. Streamlining visibility is a challenge of its own, as IT teams now face the task of unifying observability across multi-cloud platforms. With the help of BindPlane, linking EKS with Stackdriver will help you manage large changes to your system. For example, if you are migrating from one cloud provider to another, the logs and metrics that Stackdriver provides will help you keep your service behavior and performance consistent. (See this comparison of Kubernetes services from each of the cloud platforms.)
Another example where the combined capabilities of EKS and Stackdriver will make your life easier is when you are trying to architect a true multi-cloud platform. Using EKS will help mitigate the technical challenges that arise when trying to run applications in a single environment. Stackdriver will help alert you to maxed-out resources, limited geographic reach, limited resilience, vendor lock-in, and inflexible resources. No matter what your project may be, you will want to choose a toolchain like Stackdriver that supports easy integration of data and observability signals from any source you choose.
To get started with integrating Amazon EKS with Google Stackdriver, sign up for BindPlane. Stackdriver customers can add BindPlane without any additional cost or Blue Medora billing; the metrics consumed are added to your Stackdriver bill. The BindPlane logs service is currently available in beta.