Published 2023-05-07 12:29:49
Guide to Deploying Metrics Server on Linode LKE
TL;DR
There have been many discussions on the Linode LKE forums regarding the installation of the Metrics Server on the platform.
Two common stumbling blocks are configuring insecure TLS for the kubelet connection and working with outdated Helm charts. This article provides a comprehensive guide to help you overcome these challenges and implement the Metrics Server in your Linode LKE cluster.
What is metrics-server in Kubernetes
The Metrics Server is a tool for collecting and exposing basic resource usage metrics from Kubernetes nodes and pods.
Some Kubernetes capabilities depend on it, including the Horizontal Pod Autoscaler and the Kubernetes Dashboard.
Unlike other monitoring tools such as Prometheus, Metrics Server does not store data long-term or provide advanced querying and alerting capabilities. The data collected is stored in memory and not persisted to disk. This means that the collected metrics are only available as long as the Metrics Server is running, and will be lost if the server is restarted or if the metrics are not retrieved in a timely manner.
However, it is easy to install and configure and has a low resource overhead.
Learn more about how Metrics Server works, how to implement it, and what its benefits are.
Implementation
Helm Chart to use
For the Metrics Server, I recommend the chart from the Kubernetes Special Interest Group (SIG). Their projects are maintained by a community of contributors, and ones like ingress-nginx are exceptional.
You can find the Metrics Server chart on the GitHub page: https://github.com/kubernetes-sigs/metrics-server.
By clicking the "Releases" link, you can view the available chart versions.
Another option is to follow the chart releases on Artifact Hub, a central repository of Helm charts: https://artifacthub.io/packages/helm/metrics-server/metrics-server.
There are different ways to implement the Metrics Server in your environment, but I will focus on two methods.
At devoriales.com, we use Kubernetes as the runtime platform for our workloads and leverage CI/CD workflows with Terraform. I'm sharing the Terraform resource we use for Metrics Server, which you can use as a reference.
Manual helm installation
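This method assumes that you have the Helm CLI installed on your machine. It also assumes that you are authenticated against the cluster and authorized to perform this action.
# Add the kubernetes-sigs chart repository recommended above, then install the chart from it
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm repo update
helm upgrade --install metrics-server metrics-server/metrics-server \
  --create-namespace --namespace metrics-server \
  --set apiService.create=true \
  --set 'args={--kubelet-insecure-tls,--kubelet-preferred-address-types=InternalIP}'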
Terraform
Here is an example of a similar implementation via Terraform (this time, the Metrics Server is deployed in the kube-system namespace):
resource "helm_release" "metrics_server" {
  name       = "metrics-server"
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  chart      = "metrics-server"
  version    = "3.9.0"
  namespace  = "kube-system"

  set {
    name  = "apiService.create"
    value = "true"
  }

  set {
    name  = "args[0]"
    value = "--kubelet-insecure-tls"
  }

  set {
    name  = "args[1]"
    value = "--kubelet-preferred-address-types=InternalIP"
  }
}
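After either installation method, it can help to confirm that the rollout has finished before running kubectl top. The sketch below wraps two standard kubectl commands in a helper. The function name wait_for_metrics_server and its default arguments are illustrative and not part of the chart, and it assumes the release created a Deployment named metrics-server.

```shell
# Sketch: wait until Metrics Server is rolled out and its APIService is available.
# Assumption: the Helm release created a Deployment named "metrics-server".
wait_for_metrics_server() {
  local namespace="${1:-kube-system}"   # namespace used by the Helm release
  local timeout="${2:-120s}"            # how long each check may take
  kubectl -n "$namespace" rollout status deployment/metrics-server --timeout="$timeout" &&
    kubectl wait --for=condition=Available apiservice/v1beta1.metrics.k8s.io --timeout="$timeout"
}
```

For the manual Helm installation above, this could be called as wait_for_metrics_server metrics-server. For the Terraform example, use wait_for_metrics_server kube-system.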
Explanation of Arguments
- apiService.create: A boolean value that determines whether the chart creates the APIService object that registers the metrics API with the Kubernetes API server.
You can verify that the metrics-server API service, called v1beta1.metrics.k8s.io, has been created:
kubectl get apiservice
NAME                                   SERVICE                      AVAILABLE   AGE
...
v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        47m
v1beta1.storage.k8s.io                 Local                        True        3h35m
v1beta2.flowcontrol.apiserver.k8s.io   Local                        True        3h35m
v1beta3.flowcontrol.apiserver.k8s.io   Local                        True        3h35m
v2.autoscaling                         Local                        True        3h35m
- --kubelet-insecure-tls: Disables TLS certificate verification between the metrics-server and the kubelet API. This is necessary because Linode currently does not provide signed certificates for internal IP access to nodes. Metrics Server needs to access the kubelet API via the internal IP, hence the need to disable TLS verification.
Here is an example of why this is needed. Let's check the nodes in the cluster with kubectl get nodes -o wide:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
lke106889-159726-645753abe0c9 Ready <none> 124m v1.26.3 192.168.153.191 172.104.149.137 Debian GNU/Linux 11 (bullseye) 5.10.0-21-cloud-amd64 containerd://1.6.19
lke106889-159726-645753ac3fc9 Ready <none> 124m v1.26.3 192.168.167.150 139.162.138.154 Debian GNU/Linux 11 (bullseye) 5.10.0-21-cloud-amd64 containerd://1.6.19
lke106889-159726-645753ac9fb9 Ready <none> 123m v1.26.3 192.168.167.32 172.104.225.199 Debian GNU/Linux 11 (bullseye) 5.10.0-21-cloud-amd64 containerd://1.6.19
We cannot reach the nodes by their hostnames from outside the cluster, since these are internal hostnames. Instead, we can try the external IP address.
The following is a health check endpoint of the kubelet, the agent responsible for managing and interacting with the containers on a Kubernetes node.
curl -k https://172.104.149.137:10250/healthz
Unauthorized <<<< we did not present a client certificate or token to authenticate with
We can see that we can reach the node, but since we did not present valid credentials, the request is rejected as unauthorized.
We can at least check the cert used by kubelet:
Linux:
openssl s_client 172.104.149.137:10250
macOS:
openssl s_client -connect 172.104.149.137:10250
The IP address is the external IP of the node lke106889-159726-645753abe0c9.
Output:
CONNECTED(00000003)
depth=1 CN = lke106889-159726-645753abe0c9-ca@1683444762
verify error:num=19:self signed certificate in certificate chain
verify return:0
write W BLOCK
---
Certificate chain
0 s:/CN=lke106889-159726-645753abe0c9@1683444763
i:/CN=lke106889-159726-645753abe0c9-ca@1683444762
1 s:/CN=lke106889-159726-645753abe0c9-ca@1683444762
i:/CN=lke106889-159726-645753abe0c9-ca@1683444762
As we can see, the certificate identifies the node only by its internal name, not by an IP address, and it is signed by a self-signed CA.
That is the reason why the metrics-server needs to run with --kubelet-insecure-tls.
The certificate presented by the kubelet does not include the IP address that metrics-server connects to. This causes TLS certificate validation to fail, and as a result, the connection is not established.
The --kubelet-insecure-tls flag tells metrics-server to skip certificate verification for the kubelet API endpoints. By default, Linode does not provide signed certificates for the nodes' internal IP addresses. I like Linode's services, but in this case there is room for improvement (as of now).
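To check which names a kubelet certificate is actually valid for, you can print its Subject Alternative Names. This is a minimal sketch: the helper name kubelet_cert_sans is hypothetical, and the -ext option assumes OpenSSL 1.1.1 or newer.

```shell
# Sketch: print the Subject Alternative Names of the certificate served by a kubelet.
# Assumption: OpenSSL 1.1.1 or newer (needed for the -ext option).
kubelet_cert_sans() {
  local host="$1"            # node IP address or hostname
  local port="${2:-10250}"   # kubelet port, as used in the examples above
  # Fetch the server certificate, then decode only the SAN extension.
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null |
    openssl x509 -noout -ext subjectAltName
}
```

For example, kubelet_cert_sans 172.104.149.137 queries the node inspected above. If the IP address that metrics-server connects to is not among the listed names, verification fails unless --kubelet-insecure-tls is set.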
How does Metrics Server collect data?
Metrics Server gathers CPU and memory usage metrics for each node and pod from the kubelet running on every node (via the Summary API in older releases).
It scrapes these metrics periodically, aggregates them in memory, and exposes them as Kubernetes API resources through the Kubernetes API server under metrics.k8s.io. This gives users and components such as the Horizontal Pod Autoscaler a current view of the cluster's resource utilization, which is crucial for optimizing performance and improving resource allocation.
Here is an example command to retrieve the Metrics Server metrics using curl:
curl -k https://<Kubernetes_API_server>/apis/metrics.k8s.io/v1beta1/nodes
If you want to collect metrics about a specific node:
curl -k https://<Kubernetes_API_server>/apis/metrics.k8s.io/v1beta1/nodes/<node_name>
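The API server normally rejects unauthenticated requests, so the curl calls above typically need credentials. As an alternative sketch, kubectl get --raw sends the same request using the credentials from your kubeconfig. The helper name node_metrics_raw is illustrative.

```shell
# Sketch: query the metrics API through kubectl, which handles authentication.
# With no argument it returns all nodes; with a node name it returns that node only.
node_metrics_raw() {
  local node="$1"   # optional node name
  kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes${node:+/$node}"
}
```

Calling node_metrics_raw returns the metrics for all nodes as JSON, and node_metrics_raw <node_name> returns them for a single node.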
kubectl equivalent
Get CPU and memory usage for all pods in the cluster:
kubectl top pods --all-namespaces
Get CPU and memory usage for all nodes in the cluster:
kubectl top nodes
Get CPU and memory usage for a specific pod:
kubectl top pods <pod-name> -n <namespace>
Get CPU and memory usage for all pods in a specific namespace:
kubectl top pods -n <namespace>
How to understand the metrics?
The kubectl top nodes command provides an overview of the CPU and memory usage of all nodes in the Kubernetes cluster. It shows CPU usage in cores (usually expressed in millicores, such as 37m) and memory usage in bytes (usually expressed in mebibytes, such as 1110Mi). This command can be useful to monitor the overall resource utilization of the cluster and identify any nodes that might be experiencing resource constraints.
Output:
kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
lke106889-159726-645753abe0c9 37m 3% 1110Mi 58%
lke106889-159726-645753ac3fc9 37m 3% 941Mi 49%
lke106889-159726-645753ac9fb9 176m 17% 980Mi 52%
- CPU(cores): The amount of CPU cores being used by the node
- CPU%: The percentage of CPU cores being used by the node
- MEMORY(bytes): The amount of memory being used by the node
- MEMORY%: The percentage of memory being used by the node
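The column layout above lends itself to simple filtering. As a rough sketch, the following helper (the name high_memory_nodes is made up for this example) reads kubectl top nodes output and prints the nodes at or above a given MEMORY% threshold. It assumes the five-column layout with a header row shown above.

```shell
# Sketch: list nodes whose MEMORY% is at or above a threshold.
# Reads `kubectl top nodes` output (including the header row) on stdin.
high_memory_nodes() {
  local threshold="$1"   # percentage, e.g. 50
  awk -v t="$threshold" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                     # "58%" -> "58"
    if (pct + 0 >= t + 0) print $1, $5    # node name and memory percentage
  }'
}
```

Usage: kubectl top nodes | high_memory_nodes 50. With the sample output above, this would list the first and third node (58% and 52%).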
For those new to Kubernetes, millicores may be unfamiliar: a millicore is one-thousandth of a core. To illustrate, a node with 4 cores has 4000 millicores (4 cores x 1000 millicores per core). As a result, if a node uses 37m of CPU, it is utilizing 37/1000 = 0.037 cores, or 3.7% of one core.
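The conversion can be sketched as a small shell helper. The function name millicores_to_cores is an assumption for illustration, and it only handles values with the m suffix as printed by kubectl top.

```shell
# Sketch: convert a millicore value such as "37m" into cores.
# Only values with a trailing "m" are handled correctly.
millicores_to_cores() {
  local value="${1%m}"   # strip the trailing "m": "37m" -> "37"
  LC_ALL=C awk -v m="$value" 'BEGIN { printf "%.3f\n", m / 1000 }'
}

# Example: millicores_to_cores 37m prints 0.037
```

Applied to the sample output above, 176m corresponds to 0.176 cores.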
Summary
When it comes to implementing the Metrics Server in Linode Kubernetes Engine (LKE), there are some obstacles to overcome. Outdated Helm charts (a problem that is not specific to LKE) and TLS can make the process challenging. However, the Kubernetes Special Interest Group (SIG) provides a chart that is maintained by a community of contributors, which I strongly recommend.
To install the Metrics Server, you have to disable TLS verification between the Metrics Server and the kubelet API by setting the --kubelet-insecure-tls flag. This is necessary in Linode because there is no signed certificate available when accessing the node via internal IP. We hope this will change in the future.
About the Author
Aleksandro Matejic, a Cloud Architect, began working in the IT industry over 21 years ago as a technical specialist, right after his studies. Since then, he has worked in various companies and industries in various system engineer and IT architect roles. He currently works on designing Cloud solutions, Kubernetes, and other DevOps technologies.
In his spare time, Aleksandro works on different development projects such as developing devoriales.com, a blog and learning platform launching in 2022/2023. In addition, he likes to read and write technical articles about software development and DevOps methods and tools.
You can contact Aleksandro by visiting his LinkedIn Profile