Deploying Prometheus Multi-Cluster Monitoring Using Prometheus Agent Mode
In the previous post, I wrote about Prometheus multi-cluster monitoring and how running Prometheus in agent mode helps create a single pane of glass for monitoring multiple Kubernetes clusters. If you haven't read it yet, please do so before this hands-on post. In this post, we are going to deploy Prometheus in agent mode along with a Prometheus global view and test how they work together in action. For this tutorial, you need a Kubernetes cluster and two separate namespaces: monitoring-global and monitoring. Are you ready to run?!
Deploy the Global view Prometheus
First, make sure that you have already created both required namespaces; otherwise, create them with the following command:
kubectl create ns monitoring-global && kubectl create ns monitoring
All the files I use to deploy on the test cluster are available to download or clone from my GitHub.
1- Deploy a config map to be used in the Prometheus deployment. This config map creates prometheus.rules, which defines the rule statements, and prometheus.yml, which is the main configuration file.
kubectl apply -f prometheus-global-view/config-map-global.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-global-conf
  labels:
    name: prometheus-global-conf
  namespace: monitoring-global
data:
  prometheus.rules: |-
    groups:
      - name: devopscube demo alert
        rules:
          - alert: High Pod Memory
            expr: sum(container_memory_usage_bytes) > 1
            for: 1m
            labels:
              severity: slack
            annotations:
              summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
        - scheme: http
          static_configs:
            - targets:
                - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['127.0.0.1:9090']
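If you have promtool available locally (it ships with the Prometheus release binaries), you can optionally sanity-check the configuration before applying it. A minimal sketch, assuming you saved the prometheus.yml content above to a local file named prometheus-global.yml:
$ promtool check config prometheus-global.yml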
2- Deploy the Prometheus deployment. Since this is only for testing and not much data will be stored here, we use an emptyDir volume. We also enable the remote-write receiver (--web.enable-remote-write-receiver), which allows this Prometheus to accept remote-write requests from other Prometheus servers.
kubectl apply -f prometheus-global-view/prometheus-global-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring-global
  labels:
    app: prometheus-global
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-global
  template:
    metadata:
      labels:
        app: prometheus-global
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--storage.tsdb.retention.time=12h"
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
            - "--web.enable-remote-write-receiver"
            - "--web.enable-lifecycle"
          ports:
            - containerPort: 9090
          resources:
            requests:
              cpu: 500m
              memory: 500M
            limits:
              cpu: 1
              memory: 1Gi
          volumeMounts:
            - name: prometheus-global-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-global-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-global-conf
        - name: prometheus-storage-volume
          emptyDir: {}
If you list the running pods, you will see the newly created pod.
$ kubectl get pods -n monitoring-global
NAME READY STATUS RESTARTS AGE
prometheus-deployment-6d84cb9b8b-5r2zb 1/1 Running 0 2m35s
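You can also check the pod logs to confirm the server started cleanly; on a healthy start, Prometheus logs a line similar to "Server is ready to receive web requests.":
$ kubectl logs deploy/prometheus-deployment -n monitoring-global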
3- Create a headless service to be used as the remote-write endpoint by the agent-mode Prometheus running in the monitoring namespace.
kubectl apply -f prometheus-global-view/headless-service.yml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-global-headless-service
  namespace: monitoring-global
spec:
  clusterIP: None
  selector:
    app: prometheus-global
  ports:
    - protocol: TCP
      port: 9090
      targetPort: 9090
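If you want to verify that the headless service resolves inside the cluster, you can run a throwaway pod and look it up. A quick check, assuming the busybox image is acceptable in your cluster:
$ kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup prometheus-global-headless-service.monitoring-global.svc.cluster.local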
Now that the headless service is created, you can simply forward a local port to it and access the running Prometheus global view.
$ kubectl port-forward svc/prometheus-global-headless-service 9090:9090 -n monitoring-global
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
Call the health check endpoint to make sure that everything is working, or browse [http://localhost:9090](http://localhost:9090) in your browser.
$ curl http://localhost:9090/-/healthy
Prometheus Server is Healthy.
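You can also confirm that the remote-write receiver is actually enabled by querying the runtime flags endpoint; web.enable-remote-write-receiver should be reported as true:
$ curl -s http://localhost:9090/api/v1/status/flags | grep remote-write-receiver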
Deploy the Agent-mode Prometheus
Now it is time to deploy Prometheus in agent mode and remote-write its metrics to the global-view instance.
1- Create a ClusterRole and a ClusterRoleBinding for Prometheus to be able to scrape some Kubernetes metrics.
kubectl apply -f prometheus-agent/clusterrole.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: default
    namespace: monitoring
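To double-check that the binding took effect, you can ask the API server whether the default service account in the monitoring namespace may list pods; kubectl should answer yes:
$ kubectl auth can-i list pods --as=system:serviceaccount:monitoring:default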
2- We need to create a config map to be used as the agent-mode Prometheus server configuration. Here we add the remote_write endpoint pointing at the headless service we created earlier.
kubectl apply -f prometheus-agent/config-map.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
    remote_write:
      - url: 'http://prometheus-global-headless-service.monitoring-global.svc.cluster.local:9090/api/v1/write'
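The remote_write block also accepts optional queue tuning fields if the defaults do not fit your workload. A minimal sketch; the numbers below are illustrative assumptions, not recommendations:
remote_write:
  - url: 'http://prometheus-global-headless-service.monitoring-global.svc.cluster.local:9090/api/v1/write'
    queue_config:
      capacity: 2500             # samples buffered per shard
      max_shards: 50             # upper bound on parallel senders
      max_samples_per_send: 500  # batch size per remote-write request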
3- Now that all the prerequisites for the agent-mode deployment are in place, we can finally deploy the agent-mode Prometheus itself. We enable agent mode by passing the --enable-feature=agent argument.
kubectl apply -f prometheus-agent/prometheus-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
  labels:
    app: prometheus-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--web.enable-lifecycle"
            - "--enable-feature=agent"
          ports:
            - containerPort: 9090
          resources:
            requests:
              cpu: 500m
              memory: 500M
            limits:
              cpu: 1
              memory: 1Gi
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
With this last piece deployed, Prometheus is now running in agent mode, and you can confirm it by forwarding its pod port to your localhost.
$ kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
prometheus-deployment-fd7f6557c-tvsjj 1/1 Running 0
$ kubectl port-forward prometheus-deployment-fd7f6557c-tvsjj 9080:9090 -n monitoring
Forwarding from 127.0.0.1:9080 -> 9090
Forwarding from [::1]:9080 -> 9090
And now, if you browse localhost:9080 in your browser, you will see a page showing that Prometheus is running in agent mode.
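You can also hit the agent's health endpoint through the forwarded port to confirm it is up:
$ curl http://localhost:9080/-/healthy
Prometheus Server is Healthy.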
And now, if you port-forward the global view again and query the available metrics, you will see the metrics shipped from the agent accessible there.
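For example, with the global-view port-forward from earlier still running on port 9090, you can query the HTTP API for the up metric; the series scraped by the agent's kubernetes-apiservers job should appear in the result:
$ curl -s 'http://localhost:9090/api/v1/query?query=up'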
Conclusion
Prometheus in agent mode is useful when you want to monitor multiple clusters through a single pane of glass. But that is not the only use case: these days, many companies are moving toward edge computing. It is the era of IoT, self-driving cars, and many other models in which you may deploy a Kubernetes cluster on a resource-bounded device.
Who am I?
I am Ehsan, a passionate site reliability engineer and cloud solutions architect working for Techspire Netherlands. I am dedicated to helping businesses smooth out their system engineering and security operations, improving the availability, scalability, and QoS of their services using infrastructure-as-code concepts, a wide range of monitoring and logging tools, Linux-based operating systems, and my deep knowledge of networking concepts and big-data analytical tools.