Monitoring on Kubernetes using Prometheus

Chandresh Pancholi
Engineer at Arvind Internet (AI)

Kubernetes at Arvind Internet
● Our infrastructure is deployed on AWS
● Kubernetes minions (nodes) run on m4.xlarge instances
● Kubernetes version 1.7.5 in QA/Prod, 1.8.3 in Pre-prod
● QA/Dev, Pre-Prod and Production all run on Kubernetes
● Total pods ⇒ more than 350 (QA/Dev, Prod)
● Total services ⇒ more than 200 (QA/Dev, Prod)
● Mongo, MySQL, Redis and Hazelcast run in Kubernetes in QA/Dev

What is Kubernetes?
Kubernetes is an open-source container orchestration engine and an
abstraction layer for managing full-stack operations of hosts and containers.
It handles deployment, scaling, load balancing and rolling updates of
containerized applications across multiple hosts within a cluster. Kubernetes
makes sure that your applications are in the desired state.
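
The "desired state" idea can be illustrated with a toy reconciliation step. This is only a sketch of the concept, not Kubernetes code: the function name `reconcile` and the action strings are made up for illustration.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[str]:
    """Return the actions needed to move the observed state to the desired state.

    Hypothetical helper: real controllers talk to the Kubernetes API instead
    of returning strings.
    """
    actions: List[str] = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods are running: start the missing ones.
        actions += ["start new pod"] * diff
    elif diff < 0:
        # Too many pods are running: stop the surplus ones at the end of the list.
        actions += [f"stop {name}" for name in running_pods[desired_replicas:]]
    return actions


print(reconcile(3, ["hello-1"]))
print(reconcile(1, ["hello-1", "hello-2"]))
```

A controller would run such a step repeatedly, so that any drift from the desired state is corrected on the next pass.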

Master: The machine that controls Kubernetes nodes. This is where all task assignments
originate.
Node: These machines perform the requested, assigned tasks. The Kubernetes master
controls them.
Deployments: Provide declarative updates for Pods and ReplicaSets.
Pod: A group of one or more containers deployed to a single node. All containers in a pod
share an IP address, IPC, hostname, and other resources. Pods abstract network and
storage away from the underlying container. This lets you move containers around the
cluster more easily.

Service: This decouples work definitions from the pods. Kubernetes service
proxies automatically route service requests to the right pod, no matter where
it moves to in the cluster or even if it has been replaced.
ConfigMaps: ConfigMaps allow you to decouple configuration artifacts from
image content to keep containerized applications portable.
Secrets: Secrets are intended to hold sensitive information, such as passwords,
OAuth tokens, and SSH keys. Putting this information in a Secret is safer and
more flexible than putting it verbatim in a pod definition or in a Docker image.

Monitoring at AI (earlier)
[Diagram: earlier monitoring setup. Recoverable labels: EC2, Sensu, Kubernetes, µServices]

Cons
1. Multiple monitoring systems
2. Difficulty in troubleshooting
3. Additional infrastructure cost to support three monitoring systems
4. Graphite doesn't provide pod-level application metrics
5. Infra team needs to understand both Sensu and Prometheus alerting
6. Application metrics are single-dimensional, e.g. (a.b.c.d.99)
7. Grafana alerting for application metrics

Prometheus
● It was developed at SoundCloud by ex-Googlers
● Prometheus is a close cousin of Kubernetes
● A multi-dimensional data model with time series data identified by metric
name and key/value pairs
● Alerting and graphing are unified, using the same language
● Time series collection happens via a pull model over HTTP
● Targets are discovered via service discovery or static configuration
● Multiple exporters are available to expose AWS EC2, Kafka, Mongo, Cassandra,
RMQ and Redis metrics
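
To make the multi-dimensional data model concrete, here is a small sketch that renders one sample as a metric name plus key/value labels, roughly the way a scrape target presents it over HTTP. The metric name `http_requests_total` and the helper `render_sample` are assumptions for illustration; escaping of special characters in label values is left out.

```python
def render_sample(name: str, labels: dict, value: float) -> str:
    """Render one sample as `name{key="value",...} value`.

    Hypothetical helper; label values are assumed to contain no quotes,
    backslashes or newlines, so no escaping is done.
    """
    pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{pairs}}} {value}"


line = render_sample(
    "http_requests_total",  # assumed metric name, not from the slides
    {"namespace": "default", "pod": "hello-946046218-397x2"},
    42,
)
print(line)
```

Each distinct combination of label values is its own time series, which is what replaces the single dotted path (a.b.c.d.99) of the older setup.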

Sample metrics
{endpoint="http",instance="100.110.140.82:8080",job="hello",namespace="default",pod="hello-946046218-397x2",service="hello-world"}
{endpoint="http",instance="100.98.66.79:8080",job="hello",namespace="default",pod="hello-946046218-5h39f",service="hello-world"}
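
As a rough illustration of how these label sets can be sliced, the sketch below parses the two sample series into dictionaries and picks out the pods behind one service. The helper `parse_labels` is hypothetical and uses a simple regular expression that assumes label values contain no escaped quotes.

```python
import re

# Matches key="value" pairs; assumes no escaped quotes inside values.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_labels(series: str) -> dict:
    """Turn a `{key="value",...}` label set into a dict."""
    return dict(LABEL_RE.findall(series))


samples = [
    '{endpoint="http",instance="100.110.140.82:8080",job="hello",'
    'namespace="default",pod="hello-946046218-397x2",service="hello-world"}',
    '{endpoint="http",instance="100.98.66.79:8080",job="hello",'
    'namespace="default",pod="hello-946046218-5h39f",service="hello-world"}',
]

parsed = [parse_labels(s) for s in samples]
# Select every pod that backs the hello-world service.
pods = [p["pod"] for p in parsed if p["service"] == "hello-world"]
print(pods)
```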

node_exporter
Prometheus exporter for hardware and OS metrics exposed by *NIX kernels,
written in Go with pluggable metric collectors.

Metrics
● CPU (system, user, nice, iowait, steal, idle, irq, softirq, guest)
● Memory (Apps, Buffers, Cached, Free, Slab, SwapCached, PageTables, VmallocUser, Swap, Committed, Mapped,
Active, Inactive)
● Load
● Disk Space Used in percent
● Disk Utilization per Device
● Disk IOs per Device (read, write)
● Disk Throughput per Device (read, write)
● Context Switches
● Network Traffic (In, Out)
● Netstat (Established)
● UDP stats (InDatagrams, InErrors, OutDatagrams, NoPorts)
● Conntrack
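
As an example of how one of these figures is derived, "Disk Space Used in percent" can be computed from the filesystem size and the available bytes. The sketch below assumes the two inputs come from node_exporter filesystem gauges (named `node_filesystem_size` and `node_filesystem_avail` in older releases; treat the exact names as an assumption for your version).

```python
def disk_used_percent(size_bytes: float, avail_bytes: float) -> float:
    """Percentage of a filesystem that is in use.

    Inputs are assumed to be the filesystem size and available-space gauges
    for the same device and mountpoint.
    """
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return 100.0 * (1.0 - avail_bytes / size_bytes)


# Hypothetical 100 GiB volume with 25 GiB still available.
GIB = 1024 ** 3
print(disk_used_percent(100 * GIB, 25 * GIB))
```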

AWS EC2 config
Relabelling Tags
__meta_ec2_availability_zone    Availability zone
__meta_ec2_instance_id          Instance ID
__meta_ec2_instance_state       Instance state
__meta_ec2_instance_type        Instance type
__meta_ec2_private_ip           Private IP
__meta_ec2_public_dns_name      Public DNS name
__meta_ec2_public_ip            Public IP
__meta_ec2_tag_<tagkey>         Custom tag key
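
These `__meta_` labels exist only while a target is being relabelled; labels whose names start with `__` are dropped afterwards, so anything worth keeping has to be copied to a regular label. The sketch below imitates that step in plain Python. It is a simulation of the idea, not Prometheus code, and the target label names and the discovered values are made up for illustration.

```python
# Which discovery labels to keep, and under what name (hypothetical choice).
KEEP = {
    "__meta_ec2_instance_id": "instance_id",
    "__meta_ec2_availability_zone": "availability_zone",
    "__meta_ec2_tag_Name": "name",
}


def relabel(discovered: dict) -> dict:
    """Copy selected meta labels to permanent labels, then drop `__` labels."""
    target = {k: v for k, v in discovered.items() if not k.startswith("__")}
    for source, dest in KEEP.items():
        if source in discovered:
            target[dest] = discovered[source]
    return target


# Example values are invented, not taken from a real instance.
discovered = {
    "__meta_ec2_instance_id": "i-0123456789abcdef0",
    "__meta_ec2_availability_zone": "ap-south-1a",
    "__meta_ec2_private_ip": "10.0.0.12",
    "__meta_ec2_tag_Name": "kafka-1",
    "job": "ec2",
}
final = relabel(discovered)
print(final)
```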

Approach #1 - Prometheus on EC2
[Diagram: Prometheus running on EC2. Recoverable labels: EC2, Kubernetes, node ex, µServices, AWS EC2]

#1. Getting EC2 server metrics is easy and straightforward, because Prometheus
provides EC2 service discovery.
#2. Getting Kubernetes and application metrics is very complex. It takes 300+
lines of configuration to support just the Kubernetes metrics.
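
For point #1, a scrape job with EC2 discovery needs only a few fields. The sketch below builds such a job as a Python dictionary that mirrors the structure of the Prometheus configuration file and prints it for inspection. The job name and region are placeholders, and the port assumes node_exporter on its default port 9100; check the field names against the configuration reference for your Prometheus version before using them.

```python
import json


def ec2_scrape_config(region: str, port: int = 9100) -> dict:
    """Build one scrape job that discovers targets via EC2 (illustrative only)."""
    return {
        "job_name": "ec2-node-exporter",  # placeholder job name
        "ec2_sd_configs": [{"region": region, "port": port}],
        "relabel_configs": [
            # Copy the EC2 "Name" tag and the zone into permanent labels.
            {"source_labels": ["__meta_ec2_tag_Name"], "target_label": "instance"},
            {
                "source_labels": ["__meta_ec2_availability_zone"],
                "target_label": "availability_zone",
            },
        ],
    }


# "ap-south-1" is an assumed region, not taken from the slides.
config = {"scrape_configs": [ec2_scrape_config("ap-south-1")]}
print(json.dumps(config, indent=2))
```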

Approach #2 - Use Prometheus Operator

What is the Prometheus Operator?
The Prometheus Operator creates, configures, and manages Prometheus
monitoring instances. It automatically generates monitoring target
configurations based on familiar Kubernetes label queries.
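
A "label query" here means an equality match on labels: every key/value pair in the selector must be present on the object. The sketch below shows that matching rule in plain Python; the service names and labels are invented, and this is not the Operator's implementation.

```python
def matches(selector: dict, labels: dict) -> bool:
    """True if every key/value pair of the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical services and their labels.
services = {
    "hello-world": {"app": "hello", "tier": "backend"},
    "billing": {"app": "billing", "tier": "backend"},
    "frontend": {"app": "web", "tier": "frontend"},
}

selector = {"tier": "backend"}
selected = sorted(name for name, labels in services.items() if matches(selector, labels))
print(selected)
```

Any service that later gains the matching label would be picked up by the same query, with no change to the monitoring configuration itself.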

ServiceMonitor Custom Resource Definition (CRD)
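
The manifest from the original slide is not recoverable, so the following is only a sketch of what a ServiceMonitor object can look like, built as a Python dictionary and printed as JSON (kubectl accepts JSON manifests as well as YAML). The `monitoring.coreos.com/v1` API version, the label values and the 30s interval are assumptions; older Operator releases used a different API version.

```python
import json


def service_monitor(name: str, match_labels: dict, port_name: str,
                    interval: str = "30s") -> dict:
    """Build a ServiceMonitor manifest (illustrative, values are assumptions)."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # Services carrying these labels are selected for scraping.
            "selector": {"matchLabels": match_labels},
            # `port` refers to a named port on the Service.
            "endpoints": [{"port": port_name, "interval": interval}],
        },
    }


manifest = service_monitor("hello", {"app": "hello-world"}, "http")
print(json.dumps(manifest, indent=2))
```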

Prometheus Custom Resource Definition (CRD)
