An Introduction to Using PostgreSQL
with Docker & Kubernetes
JONATHAN S. KATZ
JULY 19, 2018
LOS ANGELES POSTGRESQL USER GROUP
About Crunchy Data
2
• Leading provider of trusted open source PostgreSQL and
PostgreSQL-related technologies, support, and training
to enterprises

• We're hiring!
• crunchydata.com

• @crunchydb
• Director of Communications, Crunchy Data

• Previously: Engineering leadership in startups

• Longtime PostgreSQL community contributor

• Advocacy & various committees for PGDG

• @postgresql + .org content

• Director, PgUS

• Co-Organizer, NYCPUG

• Conference organization + speaking

• @jkatz05
About Me
3
• Containers: A Brief History

• Containers + PostgreSQL

• Setting up PostgreSQL with Containers

• Deploying! - Container Orchestration

• Look Ahead: Trends in the Container World
Outline
4
• Containers are processes that
encapsulate all the requirements to
execute an application

• Sandbox for applications, similar to
a virtual machine but with
increased density on a single host
What Are Containers?
5
Source: Docker
• Container Image - The file, typically pulled from a registry, from which a container is started

• Container Engine - Prepares a container to be executed by the container
runtime by collecting container images, accepting user input, preparing
mount points, etc. Examples: Docker, CRI-O, rkt, LXD

• Container Runtime - Takes information passed from the container engine and
sets up the containerized process. The Open Container Initiative (OCI) is helping to
standardize on runc

• Container - The runtime instantiation of a Container Image, i.e. a process!
Container Glossary
6
Source: https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction/
• Lightweight
• Compared to virtual machines, containers use less disk, RAM, and CPU

• Sandboxed
• Container runtime is isolated from other processes

• Portability
• Containers can be run on different platforms as long as a container engine is available

• Convenience

• Requirements for running applications bundled together

• Prevents messy dependency overlaps
Why Containers?
7
Example: Basic Web Application
8
Example: Production Setup for Web Application
9
Example: Upgrading a Web Application
10
• Containers provide several advantages to running PostgreSQL:

• Setup & distribution for developer environments

• Ease of packaging extensions & minor upgrades

• Separate out secondary applications (monitoring, administration)

• Automation and scale for provisioning and creating replicas, backups
Containers & PostgreSQL
11
• Containers also introduce several challenges:

• Administrator needs to understand and select appropriate storage
options

• Configuration for individual database specifications and user access

• Managing 100s - 1000s of containers requires appropriate
orchestration (more on that later)

• Still a database within the container; standard DBA tuning applies

• However, these are challenges you will find in most database environments
Containers & PostgreSQL
12
• We will use the Crunchy Container Suite

• PostgreSQL (+ PostGIS): our favorite database; option to add our favorite geospatial extension

• pgpool + pgbouncer: connection pooling, load balancing

• pgbackrest: terabyte-scale backup management

• Monitoring: Prometheus + export

• Scheduling: "crunchy-dba"

• pgadmin4: UX-driven management

• Open source!

• Apache 2.0 license

• Support for Docker 1.12+, Kubernetes 1.5+

• Actively maintained and updated
Getting Started With Containers & PostgreSQL
13
https://github.com/CrunchyData/crunchy-containers
Getting Started With Containers & PostgreSQL
14
Demo: Creating & Working With Containerized PostgreSQL
15
mkdir postgres
cd postgres
docker volume create --driver local --name=pgvolume
docker network create --driver bridge pgnetwork
cat << EOF > pg-env.list
PG_MODE=primary
PG_PRIMARY_USER=postgres
PG_PRIMARY_PASSWORD=password
PG_DATABASE=whales
PG_USER=jkatz
PG_PASSWORD=password
PG_ROOT_PASSWORD=password
PG_PRIMARY_PORT=5432
EOF
docker run --publish 5432:5432 \
  --volume=pgvolume:/pgdata \
  --env-file=pg-env.list \
  --name="postgres" \
  --hostname="postgres" \
  --network="pgnetwork" \
  --detach \
  crunchydata/crunchy-postgres:centos7-10.4-2.0.0
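Once the container is up, a quick way to confirm it accepts connections is to connect with psql from the host. The sketch below builds a connection URI from the values in pg-env.list. The helper names env_value and pg_uri_from_env are hypothetical (they are not part of the Crunchy containers), and the sketch assumes the values contain no characters that would need URL-encoding.

```shell
#!/bin/sh
# Hypothetical helpers: build a PostgreSQL connection URI from the
# KEY=VALUE entries in the demo's pg-env.list.

# Print the value of one key from an env file.
env_value() {
  sed -n "s/^$2=//p" "$1"
}

# Usage: pg_uri_from_env ENV_FILE [HOST]
# HOST defaults to localhost because the demo publishes port 5432 on the host.
pg_uri_from_env() {
  env_file="$1"
  host="${2:-localhost}"
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$(env_value "$env_file" PG_USER)" \
    "$(env_value "$env_file" PG_PASSWORD)" \
    "$host" \
    "$(env_value "$env_file" PG_PRIMARY_PORT)" \
    "$(env_value "$env_file" PG_DATABASE)"
}
```

With the psql client installed on the host, this could be used as: psql "$(pg_uri_from_env pg-env.list)" -c 'SELECT version();'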
Demo: Adding in pgadmin4
16
docker volume create --driver local --name=pga4volume
cat << EOF > pgadmin4-env.list
PGADMIN_SETUP_EMAIL=jonathan.katz@crunchydata.com
PGADMIN_SETUP_PASSWORD=securepassword
SERVER_PORT=5050
EOF
docker run --publish 5050:5050 \
  --volume=pga4volume:/var/lib/pgadmin \
  --env-file=pgadmin4-env.list \
  --name="pgadmin4" \
  --hostname="pgadmin4" \
  --network="pgnetwork" \
  --detach \
  crunchydata/crunchy-pgadmin4:centos7-10.4-2.0.0
Demo: Adding Monitoring
17
cat << EOF > collect-env.list
DATA_SOURCE_NAME=postgresql://postgres:password@postgres:5432/postgres?sslmode=disable
EOF
docker run \
  --env-file=collect-env.list \
  --network=pgnetwork \
  --name=collect \
  --hostname=collect \
  --detach crunchydata/crunchy-collect:centos7-10.4-2.0.0
docker volume create --driver local --name=prometheus
cat << EOF > prometheus-env.list
COLLECT_HOST=collect
SCRAPE_INTERVAL=5s
SCRAPE_TIMEOUT=5s
EOF
docker run \
  --publish 9090:9090 \
  --env-file=prometheus-env.list \
  --volume prometheus:/data \
  --network=pgnetwork \
  --name=prometheus \
  --hostname=prometheus \
  --detach crunchydata/crunchy-prometheus:centos7-10.4-2.0.0
docker volume create --driver local --name=grafana
cat << EOF > grafana-env.list
ADMIN_USER=jkatz
ADMIN_PASS=password
PROM_HOST=prometheus
PROM_PORT=9090
EOF
docker run \
  --publish 3000:3000 \
  --env-file=grafana-env.list \
  --volume grafana:/data \
  --network=pgnetwork \
  --name=grafana \
  --hostname=grafana \
  --detach crunchydata/crunchy-grafana:centos7-10.4-2.0.0
1. Set up the metric collector
2. Set up Prometheus to store metrics
3. Set up Grafana to visualize
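With five containers now involved, it helps to check that all of them are actually running. The following is a minimal sketch, not part of the original demo: missing_containers is a hypothetical helper that reads running container names on standard input and prints any of the demo's container names that are absent.

```shell
#!/bin/sh
# Hypothetical helper: report which demo containers are NOT running.
# Reads the names of running containers on stdin, one per line.
missing_containers() {
  running=$(cat)
  for name in postgres pgadmin4 collect prometheus grafana; do
    # -x matches the whole line, so "postgres" does not match "postgres2".
    printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
  done
}
```

It can be fed from Docker with: docker ps --format '{{.Names}}' | missing_containers. No output means every container from the demos is up.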
• Explored what / why / how of containers

• Set up a PostgreSQL 10 instance

• Set up pgadmin4 to manage our PostgreSQL instance

• Set up monitoring to analyze performance of our system

• Of course, the next question naturally is:
Recap
18
How do I manage these things
at scale?
• "Open-source system for
automating deployment, scaling,
and management of containerized
applications."

• Manage the full lifecycle of a
container

• Assists with scheduling, scaling,
failover, high-availability, and more
Kubernetes: Container Orchestration
20
Source: https://kubernetes.io
• Value of Kubernetes increases
exponentially as number of containers
increases

• Due to statefulness of databases,
Kubernetes requires more knowledge to
successfully operate a standard database
workload:

• Avoid scheduling and availability issues
for longer-running database containers

• Data continues to exist even if
container does not
When to Use Kubernetes
21
• Node: A Kubernetes "worker" machine that is able to run pods

• Pod: One or more running containers; the "atomic" unit of Kubernetes

• Service: The access point to a set of Pods

• ReplicaSet: Ensures that a specified number of replica Pods are running at a given time

• Deployment: A controller that ensures all running Pods / ReplicaSets match the desired
state of the execution environment (total number of pods, resources, etc.)

• Persistent Volume (PV): A storage API that enables information to persist after a Pod has
terminated

• Persistent Volume Claim (PVC): Enables a PV to be mounted to a container, includes
information such as amount of storage. Used for dynamic provisioning
Kubernetes Glossary Important for PostgreSQL
22
Source: https://kubernetes.io/docs/reference/glossary/?fundamental=true&storage=true
• Kubernetes provides the gateway to running your own "database-as-a-service":

• Mass-apply database commands:

• Updates

• Backups / Restores

• ACL rule changes

• Scale up / down replicas

• Failover
PostgreSQL in a Kubernetes World
23
• Kubernetes is "turnkey" for stateless applications

• e.g. web servers

• Databases do maintain state: permanent storage

• Persistent Volumes (PV)

• Persistent Volume Claims (PVC)
PostgreSQL in a Kubernetes World
24
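To make the PV / PVC relationship concrete, here is a minimal sketch that writes a manifest for a claim and a Pod that mounts it at /pgdata, reusing the image and environment variables from the Docker demo. The file name, the claim name pgdata-claim, the 1Gi size, and the name=postgres label are illustrative assumptions. A real cluster may also need extra settings (for example a storage class or security context) that are not shown here.

```shell
#!/bin/sh
# Write a hypothetical manifest: one PersistentVolumeClaim plus a Pod using it.
cat << 'EOF' > postgres-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgdata-claim          # assumed name
spec:
  accessModes:
    - ReadWriteOnce           # mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi            # assumed size
---
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  labels:
    name: postgres
spec:
  containers:
    - name: postgres
      image: crunchydata/crunchy-postgres:centos7-10.4-2.0.0
      ports:
        - containerPort: 5432
      env:                    # same settings as pg-env.list in the demo
        - { name: PG_MODE, value: primary }
        - { name: PG_PRIMARY_USER, value: postgres }
        - { name: PG_PRIMARY_PASSWORD, value: password }
        - { name: PG_DATABASE, value: whales }
        - { name: PG_USER, value: jkatz }
        - { name: PG_PASSWORD, value: password }
        - { name: PG_ROOT_PASSWORD, value: password }
        - { name: PG_PRIMARY_PORT, value: "5432" }
      volumeMounts:
        - name: pgdata
          mountPath: /pgdata  # data outlives the Pod via the claim
  volumes:
    - name: pgdata
      persistentVolumeClaim:
        claimName: pgdata-claim
EOF
# To submit it to a cluster: kubectl apply -f postgres-pvc.yaml
```

If the Pod is deleted and recreated with the same claim, the new Pod sees the same data directory, which is the property the slide refers to as permanent storage.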
• Utilizes Operator framework initially launched by CoreOS to
help capture nuances of managing complex applications that
maintain state, e.g. databases

• Allows an administrator to run PostgreSQL-specific
commands to manage database clusters, including:

• Creating / Deleting a cluster (your own DBaaS)

• Scaling up / down replicas

• Failover

• Apply user policies to PostgreSQL instances

• Define what container resources to use (RAM, CPU, etc.)

• Smart pod deployments to nodes

• REST API
Crunchy PostgreSQL Operator
25
https://github.com/CrunchyData/postgres-operator
• Automation: Complex, multi-step DBA tasks
reduced to one-line commands

• Standardization: Many customizations, same
workflow

• Ease-of-Use: Simple CLI; UI in beta

• Scale
• Provision & manage clusters quickly
amongst thousands of instances

• Load balancing, disaster recovery, security
policies, deployment specifications

• Security: Sandboxed environments, RBAC,
mass grant/revoke policies
Why Use An Operator With PostgreSQL?
26
Demo: Perhaps Videos.
27
Demo: Exploring the Operator User Interface
28
Demo (Alternative): Exploring the Operator User Interface
29
• Containers are no longer "new" - orchestration technologies have matured

• Debate with containers + databases: storage & management

• No different than virtual machines + databases

• Databases are still databases: need expertise to manage

• StatefulSets vs. Deployments

• Database deployment automation flexibility

• Deploy your architecture to any number of clouds

• Monitoring: A new frontier
Containerized PostgreSQL: Looking Ahead
30
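One way to picture the StatefulSets vs. Deployments point: a StatefulSet can declare volumeClaimTemplates, so each replica gets its own Persistent Volume Claim and a stable name, whereas Pods in a Deployment share whatever claim the Pod template names. The sketch below is an illustrative assumption rather than the speaker's configuration: it assumes a cluster that supports apps/v1 (Kubernetes 1.9+), a headless Service named postgres, and it omits the container environment variables for brevity.

```shell
#!/bin/sh
# Write a hypothetical StatefulSet manifest with a per-replica volume claim.
cat << 'EOF' > postgres-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres       # assumed headless Service, defined elsewhere
  replicas: 1
  selector:
    matchLabels:
      name: postgres
  template:
    metadata:
      labels:
        name: postgres
    spec:
      containers:
        - name: postgres
          image: crunchydata/crunchy-postgres:centos7-10.4-2.0.0
          ports:
            - containerPort: 5432
          # env: add the PG_* variables from pg-env.list here
          volumeMounts:
            - name: pgdata
              mountPath: /pgdata
  volumeClaimTemplates:       # one claim is created per replica
    - metadata:
        name: pgdata
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi      # assumed size
EOF
# To submit it to a cluster: kubectl apply -f postgres-statefulset.yaml
```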
• Containers + PostgreSQL gives you:

• Easy-to-setup development environments

• Your own production database-as-a-service

• Tools to automate management of 1000s of instances in short order
Conclusion
31
Jonathan S. Katz
jonathan.katz@crunchydata.com
@jkatz05
Thank You!
