Backing up and Restoring Containerized PostgreSQL databases

Using containerized databases in Kubernetes is quick and convenient.  But how can we ensure backups and restores work?  Let's examine this in my favourite chart lately, Sonarqube.

Assumptions:

  1. you have AKS running with Sonarqube (helm install stable/sonarqube)
  2. use you are using the (default) PostgreSQL containerized database

Identify your PostgreSQL pod

Log in as admin and look at Admin/System under Database:

Log into Kubernetes;

$ az login
$ az aks get-credentials --name my-aks-cluster --resource-group my-aks-rg

Get the pods

Get the pods

$ kubectl get pods
NAME                                                       READY     STATUS    RESTARTS   AGE
xxxxxxx-redis-infra-master-0                             1/1       Running   0          106d
xxxxxxx-redis-infra-slave-746fb58f8f-vlbwl               1/1       Running   4          106d
xxxxxxx-sonarqube-db-infra-postgresql-5f6c845645-qknjq   1/1       Running   0          43d
xxxxxxx-sonarqube-db-infra-sonarqube-64d465d5bb-4n74t    0/1       Evicted   0          43d
xxxxxxx-sonarqube-db-infra-sonarqube-64d465d5bb-lt5c9    1/1       Running   0          1d
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7      1/1       Running   0          43d
xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc       1/1       Running   1          43d
elasticsearch-operator-sysctl-vckw5                        1/1       Running   0          1d
elasticsearch-operator-sysctl-z54kz                        1/1       Running   0          1d

We see xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7 matches our JDBC string.

Getting a PSQL backup

There are a few ways, but the easiest is to use the utilities we know exist on the database pod.

$ kubectl exec -it xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7 -- /bin/sh

# pg_dump sonarDB -f /tmp/db.out
# ls -l /tmp/db.out
-rw-r--r-- 1 root root 4938992 Mar 20 12:15 /tmp/db.out
# ls -lh /tmp/db.out
-rw-r--r-- 1 root root 4.8M Mar 20 12:15 /tmp/db.out
# head -n10 /tmp/db.out
--
-- PostgreSQL database dump

We can also dump all (though we really only need the sonarDB database - sonar doesnt create extra database users)

# pg_dumpall -f /tmp/dball.out
# ls -lh /tmp/db*
-rw-r--r-- 1 root root 4.8M Mar 20 12:16 /tmp/dball.out
-rw-r--r-- 1 root root 4.8M Mar 20 12:15 /tmp/db.out

NOTE: Azure Database for PostgreSQL will want a compressed version (useful if moving to a PaaS offering)

$ pg_dump -Fc -v --dbname=sonarDB > /tmp/114-try2.dump

Exit the pod and copy the files locally:

$ kubectl cp xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7:/tmp/db.out ./92-db.out
$ kubectl cp xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7:/tmp/dball.out ./92-dball.out

At this point we have a point in time backup of the containerized database.

Validation

Delete the pod and restore.. This should reset the database and our instance should be hosed until our restore…

$ kubectl delete pod xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7
pod "xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7" deleted
$ kubectl get pods | grep  xxxxxxx-sonarqube-infra-postgresql
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l      0/1       ContainerCreating   0          1m
$ kubectl get pods | grep  xxxxxxx-sonarqube-infra-postgresql
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l      1/1       Running   0          15m

Ah, but the Sonar Service did do a persistent volume claim so we did not lose any data!

Let’s validate we can restore.

MUG testing

(malicious user group - im sure there are other names for this type of testing, but this was the term i tend to use)

Bad admin deletes a project
confirms

Restoring to container database

First we copy the backup to the new pod then we have to login and kill connections so we can drop and restore the existing database.

$ kubectl cp ./92-db.out xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l:/tmp/db.out
$ kubectl exec -it xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l -- /bin/sh
# psql postgres
psql (9.6.2)
Type "help" for help.

postgres=# UPDATE pg_database SET datallowconn = 'false' WHERE datname = 'sonarDB';
UPDATE 1

postgres=# SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'sonarDB';
 pg_terminate_backend
----------------------
 t
 t
 t
 t
(4 rows)

postgres=# DROP DATABASE "sonarDB";
DROP DATABASE
 \quit

# createdb sonarDB
# psql sonarDB < /tmp/db.out

Lastly we have to turn back on connections (or Sonarqube won't be able to use it):

postgres=# UPDATE pg_database SET datallowconn = 'true' WHERE datname = 'sonarDB';
UPDATE 1
postgres=# \quit

Now, you might think a simple restart would cut it;

restarting Sonarqube from admin control panel

But the server caches a lot of data for performance.  The only surefire way to make sure that SonarQube pulls fresh database data is to kill the SonarQube pod itself and refresh:

$ kubectl delete pod xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc
pod "xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc" deleted
$ kubectl get pods | grep sonarqube-infra-sonarqube
xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-7tfmp       1/1       Running   0          3m

When the service comes back, we indeed see the deleted project has been restored; validating our restore process.

restored data

Loading Azure Database for PostreSQL with data from Containerized Backup.

First, make sure to backup from the pod with compression if you didn't earlier

pg_dump -Fc -v --dbname=sonarDB > /tmp/mydump.dump

The you can SSH into the pod to do a restore (or do it locally). You need to force SSL if your Azure Database instance requires it (and frankly, you should)

$ export PGSSLMODE=require
$ pg_restore -v --no-owner --host=sonar.postgres.database.azure.com --port=5432 --username=sonar@sonar --dbname=sonarDB ./mydump.dump