Using containerized databases in Kubernetes is quick and convenient. But how can we ensure backups and restores work? Let's examine this in my favourite chart lately, Sonarqube.
Assumptions:
- you have AKS running with Sonarqube (helm install stable/sonarqube)
- use you are using the (default) PostgreSQL containerized database
Identify your PostgreSQL pod
Log in as admin and look at Admin/System under Database:
Log into Kubernetes;
$ az login
$ az aks get-credentials --name my-aks-cluster --resource-group my-aks-rg
Get the pods
Get the pods
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
xxxxxxx-redis-infra-master-0 1/1 Running 0 106d
xxxxxxx-redis-infra-slave-746fb58f8f-vlbwl 1/1 Running 4 106d
xxxxxxx-sonarqube-db-infra-postgresql-5f6c845645-qknjq 1/1 Running 0 43d
xxxxxxx-sonarqube-db-infra-sonarqube-64d465d5bb-4n74t 0/1 Evicted 0 43d
xxxxxxx-sonarqube-db-infra-sonarqube-64d465d5bb-lt5c9 1/1 Running 0 1d
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7 1/1 Running 0 43d
xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc 1/1 Running 1 43d
elasticsearch-operator-sysctl-vckw5 1/1 Running 0 1d
elasticsearch-operator-sysctl-z54kz 1/1 Running 0 1d
We see xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7 matches our JDBC string.
Getting a PSQL backup
There are a few ways, but the easiest is to use the utilities we know exist on the database pod.
$ kubectl exec -it xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7 -- /bin/sh
# pg_dump sonarDB -f /tmp/db.out
# ls -l /tmp/db.out
-rw-r--r-- 1 root root 4938992 Mar 20 12:15 /tmp/db.out
# ls -lh /tmp/db.out
-rw-r--r-- 1 root root 4.8M Mar 20 12:15 /tmp/db.out
# head -n10 /tmp/db.out
--
-- PostgreSQL database dump
We can also dump all (though we really only need the sonarDB database - sonar doesnt create extra database users)
# pg_dumpall -f /tmp/dball.out
# ls -lh /tmp/db*
-rw-r--r-- 1 root root 4.8M Mar 20 12:16 /tmp/dball.out
-rw-r--r-- 1 root root 4.8M Mar 20 12:15 /tmp/db.out
NOTE: Azure Database for PostgreSQL will want a compressed version (useful if moving to a PaaS offering)
$ pg_dump -Fc -v --dbname=sonarDB > /tmp/114-try2.dump
Exit the pod and copy the files locally:
$ kubectl cp xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7:/tmp/db.out ./92-db.out
$ kubectl cp xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7:/tmp/dball.out ./92-dball.out
At this point we have a point in time backup of the containerized database.
Validation
Delete the pod and restore.. This should reset the database and our instance should be hosed until our restore…
$ kubectl delete pod xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7
pod "xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-7twk7" deleted
$ kubectl get pods | grep xxxxxxx-sonarqube-infra-postgresql
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l 0/1 ContainerCreating 0 1m
$ kubectl get pods | grep xxxxxxx-sonarqube-infra-postgresql
xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l 1/1 Running 0 15m
Ah, but the Sonar Service did do a persistent volume claim so we did not lose any data!
Let’s validate we can restore.
MUG testing
(malicious user group - im sure there are other names for this type of testing, but this was the term i tend to use)
Restoring to container database
First we copy the backup to the new pod then we have to login and kill connections so we can drop and restore the existing database.
$ kubectl cp ./92-db.out xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l:/tmp/db.out
$ kubectl exec -it xxxxxxx-sonarqube-infra-postgresql-6f58d97bdc-dtq8l -- /bin/sh
# psql postgres
psql (9.6.2)
Type "help" for help.
postgres=# UPDATE pg_database SET datallowconn = 'false' WHERE datname = 'sonarDB';
UPDATE 1
postgres=# SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'sonarDB';
pg_terminate_backend
----------------------
t
t
t
t
(4 rows)
postgres=# DROP DATABASE "sonarDB";
DROP DATABASE
\quit
# createdb sonarDB
# psql sonarDB < /tmp/db.out
Lastly we have to turn back on connections (or Sonarqube won't be able to use it):
postgres=# UPDATE pg_database SET datallowconn = 'true' WHERE datname = 'sonarDB';
UPDATE 1
postgres=# \quit
Now, you might think a simple restart would cut it;
But the server caches a lot of data for performance. The only surefire way to make sure that SonarQube pulls fresh database data is to kill the SonarQube pod itself and refresh:
$ kubectl delete pod xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc
pod "xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-whbdc" deleted
$ kubectl get pods | grep sonarqube-infra-sonarqube
xxxxxxx-sonarqube-infra-sonarqube-78cdf5f6fd-7tfmp 1/1 Running 0 3m
When the service comes back, we indeed see the deleted project has been restored; validating our restore process.
Loading Azure Database for PostreSQL with data from Containerized Backup.
First, make sure to backup from the pod with compression if you didn't earlier
pg_dump -Fc -v --dbname=sonarDB > /tmp/mydump.dump
The you can SSH into the pod to do a restore (or do it locally). You need to force SSL if your Azure Database instance requires it (and frankly, you should)
$ export PGSSLMODE=require
$ pg_restore -v --no-owner --host=sonar.postgres.database.azure.com --port=5432 --username=sonar@sonar --dbname=sonarDB ./mydump.dump