AKS and NewRelic

Published: Apr 13, 2020 by Isaac Johnson

As we look at various APM and logging tools, one suite that deserves attention is New Relic. It's not a new player, having been founded in 2008 by Lew Cirne (an anagram of ‘new relic’) and going public in 2014.

Let's do two things: check out the very latest Kubernetes version Azure offers, v1.17.3 (in preview), and see if we can apply New Relic monitoring to it.

AKS Setup.

Since we've covered setting up Azure Kubernetes Service a multitude of times, we'll just summarize it below:

builder@DESKTOP-2SQ9NQM:~$ az aks list
[]
builder@DESKTOP-2SQ9NQM:~$ az group create --name idjaks05rg --location centralus
{
  "id": "/subscriptions/70b42e6a-6faf-4fed-bcec-9f3995b1aca8/resourceGroups/idjaks05rg",
  "location": "centralus",
  "managedBy": null,
  "name": "idjaks05rg",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}
builder@DESKTOP-2SQ9NQM:~$ az ad sp create-for-rbac -n idjaks05sp --skip-assignment --output json > my_sp.json
Changing "idjaks05sp" to a valid URI of "http://idjaks05sp", which is the required format used for service principal names
builder@DESKTOP-2SQ9NQM:~$ cat my_sp.json | jq -r .appId
1878c818-33e7-4ea9-996c-afc80d57001c

builder@DESKTOP-2SQ9NQM:~$ export SP_ID=`cat my_sp.json | jq -r .appId`
builder@DESKTOP-2SQ9NQM:~$ export SP_PASS=`cat my_sp.json | jq -r .password`

builder@DESKTOP-2SQ9NQM:~$ az aks create --resource-group idjaks05rg --name idjaks05 --location centralus --kubernetes-version 1.17.3 --enable-rbac --node-count 3 --enable-cluster-autoscaler --min-count 2 --max-count 5 --generate-ssh-keys --network-plugin azure --network-policy azure --service-principal $SP_ID --client-secret $SP_PASS
Argument 'enable_rbac' has been deprecated and will be removed in a future release. Use '--disable-rbac' instead.
 - Running ..

One thing you'll notice above is that we're explicitly requesting k8s version 1.17.3, along with the cluster autoscaler and the Azure CNI network plugin and policy.
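Since 1.17.3 was still in preview at the time, it's worth confirming the version is actually offered in your region before running the create; a quick check would be something like:

az aks get-versions --location centralus -o table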

New Relic Prerequisites

We'll need to create a New Relic account if we haven't already. What we need from it is the License Key, which ties our agents to the account.

You can find your New Relic key in the account settings page:

Setting up Agents in AKS:

Next, let’s log into our cluster

builder@DESKTOP-JBA79RT:~$ az aks list -o table
Name      Location    ResourceGroup    KubernetesVersion    ProvisioningState    Fqdn
--------  ----------  ---------------  -------------------  -------------------  -----------------------------------------------------------
idjaks05  centralus   idjaks05rg       1.17.3               Succeeded            idjaks05-idjaks05rg-70b42e-8037c484.hcp.centralus.azmk8s.io
builder@DESKTOP-JBA79RT:~$ az aks get-credentials -n idjaks05 -g idjaks05rg --admin
Merged "idjaks05-admin" as current context in /home/builder/.kube/config

Following this guide, we're going to install the New Relic agents.

A quick note. The guide will have you download releases to match your cluster version:

For the very latest Kubernetes (1.17), you'll want to use master: clone the kube-state-metrics repo and apply the examples/standard folder.

builder@DESKTOP-JBA79RT:~/Workspaces$ git clone https://github.com/kubernetes/kube-state-metrics.git
Cloning into 'kube-state-metrics'...
remote: Enumerating objects: 429, done.
remote: Counting objects: 100% (429/429), done.
remote: Compressing objects: 100% (318/318), done.
remote: Total 18587 (delta 103), reused 238 (delta 87), pack-reused 18158
Receiving objects: 100% (18587/18587), 16.47 MiB | 16.81 MiB/s, done.
Resolving deltas: 100% (11491/11491), done.
builder@DESKTOP-JBA79RT:~/Workspaces$ cd kube-state-metrics
builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl delete -f examples/standard
clusterrolebinding.rbac.authorization.k8s.io "kube-state-metrics" deleted
clusterrole.rbac.authorization.k8s.io "kube-state-metrics" deleted
deployment.apps "kube-state-metrics" deleted
serviceaccount "kube-state-metrics" deleted
service "kube-state-metrics" deleted
builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl apply -f examples/standard
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
serviceaccount/kube-state-metrics created
service/kube-state-metrics created

Pro-tip: If you don't first remove the existing resources (kubectl delete -f), you'll get errors, since some of this already exists on your cluster. E.g.

builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl apply -f examples/standard
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics configured
clusterrole.rbac.authorization.k8s.io/kube-state-metrics configured
serviceaccount/kube-state-metrics configured
Error from server (Invalid): error when applying patch:
{"metadata":{"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"apps/v1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"app.kubernetes.io/name\":\"kube-state-metrics\",\"app.kubernetes.io/version\":\"1.9.5\"},\"name\":\"kube-state-metrics\",\"namespace\":\"kube-system\"},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"app.kubernetes.io/name\":\"kube-state-metrics\"}},\"template\":{\"metadata\":{\"labels\":{\"app.kubernetes.io/name\":\"kube-state-metrics\",\"app.kubernetes.io/version\":\"1.9.5\"}},\"spec\":{\"containers\":[{\"image\":\"quay.io/coreos/kube-state-metrics:v1.9.5\",\"livenessProbe\":{\"httpGet\":{\"path\":\"/healthz\",\"port\":8080},\"initialDelaySeconds\":5,\"timeoutSeconds\":5},\"name\":\"kube-state-metrics\",\"ports\":[{\"containerPort\":8080,\"name\":\"http-metrics\"},{\"containerPort\":8081,\"name\":\"telemetry\"}],\"readinessProbe\":{\"httpGet\":{\"path\":\"/\",\"port\":8081},\"initialDelaySeconds\":5,\"timeoutSeconds\":5},\"securityContext\":{\"runAsUser\":65534}}],\"nodeSelector\":{\"kubernetes.io/os\":\"linux\"},\"serviceAccountName\":\"kube-state-metrics\"}}}}\n"},"labels":{"app.kubernetes.io/name":"kube-state-metrics","app.kubernetes.io/version":"1.9.5","k8s-app":null}},"spec":{"selector":{"matchLabels":{"app.kubernetes.io/name":"kube-state-metrics","k8s-app":null}},"template":{"metadata":{"labels":{"app.kubernetes.io/name":"kube-state-metrics","app.kubernetes.io/version":"1.9.5","k8s-app":null}},"spec":{"$setElementOrder/containers":[{"name":"kube-state-metrics"}],"containers":[{"image":"quay.io/coreos/kube-state-metrics:v1.9.5","livenessProbe":{"httpGet":{"path":"/healthz","port":8080},"initialDelaySeconds":5,"timeoutSeconds":5},"name":"kube-state-metrics","readinessProbe":{"httpGet":{"path":"/","port":8081}},"securityContext":{"runAsUser":65534}}],"nodeSelector":{"kubernetes.io/os":"linux"}}}}}
to:
Resource: "apps/v1, Resource=deployments", GroupVersionKind: "apps/v1, Kind=Deployment"
Name: "kube-state-metrics", Namespace: "kube-system"
Object: &{map["apiVersion":"apps/v1" "kind":"Deployment" "metadata":map["annotations":map["deployment.kubernetes.io/revision":"1" "kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"apps/v1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"k8s-app\":\"kube-state-metrics\"},\"name\":\"kube-state-metrics\",\"namespace\":\"kube-system\"},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"k8s-app\":\"kube-state-metrics\"}},\"template\":{\"metadata\":{\"labels\":{\"k8s-app\":\"kube-state-metrics\"}},\"spec\":{\"containers\":[{\"image\":\"quay.io/coreos/kube-state-metrics:v1.7.2\",\"name\":\"kube-state-metrics\",\"ports\":[{\"containerPort\":8080,\"name\":\"http-metrics\"},{\"containerPort\":8081,\"name\":\"telemetry\"}],\"readinessProbe\":{\"httpGet\":{\"path\":\"/healthz\",\"port\":8080},\"initialDelaySeconds\":5,\"timeoutSeconds\":5}}],\"serviceAccountName\":\"kube-state-metrics\"}}}}\n"] "creationTimestamp":"2020-04-11T17:47:46Z" "generation":'\x01' "labels":map["k8s-app":"kube-state-metrics"] "name":"kube-state-metrics" "namespace":"kube-system" "resourceVersion":"149430" "selfLink":"/apis/apps/v1/namespaces/kube-system/deployments/kube-state-metrics" "uid":"46e8c97c-6dd1-408d-a7cb-6dc8329b65f2"] "spec":map["progressDeadlineSeconds":'\u0258' "replicas":'\x01' "revisionHistoryLimit":'\n' "selector":map["matchLabels":map["k8s-app":"kube-state-metrics"]] "strategy":map["rollingUpdate":map["maxSurge":"25%" "maxUnavailable":"25%"] "type":"RollingUpdate"] "template":map["metadata":map["creationTimestamp":<nil> "labels":map["k8s-app":"kube-state-metrics"]] "spec":map["containers":[map["image":"quay.io/coreos/kube-state-metrics:v1.7.2" "imagePullPolicy":"IfNotPresent" "name":"kube-state-metrics" "ports":[map["containerPort":'\u1f90' "name":"http-metrics" "protocol":"TCP"] map["containerPort":'\u1f91' "name":"telemetry" "protocol":"TCP"]] "readinessProbe":map["failureThreshold":'\x03' "httpGet":map["path":"/healthz" "port":'\u1f90' "scheme":"HTTP"] "initialDelaySeconds":'\x05' "periodSeconds":'\n' "successThreshold":'\x01' "timeoutSeconds":'\x05'] "resources":map[] "terminationMessagePath":"/dev/termination-log" "terminationMessagePolicy":"File"]] "dnsPolicy":"ClusterFirst" "restartPolicy":"Always" "schedulerName":"default-scheduler" "securityContext":map[] "serviceAccount":"kube-state-metrics" "serviceAccountName":"kube-state-metrics" "terminationGracePeriodSeconds":'\x1e']]] "status":map["availableReplicas":'\x01' "conditions":[map["lastTransitionTime":"2020-04-11T17:48:04Z" "lastUpdateTime":"2020-04-11T17:48:04Z" "message":"Deployment has minimum availability." "reason":"MinimumReplicasAvailable" "status":"True" "type":"Available"] map["lastTransitionTime":"2020-04-11T17:47:46Z" "lastUpdateTime":"2020-04-11T17:48:04Z" "message":"ReplicaSet \"kube-state-metrics-85cf88c7fb\" has successfully progressed." "reason":"NewReplicaSetAvailable" "status":"True" "type":"Progressing"]] "observedGeneration":'\x01' "readyReplicas":'\x01' "replicas":'\x01' "updatedReplicas":'\x01']]}
for: "examples/standard/deployment.yaml": Deployment.apps "kube-state-metrics" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/name":"kube-state-metrics"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
Error from server (Invalid): error when applying patch:
{"metadata":{"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"app.kubernetes.io/name\":\"kube-state-metrics\",\"app.kubernetes.io/version\":\"1.9.5\"},\"name\":\"kube-state-metrics\",\"namespace\":\"kube-system\"},\"spec\":{\"clusterIP\":\"None\",\"ports\":[{\"name\":\"http-metrics\",\"port\":8080,\"targetPort\":\"http-metrics\"},{\"name\":\"telemetry\",\"port\":8081,\"targetPort\":\"telemetry\"}],\"selector\":{\"app.kubernetes.io/name\":\"kube-state-metrics\"}}}\n","prometheus.io/scrape":null},"labels":{"app.kubernetes.io/name":"kube-state-metrics","app.kubernetes.io/version":"1.9.5","k8s-app":null}},"spec":{"$setElementOrder/ports":[{"port":8080},{"port":8081}],"clusterIP":"None","ports":[{"port":8080,"protocol":null},{"port":8081,"protocol":null}],"selector":{"app.kubernetes.io/name":"kube-state-metrics","k8s-app":null}}}
to:
Resource: "/v1, Resource=services", GroupVersionKind: "/v1, Kind=Service"
Name: "kube-state-metrics", Namespace: "kube-system"
Object: &{map["apiVersion":"v1" "kind":"Service" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{\"prometheus.io/scrape\":\"true\"},\"labels\":{\"k8s-app\":\"kube-state-metrics\"},\"name\":\"kube-state-metrics\",\"namespace\":\"kube-system\"},\"spec\":{\"ports\":[{\"name\":\"http-metrics\",\"port\":8080,\"protocol\":\"TCP\",\"targetPort\":\"http-metrics\"},{\"name\":\"telemetry\",\"port\":8081,\"protocol\":\"TCP\",\"targetPort\":\"telemetry\"}],\"selector\":{\"k8s-app\":\"kube-state-metrics\"}}}\n" "prometheus.io/scrape":"true"] "creationTimestamp":"2020-04-11T17:47:47Z" "labels":map["k8s-app":"kube-state-metrics"] "name":"kube-state-metrics" "namespace":"kube-system" "resourceVersion":"149366" "selfLink":"/api/v1/namespaces/kube-system/services/kube-state-metrics" "uid":"94b7ceb7-a14c-424c-8a90-7acd858bf494"] "spec":map["clusterIP":"10.0.215.149" "ports":[map["name":"http-metrics" "port":'\u1f90' "protocol":"TCP" "targetPort":"http-metrics"] map["name":"telemetry" "port":'\u1f91' "protocol":"TCP" "targetPort":"telemetry"]] "selector":map["k8s-app":"kube-state-metrics"] "sessionAffinity":"None" "type":"ClusterIP"] "status":map["loadBalancer":map[]]]}
for: "examples/standard/service.yaml": Service "kube-state-metrics" is invalid: spec.clusterIP: Invalid value: "None": field is immutable

Then download the New Relic DaemonSet YAML:

curl -O https://download.newrelic.com/infrastructure_agent/integrations/kubernetes/newrelic-infrastructure-k8s-latest.yaml

Once downloaded, edit that YAML to add your license key (it should end in NRAL) and set “CLUSTER_NAME” to a name you'll remember.

newrelic-infrastructure-k8s-latest.yaml
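For reference, the two fields to edit sit in the DaemonSet's env block; in the manifest I pulled they looked roughly like this (values below are placeholders, not real keys):

        env:
          - name: NRIA_LICENSE_KEY
            value: <YOUR_LICENSE_KEY_ENDING_IN_NRAL>
          - name: CLUSTER_NAME
            value: <A_NAME_YOU_WILL_REMEMBER>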

Then apply it:

kubectl create -f newrelic-infrastructure-k8s-latest.yaml
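Before heading to the UI, a quick sanity check that the DaemonSet pods came up (with this manifest they land in the default namespace):

kubectl get pods | grep newrelic-infra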

We can log into New Relic to check out some of our metrics. This is a 3-node cluster by default, and we can see the load already being tracked:

Clicking on hosts gives us details on each host as well:

One really useful detail in these tables, something my developers often go to the Kubernetes dashboard for, is per-container CPU and memory usage. Going to Processes, we can look at those stats by process:

We can also click on “Kubernetes” to examine pods by node:

Let's trigger some changes we can actually watch.

First, let’s go ahead and limit our view to just the kubernetes-dashboard deployment. This will filter to the node(s) and pod(s) that were deployed with it:

Next, in my shell, let’s check if it’s the old 1.x dashboard or the newer 2.x:

builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl describe pod kubernetes-dashboard-7f7676f7b5-whdwc -n kube-system | grep Image
    Image: mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-beta8
    Image ID: docker-pullable://mcr.microsoft.com/oss/kubernetes/dashboard@sha256:85fbaa5c8fd7ffc5723965685d9467a89e2148206019d1a8979cec1f2d25faef

We can confirm this from the port, 8443 (not the old 9090):

builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl describe pod kubernetes-dashboard-7f7676f7b5-whdwc -n kube-system | grep Port:
    Port: 8443/TCP
    Host Port: 0/TCP

I'll copy over my kubeconfig (which I'll use to log in) and then port-forward to the dashboard pod:

builder@DESKTOP-JBA79RT:~$ cp ~/.kube/config /mnt/c/Users/isaac/Downloads/aksdev5config

builder@DESKTOP-JBA79RT:~/Workspaces/kube-state-metrics$ kubectl port-forward kubernetes-dashboard-7f7676f7b5-whdwc -n kube-system 8443:8443
Forwarding from 127.0.0.1:8443 -> 8443
Forwarding from [::1]:8443 -> 8443

I usually use Firefox for this (since Chrome hates self-signed certificates).

Use kubeconfig and pick the file you copied:

Let's now browse to our kubernetes-dashboard deployment and scale it up to something high, like 100 replicas, to see how it impacts our cluster:
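(For the record, the CLI equivalent of that scale-up should be roughly the following, though here I did it through the dashboard UI:)

kubectl scale deployment kubernetes-dashboard -n kube-system --replicas=100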

While AKS eventually caught wind and scaled the dashboard back down to one replica, we can see that when the cluster attempted to fulfill our scaling request, it did indeed scale out by two more nodes. The event shows up at the top:

And if we click on the event, we can get more details:

Let's try another change. Say we want to save some money and scale our cluster all the way down to 1 node:

builder@DESKTOP-JBA79RT:/tmp$ az aks scale -g idjaks05rg -n idjaks05 --node-count 1
 - Running ..

We will now see a new event (light blue) showing agents disconnecting:

builder@DESKTOP-JBA79RT:/tmp$ kubectl get nodes
NAME                                STATUS     ROLES   AGE   VERSION
aks-nodepool1-25027797-vmss000000   Ready      agent   42h   v1.17.3
aks-nodepool1-25027797-vmss000001   Ready      agent   42h   v1.17.3
aks-nodepool1-25027797-vmss000002   Ready      agent   42h   v1.17.3
aks-nodepool1-25027797-vmss000003   NotReady   agent   18m   v1.17.3
aks-nodepool1-25027797-vmss000004   NotReady   agent   18m   v1.17.3

Unfortunately, AKS won’t let us scale below our minimum from the command line:

builder@DESKTOP-JBA79RT:/tmp$ kubectl get nodes
NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-25027797-vmss000000   Ready    agent   43h   v1.17.3
aks-nodepool1-25027797-vmss000001   Ready    agent   43h   v1.17.3
aks-nodepool1-25027797-vmss000002   Ready    agent   43h   v1.17.3

However, in the portal we can switch the node pool to manual scaling and force the count lower.
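As an aside, adjusting or disabling the autoscaler from the CLI should also get us there; a rough sketch (untested here):

# loosen the autoscaler floor...
az aks update -g idjaks05rg -n idjaks05 --update-cluster-autoscaler --min-count 1 --max-count 5
# ...or turn it off entirely and scale manually
az aks update -g idjaks05rg -n idjaks05 --disable-cluster-autoscaler
az aks scale -g idjaks05rg -n idjaks05 --node-count 1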

Dropping to two nodes:

builder@DESKTOP-JBA79RT:/tmp$ kubectl get nodes
NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-25027797-vmss000000   Ready    agent   43h   v1.17.3
aks-nodepool1-25027797-vmss000001   Ready    agent   43h   v1.17.3

We can see that event captured as well:

We can check the version of the New Relic agent:

builder@DESKTOP-JBA79RT:/tmp$ kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
newrelic-infra-m89l5   1/1     Running   0          3h49m
newrelic-infra-z5dmr   1/1     Running   0          3h49m
builder@DESKTOP-JBA79RT:/tmp$ kubectl exec -it newrelic-infra-z5dmr /bin/sh
/ # newrelic-infra --version
New Relic Infrastructure Agent version: 1.11.20

And see we are up to date: https://docs.newrelic.com/docs/release-notes/infrastructure-release-notes/infrastructure-agent-release-notes

Note: I could not figure out why (perhaps because I'm using the 1.17 preview), but I was unable to drill into container details specifically.

NewRelic also has APM via NewRelic ONE:

At the bottom are pods in latest deployments:

We can pipe logs into New Relic using Logstash, Fluent Bit, etc.

Setting up Fluent Logging to NewRelic

Following (to a degree) their guide: https://docs.newrelic.com/docs/logs/enable-logs/enable-logs/fluent-bit-plugin-logs#fluentbit-plugin

builder@DESKTOP-JBA79RT:~/Workspaces$ git clone https://github.com/newrelic/newrelic-fluent-bit-output.git
Cloning into 'newrelic-fluent-bit-output'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 1118 (delta 0), reused 1 (delta 0), pack-reused 1113
Receiving objects: 100% (1118/1118), 2.35 MiB | 9.53 MiB/s, done.
Resolving deltas: 100% (410/410), done.
builder@DESKTOP-JBA79RT:~/Workspaces$ cd newrelic-fluent-bit-output/

We need the Go toolchain first; let's start by bringing the system up to date:

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ sudo apt-get update
[sudo] password for builder:
Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Hit:2 https://packages.microsoft.com/repos/azure-cli bionic InRelease
Get:3 https://packages.microsoft.com/ubuntu/18.04/prod bionic InRelease [4003 B]
Hit:4 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:5 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [692 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:7 https://packages.microsoft.com/ubuntu/18.04/prod bionic/main amd64 Packages [105 kB]
Get:8 http://security.ubuntu.com/ubuntu bionic-security/main Translation-en [221 kB]
Get:9 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [34.3 kB]
Get:10 http://security.ubuntu.com/ubuntu bionic-security/restricted Translation-en [8924 B]
Get:11 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [657 kB]
Get:12 http://security.ubuntu.com/ubuntu bionic-security/universe Translation-en [218 kB]
Get:13 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [7176 B]
Get:14 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:15 http://security.ubuntu.com/ubuntu bionic-security/multiverse Translation-en [2764 B]
Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [914 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates/main Translation-en [314 kB]
Get:18 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [43.9 kB]
Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/restricted Translation-en [11.0 kB]
Get:20 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1065 kB]
Get:21 http://archive.ubuntu.com/ubuntu bionic-updates/universe Translation-en [330 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [10.8 kB]
Get:23 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse Translation-en [4728 B]
Fetched 4896 kB in 3s (1780 kB/s)
Reading package lists... Done
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ sudo apt-get -y upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  netplan.io
The following packages will be upgraded:
  apport azure-cli containerd gcc-8-base libatomic1 libcc1-0 libgcc1 libgomp1 libitm1 liblsan0 libmpx2 libquadmath0 libstdc++6 libtsan0 linux-libc-dev python3-apport
  python3-problem-report
17 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Need to get 70.6 MB of archives.
After this operation, 15.5 MB of additional disk space will be used.
Get:1 https://packages.microsoft.com/repos/azure-cli bionic/main amd64 azure-cli all 2.3.1-1~bionic [46.5 MB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libquadmath0 amd64 8.4.0-1ubuntu1~18.04 [134 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libitm1 amd64 8.4.0-1ubuntu1~18.04 [27.9 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 gcc-8-base amd64 8.4.0-1ubuntu1~18.04 [18.7 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libstdc++6 amd64 8.4.0-1ubuntu1~18.04 [400 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libmpx2 amd64 8.4.0-1ubuntu1~18.04 [11.6 kB]
Get:7 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 liblsan0 amd64 8.4.0-1ubuntu1~18.04 [133 kB]
Get:8 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libtsan0 amd64 8.4.0-1ubuntu1~18.04 [288 kB]
Get:9 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libcc1-0 amd64 8.4.0-1ubuntu1~18.04 [39.4 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libatomic1 amd64 8.4.0-1ubuntu1~18.04 [9192 B]
Get:11 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libgomp1 amd64 8.4.0-1ubuntu1~18.04 [76.5 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libgcc1 amd64 1:8.4.0-1ubuntu1~18.04 [40.6 kB]
Get:13 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 python3-problem-report all 2.20.9-0ubuntu7.14 [10.7 kB]
Get:14 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 python3-apport all 2.20.9-0ubuntu7.14 [82.1 kB]
Get:15 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 apport all 2.20.9-0ubuntu7.14 [124 kB]
Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 containerd amd64 1.3.3-0ubuntu1~18.04.2 [21.7 MB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 linux-libc-dev amd64 4.15.0-96.97 [1021 kB]
Fetched 70.6 MB in 5s (15.1 MB/s)
(Reading database ... 77910 files and directories currently installed.)
Preparing to unpack .../libquadmath0_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libquadmath0:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../libitm1_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libitm1:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../gcc-8-base_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking gcc-8-base:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Setting up gcc-8-base:amd64 (8.4.0-1ubuntu1~18.04) ...
(Reading database ... 77910 files and directories currently installed.)
Preparing to unpack .../libstdc++6_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libstdc++6:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Setting up libstdc++6:amd64 (8.4.0-1ubuntu1~18.04) ...
(Reading database ... 77910 files and directories currently installed.)
Preparing to unpack .../0-libmpx2_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libmpx2:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../1-liblsan0_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking liblsan0:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../2-libtsan0_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libtsan0:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../3-libcc1-0_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libcc1-0:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../4-libatomic1_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libatomic1:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../5-libgomp1_8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libgomp1:amd64 (8.4.0-1ubuntu1~18.04) over (8.3.0-26ubuntu1~18.04) ...
Preparing to unpack .../6-libgcc1_1%3a8.4.0-1ubuntu1~18.04_amd64.deb ...
Unpacking libgcc1:amd64 (1:8.4.0-1ubuntu1~18.04) over (1:8.3.0-26ubuntu1~18.04) ...
Setting up libgcc1:amd64 (1:8.4.0-1ubuntu1~18.04) ...
(Reading database ... 77910 files and directories currently installed.)
Preparing to unpack .../0-python3-problem-report_2.20.9-0ubuntu7.14_all.deb ...
Unpacking python3-problem-report (2.20.9-0ubuntu7.14) over (2.20.9-0ubuntu7.12) ...
Preparing to unpack .../1-python3-apport_2.20.9-0ubuntu7.14_all.deb ...
Unpacking python3-apport (2.20.9-0ubuntu7.14) over (2.20.9-0ubuntu7.12) ...
Preparing to unpack .../2-apport_2.20.9-0ubuntu7.14_all.deb ...
invoke-rc.d: could not determine current runlevel
 * Stopping automatic crash report generation: apport [OK]
Unpacking apport (2.20.9-0ubuntu7.14) over (2.20.9-0ubuntu7.12) ...
Preparing to unpack .../3-containerd_1.3.3-0ubuntu1~18.04.2_amd64.deb ...
Unpacking containerd (1.3.3-0ubuntu1~18.04.2) over (1.3.3-0ubuntu1~18.04.1) ...
Preparing to unpack .../4-linux-libc-dev_4.15.0-96.97_amd64.deb ...
Unpacking linux-libc-dev:amd64 (4.15.0-96.97) over (4.15.0-91.92) ...
Preparing to unpack .../5-azure-cli_2.3.1-1~bionic_all.deb ...
Unpacking azure-cli (2.3.1-1~bionic) over (2.2.0-1~bionic) ...
Setting up libquadmath0:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up libgomp1:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up libatomic1:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up libcc1-0:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up azure-cli (2.3.1-1~bionic) ...
Setting up libtsan0:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up linux-libc-dev:amd64 (4.15.0-96.97) ...
Setting up containerd (1.3.3-0ubuntu1~18.04.2) ...
Setting up liblsan0:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up python3-problem-report (2.20.9-0ubuntu7.14) ...
Setting up libmpx2:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up libitm1:amd64 (8.4.0-1ubuntu1~18.04) ...
Setting up python3-apport (2.20.9-0ubuntu7.14) ...
Setting up apport (2.20.9-0ubuntu7.14) ...
invoke-rc.d: could not determine current runlevel
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
Processing triggers for systemd (237-3ubuntu10.39) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...

Then we can set up go (so we can “make”). While their guide suggests go 1.11, the git repo has moved to 1.12.

$ cd /tmp
$ wget https://dl.google.com/go/go1.12.linux-amd64.tar.gz
$ umask 0002
$ sudo tar -xvf go1.12.linux-amd64.tar.gz
$ sudo mv go /usr/local
$ vi ~/.bashrc

builder@DESKTOP-JBA79RT:/tmp$ cat ~/.bashrc | tail -n3
export GOROOT=/usr/local/go
export GOPATH=$HOME/go
export PATH=$GOPATH/bin:$GOROOT/bin:$PATH

$ source ~/.bashrc
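With Go in place, building the plugin yourself should come down to roughly the following (mirroring the go build step in the repo's Dockerfile):

cd ~/Workspaces/newrelic-fluent-bit-output
go build -buildmode=c-shared -o out_newrelic.so .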

We need a Fluent Bit container with the New Relic output plugin baked in. We could build our own image, but the documentation tells us:

This handy message with no actual container name or link given :|

However, nowhere was it listed where. Browsing Docker Hub, I found the image: https://hub.docker.com/r/newrelic/newrelic-fluentbit-output … but then we need one more piece of info: the path to the built shared object (.so). I figured it out from the Dockerfile here: https://github.com/newrelic/newrelic-fluent-bit-output/blob/master/Dockerfile#L15

We now have enough to set up Fluent Bit with New Relic.

The first four parts can be done right from the fluentbit documentation:

https://docs.fluentbit.io/manual/installation/kubernetes

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl create namespace logging
namespace/logging created
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
serviceaccount/fluent-bit created
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role.yaml
clusterrole.rbac.authorization.k8s.io/fluent-bit-read created
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-binding.yaml
clusterrolebinding.rbac.authorization.k8s.io/fluent-bit-read created
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$

Next we want to wget (or curl -O) the ConfigMap and DaemonSet files:

$ wget https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-configmap.yaml
$ cp fluent-bit-configmap.yaml fluent-bit-configmap.yaml.orig
$ wget https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds.yaml
$ cp fluent-bit-ds.yaml fluent-bit-ds.yaml.orig

The ConfigMap needs our license key added. We'll also add keys for the hostname and service name:

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ diff fluent-bit-configmap.yaml fluent-bit-configmap.yaml.orig
11,14d10
< plugins.conf: |
< [PLUGINS]
< Path /fluent-bit/bin/out_newrelic.so
<
21d16
< Plugins_File plugins.conf
53,54d47
< Add hostname ${FLUENT_NEWRELIC_HOST}
< Add service_name ${FLUENT_NEWRELIC_SERVICE}
58c51
< Name newrelic
---
> Name es
60c53,54
< licenseKey c8xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxNRAL
---
> Host ${FLUENT_ELASTICSEARCH_HOST}
> Port ${FLUENT_ELASTICSEARCH_PORT}
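Pieced together from that diff, the ConfigMap changes boil down to: register the plugin, point the service at the plugins file, tag each record with a hostname and service name, and swap the es output for newrelic. Roughly (indentation and unchanged lines omitted):

# new plugins.conf entry registering the shared object
plugins.conf: |
    [PLUGINS]
        Path /fluent-bit/bin/out_newrelic.so

# added under [SERVICE]
Plugins_File plugins.conf

# added to the filter so each record carries these attributes
Add hostname ${FLUENT_NEWRELIC_HOST}
Add service_name ${FLUENT_NEWRELIC_SERVICE}

# the [OUTPUT] block now uses the newrelic plugin keyed by the license
Name newrelic
licenseKey <YOUR_LICENSE_KEY_ENDING_IN_NRAL>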

Similarly, we'll change the DaemonSet YAML to point to the New Relic image and add the variables we referenced above:

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ diff fluent-bit-ds.yaml fluent-bit-ds.yaml.orig
1c1
< apiVersion: apps/v1
---
> apiVersion: extensions/v1beta1
7c7,9
< app.kubernetes.io/name: fluent-bit
---
> k8s-app: fluent-bit-logging
> version: v1
> kubernetes.io/cluster-service: "true"
9,11d10
< selector:
< matchLabels:
< app.kubernetes.io/name: fluent-bit
15c14,16
< app.kubernetes.io/name: fluent-bit
---
> k8s-app: fluent-bit-logging
> version: v1
> kubernetes.io/cluster-service: "true"
22,23c23,24
< - name: nr-fluent-bit
< image: newrelic/newrelic-fluentbit-output:1.1.5
---
> - name: fluent-bit
> image: fluent/fluent-bit:1.3.11
28,31c29,32
< - name: FLUENT_NEWRELIC_HOST
< value: "aks"
< - name: FLUENT_NEWRELIC_SERVICE
< value: "dev5"
---
> - name: FLUENT_ELASTICSEARCH_HOST
> value: "elasticsearch"
> - name: FLUENT_ELASTICSEARCH_PORT
> value: "9200"
60d60
<
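Pieced together the same way, the relevant container spec in the modified DaemonSet comes out roughly as follows (alongside the apiVersion bump to apps/v1 and the spec.selector that apps/v1 requires):

      containers:
      - name: nr-fluent-bit
        image: newrelic/newrelic-fluentbit-output:1.1.5
        env:
        - name: FLUENT_NEWRELIC_HOST
          value: "aks"
        - name: FLUENT_NEWRELIC_SERVICE
          value: "dev5"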

For those that don’t dig on diff:

configmap changes (modified on right)
daemonset changes (modified on right)

Now apply both:

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl apply -f fluent-bit-configmap.yaml
configmap/fluent-bit-config created
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl apply -f fluent-bit-ds.yaml
daemonset.apps/fluent-bit created

Lastly, we can now check that it’s working:

builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl get pods -n logging
NAME               READY   STATUS    RESTARTS   AGE
fluent-bit-tztbx   1/1     Running   0          8m36s
fluent-bit-z5g7g   1/1     Running   0          8m36s
builder@DESKTOP-JBA79RT:~/Workspaces/newrelic-fluent-bit-output$ kubectl logs fluent-bit-tztbx -n logging
Fluent Bit v1.0.3
Copyright (C) Treasure Data

[2020/04/13 13:15:38] [info] [storage] initializing...
[2020/04/13 13:15:38] [info] [storage] in-memory
[2020/04/13 13:15:38] [info] [storage] normal synchronization mode, checksum disabled
[2020/04/13 13:15:38] [info] [engine] started (pid=1)
[2020/04/13 13:15:38] [info] [filter_kube] https=1 host=kubernetes.default.svc port=443
[2020/04/13 13:15:38] [info] [filter_kube] local POD info OK
[2020/04/13 13:15:38] [info] [filter_kube] testing connectivity with API server...
[2020/04/13 13:15:43] [info] [filter_kube] API server connectivity OK
[2020/04/13 13:15:43] [info] [http_server] listen iface=0.0.0.0 tcp_port=2020

Almost immediately we see results in the New Relic One Logs UI:

And this means we can query any kind of standard logging we would expect, such as all logs from a particular container:
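For example, a hypothetical NRQL query along these lines should pull everything from one container (the exact attribute names depend on how the Kubernetes filter flattens its metadata):

SELECT * FROM Log WHERE `kubernetes.container_name` = 'kubernetes-dashboard' SINCE 30 minutes ago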

If we wanted to, we could go a step further and capture logs from the nodes themselves (following this guide).

Though, because AKS uses VMSS, we can also use New Relic's integration with Azure. (Granted, I followed the wizard to create a service principal and add it as a Reader on my subscription. There were no issues, so I'm not detailing that here.)

This allows us to do things like directly track VMSS events via NewRelic:

We won't get too deep into alerts, but one can create notification channels and subscribe them to alert policies. For instance, to get alerted on the default Kubernetes policy, I can just subscribe my email via a Notification Channel.

And we can then create alerts, for instance on logs in the NewRelicOne side:

Example NewRelic Email

As an aside, we can see Metrics under Monitoring from Azure itself:

We can track CPU usage, for instance.

If we enable logging to an Azure Log Analytics workspace, we can see logs there as well:

Summary

It’s hard to figure out what is and is not included with our Demo environment.  

For instance, Infrastructure monitoring is $7.20/mo for Essentials and $14.40/mo for “Pro” with 12k CUs.  What's a CU? Well, it's some unit.  How much are more CUs?  You can ask us…

Then for Logging, it's a bit more straightforward, with plans between $55/mo and $75/mo.

Log Pricing as of Apr-13-2020

They have a calculator, which gives some examples of what that means:

  • 8 Days Retention: $55 X 10 GB Daily = $550 per month or $6,600 per year
  • 15 Days Retention: $65 X 10 GB Daily = $650 per month or $7,800 per year
  • 30 Days Retention: $75 X 10 GB Daily = $750 per month or $9,000 per year

I’m not sure I am doing this right, but this is close to what Azure will charge for similar tracking:

However, with Azure Monitoring (which includes logs) you pay per alerting rule.

I think New Relic has a very compelling full-stack offering.  As they are a seasoned player in the space, there are great integrations to expand on it, such as BigPanda and PagerDuty for on-call support.

aks k8s newrelic apm getting-started

Have something to add? Feedback? You can use the feedback form

Isaac Johnson


Cloud Solutions Architect

Isaac is a CSA and DevOps engineer who focuses on cloud migrations and devops processes. He also is a dad to three wonderful daughters (hence the references to Princess King sprinkled throughout the blog).
