Published: Jul 15, 2025 by Isaac Johnson
Today we are going to explore setting up a full Grafana OSS stack including Alloy for OpenTelemetry (OTLP) collection, Tempo for Traces, Prometheus for Metrics, Loki for Logs, and lastly Grafana for reviewing and reporting on all of those.
My goal for all this is to really explore what kind of telemetry data we can pull from the Gemini CLI. Can we monitor usage? Logs? Tokens? What does tracing mean when it comes to a tool like Gemini CLI?
We’ll explore all that and more below. Let’s start with Alloy first, Grafana’s open-source OpenTelemetry (OTel) collector.
Grafana Alloy
As we use Kubernetes here, we’ll follow the pattern of setting this all up with Helm charts in our cluster. It should be noted that one can use Docker just as well if doing this outside of Kubernetes.
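For reference, the same collector can be run with Docker in a one-liner. This is just a sketch (not what I use below), assuming a local config.alloy and the published grafana/alloy image:

$ docker run -d --name alloy \
    -p 12345:12345 -p 4317:4317 -p 4318:4318 \
    -v $(pwd)/config.alloy:/etc/alloy/config.alloy \
    grafana/alloy:latest \
    run /etc/alloy/config.alloy --server.http.listen-addr=0.0.0.0:12345 --storage.path=/var/lib/alloy/data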
Let’s add the Grafana helm repo and update
$ helm repo add grafana https://grafana.github.io/helm-charts
"grafana" already exists with the same configuration, skipping
$ helm repo update
Grafana
I already have a namespace, so I’ll first just add Grafana there
$ helm install grafana -n grafana grafana/grafana
NAME: grafana
LAST DEPLOYED: Thu Jul 10 19:53:19 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 1
NOTES:
1. Get your 'admin' user password by running:
kubectl get secret --namespace grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:
grafana.grafana.svc.cluster.local
Get the Grafana URL to visit by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 3000
3. Login with the password from step 1 and the username: admin
#################################################################################
###### WARNING: Persistence is disabled!!! You will lose your data when #####
###### the Grafana pod is terminated. #####
#################################################################################
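That warning is fine for a lab, but if you want Grafana to keep dashboards and data sources across pod restarts, the chart exposes a persistence block. A minimal sketch (the file name, size, and storage class here are my assumptions - adjust for your cluster):

persistence:
  enabled: true
  size: 5Gi
  # storageClassName: local-path   # K3s default; set to whatever your cluster provides

$ helm upgrade grafana -n grafana -f grafana.pvc.values.yaml grafana/grafana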
Alloy
I’ll upgrade my Alloy release with the following values
$ cat grafana.values.yaml
alloy:
  configMap:
    create: false
    key: config.alloy
    name: alloy-config
  extraPorts:
    - name: otelgrpc
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: otelhttp
      port: 4318
      protocol: TCP
      targetPort: 4318
    - name: zipkinhttp
      port: 9411
      protocol: TCP
      targetPort: 9411
$ helm upgrade -n grafana alloy -f grafana.values.yaml grafana/alloy
Release "alloy" has been upgraded. Happy Helming!
NAME: alloy
LAST DEPLOYED: Thu Jul 10 19:57:09 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 3
TEST SUITE: None
NOTES:
Welcome to Grafana Alloy!
I realized my last Alloy deployment was in a bad state, so I did a helm delete and then changed the values to create the ConfigMap (cm)
$ cat grafana.values.yaml
alloy:
  configMap:
    create: true
    key: config.alloy
    name: alloy-config
  extraPorts:
    - name: otelgrpc
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: otelhttp
      port: 4318
      protocol: TCP
      targetPort: 4318
    - name: zipkinhttp
      port: 9411
      protocol: TCP
      targetPort: 9411
And installed
$ helm install -n grafana alloy -f grafana.values.yaml grafana/alloy
NAME: alloy
LAST DEPLOYED: Thu Jul 10 20:10:05 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Welcome to Grafana Alloy!
This time it worked
$ kubectl get po -n grafana
NAME READY STATUS RESTARTS AGE
alloy-2h559 2/2 Running 0 5m2s
alloy-6cqcw 2/2 Running 0 5m3s
alloy-bbqvt 2/2 Running 0 5m2s
alloy-l467k 2/2 Running 0 5m2s
grafana-7b9777bf67-lg7xp 1/1 Running 0 21m
Prometheus
Grafana is used to visualize and alert, and Alloy is an OTel collector, but we need storage backends in between for Metrics and Logs.
For Metrics, we’ll use Prometheus.
I can add the chart and update, if needed
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" already exists with the same configuration, skipping
$ helm repo update
Most people put Prometheus at the root of their cluster (the default namespace), but for this, I’m leaving it in the grafana namespace
$ helm install prometheus -n grafana prometheus-community/prometheus
NAME: prometheus
LAST DEPLOYED: Fri Jul 11 06:18:56 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.grafana.svc.cluster.local
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app.kubernetes.io/name=prometheus,app.kubernetes.io/instance=prometheus" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9090
The Prometheus alertmanager can be accessed via port 9093 on the following DNS name from within your cluster:
prometheus-alertmanager.grafana.svc.cluster.local
Get the Alertmanager URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app.kubernetes.io/name=alertmanager,app.kubernetes.io/instance=prometheus" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9093
#################################################################################
###### WARNING: Pod Security Policy has been disabled by default since #####
###### it deprecated after k8s 1.25+. use #####
###### (index .Values "prometheus-node-exporter" "rbac" #####
###### . "pspEnabled") with (index .Values #####
###### "prometheus-node-exporter" "rbac" "pspAnnotations") #####
###### in case you still need it. #####
#################################################################################
The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
prometheus-prometheus-pushgateway.grafana.svc.cluster.local
Get the PushGateway URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app=prometheus-pushgateway,component=pushgateway" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9091
For more information on running Prometheus, visit:
https://prometheus.io/
To reach this externally, we can expose a NodePort service (though I won’t necessarily use it with the OTel Collector, but will for other things)
$ kubectl expose service prometheus-server -n grafana --type=NodePort --target-port=9090 --name=prometheus-server-ext
service/prometheus-server-ext exposed
We can now see Alloy, Grafana and Prometheus ports
$ kubectl get svc -n grafana
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alloy ClusterIP 10.43.154.188 <none> 12345/TCP,4317/TCP,4318/TCP,9411/TCP 10h
grafana ClusterIP 10.43.37.230 <none> 80/TCP 10h
prometheus-alertmanager ClusterIP 10.43.44.14 <none> 9093/TCP 2m27s
prometheus-alertmanager-headless ClusterIP None <none> 9093/TCP 2m27s
prometheus-kube-state-metrics ClusterIP 10.43.62.62 <none> 8080/TCP 2m27s
prometheus-prometheus-node-exporter ClusterIP 10.43.32.2 <none> 9100/TCP 2m27s
prometheus-prometheus-pushgateway ClusterIP 10.43.204.146 <none> 9091/TCP 2m27s
prometheus-server ClusterIP 10.43.112.167 <none> 80/TCP 2m27s
prometheus-server-ext NodePort 10.43.128.3 <none> 80:31182/TCP 48s
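As a quick sanity check on that NodePort (node IP from my cluster; the 31182 comes from the listing above), hitting the Prometheus build-info endpoint should return a small JSON blob with the version:

$ curl -s http://192.168.1.34:31182/api/v1/status/buildinfo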
Loki
Lastly, let’s add Loki for Logs.
Logs can be a big ask, so Loki can be installed as a monolith, as microservices, or in a scalable setup. This is just about how the subservices of Loki are run (monolith is all-in-one and good for small setups, scalable is really just a scalable monolith, and microservices breaks each component out into its own independent service).
I’ll use the monolith mode for my use case. We can use MinIO for storage; I have a separate blog post on setting that up on a NAS if you want a guide.
That said, by default, when we enable MinIO in the helm values below, it adds its own single-pod MinIO instance.
I’ll create a loki_values.yaml - this enables MinIO but does not otherwise configure it
loki:
  commonConfig:
    replication_factor: 3
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  pattern_ingester:
    enabled: true
  limits_config:
    allow_structured_metadata: true
    volume_enabled: true
  ruler:
    enable_api: true

minio:
  enabled: true

deploymentMode: SingleBinary

singleBinary:
  replicas: 3

# Zero out replica counts of other deployment modes
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0

ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0
We can now install
$ helm install loki -n grafana grafana/loki -f loki_values.yaml
NAME: loki
LAST DEPLOYED: Fri Jul 11 06:33:43 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 1
NOTES:
***********************************************************************
Welcome to Grafana Loki
Chart version: 6.31.0
Chart Name: loki
Loki version: 3.5.0
***********************************************************************
** Please be patient while the chart is being deployed **
Tip:
Watch the deployment status using the command: kubectl get pods -w --namespace grafana
If pods are taking too long to schedule make sure pod affinity can be fulfilled in the current cluster.
***********************************************************************
Installed components:
***********************************************************************
* loki
Loki has been deployed as a single binary.
This means a single pod is handling reads and writes. You can scale that pod vertically by adding more CPU and memory resources.
***********************************************************************
Sending logs to Loki
***********************************************************************
Loki has been configured with a gateway (nginx) to support reads and writes from a single component.
You can send logs from inside the cluster using the cluster DNS:
http://loki-gateway.grafana.svc.cluster.local/loki/api/v1/push
You can test to send data from outside the cluster by port-forwarding the gateway to your local machine:
kubectl port-forward --namespace grafana svc/loki-gateway 3100:80 &
And then using http://127.0.0.1:3100/loki/api/v1/push URL as shown below:
---
curl -H "Content-Type: application/json" -XPOST -s "http://127.0.0.1:3100/loki/api/v1/push" \
--data-raw "{\"streams\": [{\"stream\": {\"job\": \"test\"}, \"values\": [[\"$(date +%s)000000000\", \"fizzbuzz\"]]}]}" \
-H X-Scope-OrgId:foo
---
Then verify that Loki did receive the data using the following command:
---
curl "http://127.0.0.1:3100/loki/api/v1/query_range" --data-urlencode 'query={job="test"}' -H X-Scope-OrgId:foo | jq .data.result
---
***********************************************************************
Connecting Grafana to Loki
***********************************************************************
If Grafana operates within the cluster, you'll set up a new Loki datasource by utilizing the following URL:
http://loki-gateway.grafana.svc.cluster.local/
***********************************************************************
Multi-tenancy
***********************************************************************
Loki is configured with auth enabled (multi-tenancy) and expects tenant headers (`X-Scope-OrgID`) to be set for all API calls.
You must configure Grafana's Loki datasource using the `HTTP Headers` section with the `X-Scope-OrgID` to target a specific tenant.
For each tenant, you can create a different datasource.
The agent of your choice must also be configured to propagate this header.
For example, when using Promtail you can use the `tenant` stage. https://grafana.com/docs/loki/latest/send-data/promtail/stages/tenant/
When not provided with the `X-Scope-OrgID` while auth is enabled, Loki will reject reads and writes with a 404 status code `no org id`.
You can also use a reverse proxy, to automatically add the `X-Scope-OrgID` header as suggested by https://grafana.com/docs/loki/latest/operations/authentication/
For more information, read our documentation about multi-tenancy: https://grafana.com/docs/loki/latest/operations/multi-tenancy/
> When using curl you can pass `X-Scope-OrgId` header using `-H X-Scope-OrgId:foo` option, where foo can be replaced with the tenant of your choice.
Let’s just do a quick test as it suggests
$ curl "http://loki-gateway.grafana.svc.cluster.local/loki/api/v1/query_range" --data-urlencode 'query={job="test"}' ...
I’ll do as it suggests. In one window
$ kubectl port-forward --namespace grafana svc/loki-gateway 3100:80
Forwarding from 127.0.0.1:3100 -> 8080
Forwarding from [::1]:3100 -> 8080
Handling connection for 3100
In another
$ curl -H "Content-Type: application/json" -XPOST -s "http://127.0.0.1:3100/loki/api/v1/push" \
> --data-raw "{\"streams\": [{\"stream\": {\"job\": \"test\"}, \"values\": [[\"$(date +%s)000000000\", \"fizzbuzz\"]]}]}" \
> -H X-Scope-OrgId:foo
Then ask for that back
$ curl "http://127.0.0.1:3100/loki/api/v1/query_range" --data-urlencode 'query={job="test"}' -H X-Scope-OrgId:foo | jq .data.result
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2853 0 2825 100 28 18708 185 --:--:-- --:--:-- --:--:-- 18769
[
{
"stream": {
"detected_level": "unknown",
"job": "test",
"service_name": "test"
},
"values": [
[
"1752234158000000000",
"fizzbuzz"
]
]
}
]
Connecting Loki and Prometheus to Grafana
To view our data, we first need to log in to Grafana. The admin password can be retrieved from a k8s secret
$ kubectl get secret -n grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode
AAdfsdfasdASasdsfasASDASdasa
To make it easier, I’ll fire up an external NodePort for Grafana as I did for Prometheus
$ kubectl expose service grafana -n grafana --type=NodePort --target-port=3000 --name=grafana-ex
service/grafana-ex exposed
$ kubectl get svc grafana-ex -n grafana
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana-ex NodePort 10.43.32.173 <none> 80:30168/TCP 32s
So now I can use port 30168 to get to Grafana without needing to leave a port-forward running. I can use any node in the cluster with that port to get there
I can set up the Prometheus data source and try to set up Loki using the UI
At this point I’m a bit stumped why Loki, even on a NodePort, is rejecting connections
$ curl "http://192.168.1.34:32151/loki/api/v1/query_range" --data-urlenco
de 'query={job="test"}' -H X-Scope-OrgId:foo
curl: (7) Failed to connect to 192.168.1.34 port 32151: Connection refused
It would seem a service-to-service setup was not going to cut it. I did some debugging on the side and came up with an external NodePort service that routes directly to the pods
$ cat loki-gateway-ext2
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: gateway
    app.kubernetes.io/instance: loki
  name: loki-gateway-ext2
  namespace: grafana
spec:
  ports:
  - name: http-metrics
    port: 80
    protocol: TCP
    targetPort: http-metrics
  selector:
    app.kubernetes.io/component: gateway
    app.kubernetes.io/instance: loki
    app.kubernetes.io/name: loki
  type: NodePort
$ kubectl apply -f ./loki-gateway-ext2
service/loki-gateway-ext2 created
$ kubectl get svc loki-gateway-ext2 -n grafana
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
loki-gateway-ext2 NodePort 10.43.53.51 <none> 80:30392/TCP 14s
This worked in a local test
$ curl "http://192.168.1.34:30392/loki/api/v1/query_range" --data-urlenco
de 'query={job="test"}' -H X-Scope-OrgId:foo
{"status":"success","data":{"resultType":"streams","result":[{"stream":{"detected_level":"unknown","job":"test","service_name":"test"},"values":[["1752234158000000000","fizzbuzz"]]}],"stats":{"summary":{"bytesProcessedPerSecond":509,"linesProcessedPerSecond":31,"totalBytesProcessed":32,"totalLinesProcessed":2,"execTime":0.062778,"queueTime":0.00424,"subqueries":0,"totalEntriesReturned":1,"splits":5,"shards":5,"totalPostFilterLines":2,"totalStructuredMetadataBytesProcessed":16},"querier":{"store":{"totalChunksRef":0,"totalChunksDownloaded":0,"chunksDownloadTime":0,"queryReferencedStructuredMetadata":false,"chunk":{"headChunkBytes":0,"headChunkLines":0,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":1,"postFilterLines":0,"headChunkStructuredMetadataBytes":0,"decompressedStructuredMetadataBytes":0},"chunkRefsFetchTime":0,"congestionControlLatency":0,"pipelineWrapperFilteredLines":0}},"ingester":{"totalReached":13,"totalChunksMatched":2,"totalBatches":12,"totalLinesSent":2,"store":{"totalChunksRef":0,"totalChunksDownloaded":0,"chunksDownloadTime":0,"queryReferencedStructuredMetadata":false,"chunk":{"headChunkBytes":32,"headChunkLines":2,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":0,"postFilterLines":2,"headChunkStructuredMetadataBytes":16,"decompressedStructuredMetadataBytes":0},"chunkRefsFetchTime":2643929,"congestionControlLatency":0,"pipelineWrapperFilteredLines":0}},"cache":{"chunk":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"index":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"result":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"statsResult":{"entriesFound":3,"entriesRequested":3,"entriesStored":1,"bytesReceived":522,"bytesSent":0,"requests":4,"downloadTime":3933691,"queryLengthServed":1382000000000},"volumeResult":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"seriesResult":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"labelResult":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0},"instantMetricResult":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0,"queryLengthServed":0}},"index":{"totalChunks":0,"postFilterChunks":0,"shardsDuration":0,"usedBloomFilters":false}}}}
To add this in Grafana, be aware we have to set the org ID header as well (just as we did with that test).
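I’ll do it through the UI (Connection URL plus an HTTP header), but if you’d rather provision the data source, a file along these lines should do it - a sketch, with the tenant matching the foo org from our test:

# loki-datasource.yaml (sketch) - provisioned Loki data source with the tenant header
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-gateway.grafana.svc.cluster.local
    jsonData:
      httpHeaderName1: "X-Scope-OrgID"
    secureJsonData:
      httpHeaderValue1: "foo"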
My goal is to sort this out with Loki and Prometheus. Let’s use Gemini CLI for this
It did as I suggested and looked up the local k8s services
It had figured out Loki, but had a placeholder for metrics. So I asked it to try a bit harder
It needed me to be a bit more explicit on the ask
I was surprised how thrown it got by that last ask… it kept chugging for a while
I finally stopped it after 3.5 minutes at that last step
This actually looks rough
$ cat alloy.config
logging {
  level = "info"
  format = "logfmt"
}

// Loki configuration: Collect logs from Kubernetes pods and send them to the local Loki instance.
loki.source.kubernetes "pods" {
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki.grafana.svc.cluster.local:3100/loki/api/v1/push"
  }
}

// Prometheus configuration: Scrape metrics from the local Prometheus server instance.
prometheus.scrape "default" {
  targets = [
    {"__address__" = "prometheus-server.grafana.svc.cluster.local:80", "job" = "prometheus"},
  ]
  forward_to = [prometheus.remote_write.default.receiver]
}

// Define where to send the scraped Prometheus metrics.
// TODO: Replace the placeholder URL with your actual metrics endpoint.
prometheus.remote_write "default" {
  endpoint {
    url = "http://prometheus-server.grafana.svc.cluster.local:80/api/v1/write"
  }
}
I’m going to pitch something similar to Claude Code
using “kubectl get svc -n grafana” to find the loki and prometheus services, please suggest a reasonable grafana alloy configuration (alloy.config) that would listen to OTLP and send metrics to Prometheus and logs to the Loki service. My goal is to listen to OTLP for metrics logs and traces and send metrics and logs through to prometheus and loki to view in Grafana.
Like Gemini CLI, Claude asks for some permissions to run kubectl commands
Initially it wanted to merge in new configs
But I told it that the local file is garbage and to overwrite it
It wrapped up
and all in, it cost less than a quarter, which isn’t too bad
Other than the fact it invented a Jaeger collector endpoint for traces (I’ll later use Zipkin), it did a good job
$ cat alloy.config
logging {
  level = "info"
  format = "logfmt"
}

// OTLP receiver for metrics, logs, and traces
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    metrics = [otelcol.processor.batch.default.input]
    logs = [otelcol.processor.batch.default.input]
    traces = [otelcol.processor.batch.default.input]
  }
}

// Batch processor for performance optimization
otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.prometheus.default.input]
    logs = [otelcol.exporter.loki.default.input]
    traces = [otelcol.exporter.otlp.traces.input]
  }
}

// Prometheus exporter for metrics
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.prometheus.receiver]
}

// Loki exporter for logs
otelcol.exporter.loki "default" {
  forward_to = [loki.write.loki.receiver]
}

// OTLP exporter for traces (configure endpoint as needed)
otelcol.exporter.otlp "traces" {
  client {
    endpoint = "http://jaeger-collector:14317"

    tls {
      insecure = true
    }
  }
}

// Prometheus remote write configuration
prometheus.remote_write "prometheus" {
  endpoint {
    url = "http://prometheus-server.grafana.svc.cluster.local/api/v1/write"
  }
}

// Loki write configuration
loki.write "loki" {
  endpoint {
    url = "http://loki-gateway.grafana.svc.cluster.local/loki/api/v1/push"
  }
}
Zipkin
Note to reader: I later decide to just use Tempo with Grafana, as the Alloy collector does NOT have a Zipkin exporter
In fact, let’s sort that out now:
I’ll fire off a quick deployment of zipkin
$ kubectl create deployment zipkin --image openzipkin/zipkin -n grafana
deployment.apps/zipkin created
Then expose it on a NodePort
$ kubectl expose deployment zipkin -n grafana --type=NodePort --target-port=9411 --port=9411 --name zipkin-svc-ext
service/zipkin-svc-ext exposed
$ kubectl get svc -n grafana zipkin-svc-ext
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
zipkin-svc-ext NodePort 10.43.91.230 <none> 9411:31568/TCP 21s
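A quick check that Zipkin is answering on that NodePort (node IP from my cluster) - this should return a JSON array of service names, empty until spans arrive:

$ curl -s http://192.168.1.33:31568/api/v2/services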
I’ll be quite exacting, but give Gemini a chance to fix up the file to switch traces from Jaeger to Zipkin
Oh good, it caught both the exporter type and output block which I might have missed:
Now my file looks like this:
$ cat alloy.config
logging {
  level = "info"
  format = "logfmt"
}

// OTLP receiver for metrics, logs, and traces
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    metrics = [otelcol.processor.batch.default.input]
    logs = [otelcol.processor.batch.default.input]
    traces = [otelcol.processor.batch.default.input]
  }
}

// Batch processor for performance optimization
otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.prometheus.default.input]
    logs = [otelcol.exporter.loki.default.input]
    traces = [otelcol.exporter.zipkin.default.input]
  }
}

// Prometheus exporter for metrics
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.prometheus.receiver]
}

// Loki exporter for logs
otelcol.exporter.loki "default" {
  forward_to = [loki.write.loki.receiver]
}

// Zipkin exporter for traces
otelcol.exporter.zipkin "default" {
  endpoint = "http://192.168.1.33:31568/api/v2/spans"
}

// Prometheus remote write configuration
prometheus.remote_write "prometheus" {
  endpoint {
    url = "http://prometheus-server.grafana.svc.cluster.local/api/v1/write"
  }
}

// Loki write configuration
loki.write "loki" {
  endpoint {
    url = "http://loki-gateway.grafana.svc.cluster.local/loki/api/v1/push"
  }
}
It’s minor, but before I go whacking the CM and rotating pods, I need to ensure Alloy doesn’t come back in and replace it later.
This means setting the “create” to “false” in the helm values
$ cat ../jekyll-blog/grafana.values.yaml
alloy:
  configMap:
    create: false
    key: config.alloy
    name: alloy-config
  extraPorts:
    - name: otelgrpc
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: otelhttp
      port: 4318
      protocol: TCP
      targetPort: 4318
    - name: zipkinhttp
      port: 9411
      protocol: TCP
      targetPort: 9411
$ helm upgrade -n grafana alloy -f ../jekyll-blog/grafana.values.yaml grafana/alloy
Release "alloy" has been upgraded. Happy Helming!
NAME: alloy
LAST DEPLOYED: Fri Jul 11 08:00:21 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Welcome to Grafana Alloy!
I can then create the alloy config from the file
$ kubectl create configmap -n grafana alloy-config "--from-file=config.alloy=./alloy.config"
configmap/alloy-config created
And see it created
$ kubectl get cm -n grafana
NAME DATA AGE
alloy-config 1 8s
grafana 1 39h
kube-root-ca.crt 1 334d
loki 1 29h
loki-gateway 1 29h
loki-minio 5 29h
loki-runtime 1 29h
prometheus-alertmanager 1 29h
prometheus-server 6 29h
I rotated the pods
builder@DESKTOP-QADGF36:~/Workspaces/alloy-setup$ kubectl get po -n grafana -l app.kubernetes.io/instance=alloy
NAME READY STATUS RESTARTS AGE
alloy-2h559 2/2 Running 0 39h
alloy-6cqcw 2/2 Running 0 39h
alloy-bbqvt 2/2 Running 0 39h
alloy-l467k 2/2 Running 0 39h
builder@DESKTOP-QADGF36:~/Workspaces/alloy-setup$ kubectl delete po -n grafana -l app.kubernetes.io/instance=alloy && sleep 10 && kubectl get po -n grafana -l app.kubernetes.io/instance=alloy
pod "alloy-2h559" deleted
pod "alloy-6cqcw" deleted
pod "alloy-bbqvt" deleted
pod "alloy-l467k" deleted
NAME READY STATUS RESTARTS AGE
alloy-cqfmv 1/2 CrashLoopBackOff 1 (8s ago) 13s
alloy-kdcft 1/2 CrashLoopBackOff 1 (5s ago) 12s
alloy-m6tqr 1/2 CrashLoopBackOff 1 (11s ago) 14s
alloy-qr5js 1/2 CrashLoopBackOff 1 (11s ago) 14s
But I saw there was an error in the logs
$ kubectl logs alloy-cqfmv -n grafana
Error: /etc/alloy/config.alloy:41:1: cannot find the definition of component name "otelcol.exporter.zipkin"
40 | // Zipkin exporter for traces
41 | otelcol.exporter.zipkin "default" {
| ^^^^^^^^^^^^^^^^^^^^^^^
42 | endpoint = "http://192.168.1.33:31568/api/v2/spans"
Error: /etc/alloy/config.alloy:26:16: component "otelcol.exporter.zipkin.default.input" does not exist or is out of scope
25 | logs = [otelcol.exporter.loki.default.input]
26 | traces = [otelcol.exporter.zipkin.default.input]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27 | }
interrupt received
Error: could not perform the initial load successfully
It would seem that, out of the box, Zipkin does not consume OTLP (though there are add-ons), so we cannot use otelcol.exporter.otlp either. Scrolling the list of exporters in the Grafana docs, at present there is no exporter for Zipkin (the format).
Tempo
I think that’s so you use one of their newer open-source projects, Tempo.
$ helm install tempo -n grafana grafana/tempo
NAME: tempo
LAST DEPLOYED: Sat Jul 12 12:03:03 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 1
TEST SUITE: None
OOTB, that should support OTLP:
receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_binary:
        endpoint: 0.0.0.0:6832
      thrift_compact:
        endpoint: 0.0.0.0:6831
      thrift_http:
        endpoint: 0.0.0.0:14268
  opencensus: null
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
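So the fix is to swap the Zipkin exporter for an OTLP one pointed at Tempo. A minimal sketch of the change - the otlphttp component name matches what shows up in the Alloy logs later, and the tempo.grafana.svc URL assumes the chart's default service exposes the OTLP HTTP port 4318:

// Batch processor now hands traces to the Tempo exporter
otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.prometheus.default.input]
    logs = [otelcol.exporter.loki.default.input]
    traces = [otelcol.exporter.otlphttp.tempo.input]
  }
}

// OTLP/HTTP exporter for traces, sending to Tempo in-cluster
otelcol.exporter.otlphttp "tempo" {
  client {
    endpoint = "http://tempo.grafana.svc.cluster.local:4318"
  }
}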
I can fix and rotate the pods
$ kubectl delete cm alloy-config -n grafana && kubectl create configmap -n grafana alloy-config "--from-file=config.alloy=./alloy.config" && kubectl get po -n grafana -l app.kubernetes.io/instance=alloy && kubectl delete po -n grafana -l app.kubernetes.io/instance=alloy && sleep 10 && kubectl get po -n grafana -l app.kubernetes.io/instance=alloy
configmap "alloy-config" deleted
configmap/alloy-config created
NAME READY STATUS RESTARTS AGE
alloy-cqfmv 1/2 CrashLoopBackOff 8 (4m28s ago) 20m
alloy-kdcft 1/2 CrashLoopBackOff 8 (4m6s ago) 20m
alloy-m6tqr 1/2 CrashLoopBackOff 8 (4m29s ago) 20m
alloy-qr5js 1/2 CrashLoopBackOff 8 (4m16s ago) 20m
pod "alloy-cqfmv" deleted
pod "alloy-kdcft" deleted
pod "alloy-m6tqr" deleted
pod "alloy-qr5js" deleted
NAME READY STATUS RESTARTS AGE
alloy-9mrxk 1/2 Running 0 12s
alloy-gktqn 1/2 Running 0 12s
alloy-tw728 1/2 Running 0 12s
alloy-vgjf7 1/2 Running 0 11s
I want a reachable port for Gemini to use outside the cluster, so I’m going to create a NodePort copy of the Alloy service, which is currently ClusterIP
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: networking
    app.kubernetes.io/instance: alloy
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: alloy
    app.kubernetes.io/part-of: alloy
    app.kubernetes.io/version: v1.9.2
    helm.sh/chart: alloy-1.1.2
  name: alloy-svc-nodeport
  namespace: grafana
spec:
  type: NodePort
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http-metrics
    port: 12345
    protocol: TCP
    targetPort: 12345
  - name: otelgrpc
    port: 4317
    protocol: TCP
    targetPort: 4317
  - name: otelhttp
    port: 4318
    protocol: TCP
    targetPort: 4318
  - name: zipkinhttp
    port: 9411
    protocol: TCP
    targetPort: 9411
  selector:
    app.kubernetes.io/instance: alloy
    app.kubernetes.io/name: alloy
  sessionAffinity: None
I didn’t set the nodePorts, expecting K3s to determine and assign them for me after I apply
$ kubectl apply -f ./alloy-svc-nodeport.yaml -n grafana
service/alloy-svc-nodeport created
$ kubectl get svc alloy-svc-nodeport -n grafana
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alloy-svc-nodeport NodePort 10.43.215.63 <none> 12345:30367/TCP,4317:30921/TCP,4318:31235/TCP,9411:30093/TCP 65s
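Before pointing Gemini at it, a quick reachability check on the OTLP HTTP NodePort (4318 mapped to 31235 above). Any HTTP status back - rather than a connection refused - tells us the port is wired through to Alloy:

$ curl -s -o /dev/null -w "%{http_code}\n" -X POST -H "Content-Type: application/json" -d '{}' http://192.168.1.33:31235/v1/logs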
Gemini to Alloy
While we can create a markdown file (GEMINI.md) locally or in a top-level ~/.gemini folder, we can also handle some settings in a settings.json file
builder@DESKTOP-QADGF36:~/Workspaces/pybsposter/.gemini$ ls
GEMINI.md
builder@DESKTOP-QADGF36:~/Workspaces/pybsposter/.gemini$ vi ~/.gemini/
google_account_id installation_id oauth_creds.json settings.json tmp/
Initially, all that is there is our theme and auth type. Now I’m going to add telemetry as well
$ cat ~/.gemini/settings.json
{
  "theme": "Shades Of Purple",
  "selectedAuthType": "oauth-personal",
  "telemetry": {
    "enabled": true,
    "target": "local",
    "otlpEndpoint": "http://192.168.1.33:30921",
    "logPrompts": true
  }
}
One other quick change I did was add “GOOGLE_CLOUD_PROJECT=myanthosproject2” to ~/.env so it would remember it between sessions.
Now I can fire up Gemini and make an ask
It proposed some fixes
Again, I just love making Gemini a surly angry developer - I asked for this in my Gemini.md, mind you
I see no metrics or logs in Grafana, so I checked the Alloy logs.
Debugging configuration issues
The first issue I see relates to Prometheus
$ kubectl logs alloy-vgjf7 -n grafana | tail -n 10
ts=2025-07-12T17:14:53.125345325Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=0e3dd0b6a561d7c554b39ebb13cfdc15 node_id=otelcol.exporter.otlphttp.tempo duration=43.161µs
ts=2025-07-12T17:14:53.125530754Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=0e3dd0b6a561d7c554b39ebb13cfdc15 node_id=otelcol.processor.batch.default duration=156.905µs
ts=2025-07-12T17:14:53.12565605Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=0e3dd0b6a561d7c554b39ebb13cfdc15 node_id=otelcol.receiver.otlp.default duration=90.33µs
ts=2025-07-12T17:14:53.125704882Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=0e3dd0b6a561d7c554b39ebb13cfdc15 node_id=tracing duration=20.037µs
ts=2025-07-12T17:14:53.125728566Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=0e3dd0b6a561d7c554b39ebb13cfdc15 duration=2.747739ms
ts=2025-07-12T17:14:53.125784542Z level=info msg="config reloaded" service=http
ts=2025-07-12T17:14:53.125968878Z level=info msg="scheduling loaded components and services"
ts=2025-07-12T18:11:43.013625508Z level=info msg="Done replaying WAL" component_path=/ component_id=prometheus.remote_write.prometheus subcomponent=rw remote_name=6cd6b9 url=http://prometheus-server.grafana.svc.cluster.local/api/v1/write duration=59m50.036241866s
ts=2025-07-12T18:11:43.641318664Z level=error msg="non-recoverable error" component_path=/ component_id=prometheus.remote_write.prometheus subcomponent=rw remote_name=6cd6b9 url=http://prometheus-server.grafana.svc.cluster.local/api/v1/write failedSampleCount=2 failedHistogramCount=0 failedExemplarCount=0 err="server returned HTTP status 404 Not Found: remote write receiver needs to be enabled with --web.enable-remote-write-receiver\n"
ts=2025-07-12T18:11:53.643882427Z level=error msg="non-recoverable error" component_path=/ component_id=prometheus.remote_write.prometheus subcomponent=rw remote_name=6cd6b9 url=http://prometheus-server.grafana.svc.cluster.local/api/v1/write failedSampleCount=20 failedHistogramCount=0 failedExemplarCount=0 err="server returned HTTP status 404 Not Found: remote write receiver needs to be enabled with --web.enable-remote-write-receiver\n"
Another issue comes from logs going to Loki and not having the Org ID
$ kubectl logs alloy-tw728 -n grafana | tail -n 10
ts=2025-07-12T18:17:31.561629028Z level=error msg="final error sending batch" component_path=/ component_id=loki.write.loki component=client host=loki-gateway.grafana.svc.cluster.local status=401 tenant="" error="server returned HTTP status 401 Unauthorized (401): no org id"
ts=2025-07-12T18:17:52.066427905Z level=error msg="final error sending batch" component_path=/ component_id=loki.write.loki component=client host=loki-gateway.grafana.svc.cluster.local status=401 tenant="" error="server returned HTTP status 401 Unauthorized (401): no org id"
The first can be done with values.
While Sonnet suggested
prometheus:
  server:
    extraFlags:
      - web.enable-remote-write-receiver
Reviewing the helm values (with --all), at least in the community chart, the flags just live under a top-level “server” block
I set some values and upgraded my Prometheus (keeping the existing web.enable-lifecycle flag I saw there by default)
$ cat ./values.yaml
server:
  extraFlags:
    - web.enable-lifecycle
    - web.enable-remote-write-receiver
$ helm upgrade prometheus -n grafana -f ./values.yaml prometheus-community/prometheus
Release "prometheus" has been upgraded. Happy Helming!
NAME: prometheus
LAST DEPLOYED: Sat Jul 12 13:35:06 2025
NAMESPACE: grafana
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.grafana.svc.cluster.local
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app.kubernetes.io/name=prometheus,app.kubernetes.io/instance=prometheus" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9090
The Prometheus alertmanager can be accessed via port 9093 on the following DNS name from within your cluster:
prometheus-alertmanager.grafana.svc.cluster.local
Get the Alertmanager URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app.kubernetes.io/name=alertmanager,app.kubernetes.io/instance=prometheus" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9093
#################################################################################
###### WARNING: Pod Security Policy has been disabled by default since #####
###### it deprecated after k8s 1.25+. use #####
###### (index .Values "prometheus-node-exporter" "rbac" #####
###### . "pspEnabled") with (index .Values #####
###### "prometheus-node-exporter" "rbac" "pspAnnotations") #####
###### in case you still need it. #####
#################################################################################
The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
prometheus-prometheus-pushgateway.grafana.svc.cluster.local
Get the PushGateway URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace grafana -l "app=prometheus-pushgateway,component=pushgateway" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace grafana port-forward $POD_NAME 9091
For more information on running Prometheus, visit:
https://prometheus.io/
The other fix was to add a header to the loki write block in the alloy.config
// Loki write configuration
loki.write "loki" {
  endpoint {
    url = "http://loki-gateway.grafana.svc.cluster.local/loki/api/v1/push"

    headers = {
      "X-Scope-OrgID" = "foo",
    }
  }
}
I then updated the config and rotated the Alloy pods to pull it in
builder@DESKTOP-QADGF36:~/Workspaces/alloy-setup$ kubectl delete cm alloy-config -n grafana && kubectl create configmap -n grafana alloy-config "--from-file=config.alloy=./alloy.config" && kubectl get po -n grafana -l app.kubernetes.io/instance=alloy && kubectl delete po -n grafana -l app.kubernetes.io/instance=alloy && sleep 10 && kubectl get po -n grafana -l app.kubernetes.io/instance=alloy
configmap "alloy-config" deleted
configmap/alloy-config created
NAME READY STATUS RESTARTS AGE
alloy-9mrxk 2/2 Running 0 85m
alloy-gktqn 2/2 Running 0 85m
alloy-tw728 2/2 Running 0 85m
alloy-vgjf7 2/2 Running 0 85m
pod "alloy-9mrxk" deleted
pod "alloy-gktqn" deleted
pod "alloy-tw728" deleted
pod "alloy-vgjf7" deleted
NAME READY STATUS RESTARTS AGE
alloy-6wknt 1/2 Running 0 11s
alloy-dpgx9 1/2 Running 0 12s
alloy-gsmng 2/2 Running 0 12s
alloy-sjz8b 2/2 Running 0 12s
I’ll ask it to tweak the README.md again
Now that it’s finished
As I make updates, I keep using Gemini to test.
Here I asked it to render out the diagrams
OpenTelemetry data in Grafana
Now that we have Gemini sending telemetry data to Alloy, which sends metrics to Prometheus, logs to Loki, and (hopefully) traces to Tempo, we should be able to view the metrics from Prometheus in Grafana itself.
For instance, using the query gemini_cli_token_usage_total{model="gemini-2.5-pro"} we can view our token usage for gemini-2.5-pro (versus flash) for the last hour
Or, perhaps I care to track how many of my daily 1000 requests I’ve used
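Gemini CLI also emits a request counter. Assuming it lands in Prometheus as gemini_cli_api_request_count_total (the same dots-to-underscores pattern as the token metric - that name is my assumption), something like this approximates how much of the daily 1,000-request allowance I’ve burned:

sum(increase(gemini_cli_api_request_count_total[24h]))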
Dashboards in Grafana
Let’s create a new Dashboard first
I’m just going to skip adding things here and save it
I’ll call it “Gemini Usage”
Back on the metrics explorer, I can add to a dashboard and select this existing one from the list
I’m going to tweak the graph a bit to sum by type so we pull all sessions together
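In PromQL terms that tweak is roughly the following, assuming the type label carries the token categories (input, output, etc.) the CLI exports:

sum by (type) (gemini_cli_token_usage_total)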
Loki (Logs) data
We can see Info and Errors by volume via our Loki data source
Or log lines returned
I think this is a pretty good dashboard now for viewing my Gemini Usage, at least when local
If I wanted to, I would just need to punch a hole in my firewall to allow traffic in to the 192.168.1.33 node (or any node in the cluster) on port 30921, and I could keep sending OTLP data when I’m remote
$ cat ~/.gemini/settings.json
{
  "theme": "Shades Of Purple",
  "selectedAuthType": "oauth-personal",
  "telemetry": {
    "enabled": true,
    "target": "local",
    "otlpEndpoint": "http://192.168.1.33:30921",
    "logPrompts": true
  }
}
Tempo
I added the tempo http listen port
$ helm get values --all tempo -n grafana | grep listen
http_listen_port: 3200
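That 3200 port is what the Grafana data source needs. If you’d rather provision it than click through the UI, a sketch (assuming the chart’s default tempo service name):

# tempo-datasource.yaml (sketch)
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo.grafana.svc.cluster.local:3200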
I could view a trace with some poking. I didn’t see much by way of details, so I didn’t add it to my dashboard.
To my eyes, it seemed like just a simple OAuth flow
Summary
Let’s review the setup here.
First, we set up the OpenTelemetry collector (Alloy) with a config that sends metrics to Prometheus, traces to Tempo, and logs to Loki, then brings them all together in Grafana for reporting and visualization.
In the end, we set it up so we could collect data and view it all in a nice Grafana dashboard
Next Steps
If we wished to set up SMTP to use something like Resend or SendGrid, we would update the email settings under General settings
Then define some alerting rules which could trigger off our metrics
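For example, a Prometheus-style rule that fires as I approach the daily request cap might look like the sketch below (same assumed metric name as earlier; with the prometheus-community chart this could go under serverFiles.alerting_rules.yml, or be recreated as a Grafana-managed alert):

groups:
  - name: gemini-usage
    rules:
      - alert: GeminiDailyRequestBudgetNearlyUsed
        expr: sum(increase(gemini_cli_api_request_count_total[24h])) > 900
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Gemini CLI has used over 900 of its 1000 daily requests"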