Instrumenting Go Services with Prometheus: The Right Way
Introduction
Logs tell you what happened. Metrics tell you how often it happened and how long it took.
Here's the thing: logs are great for debugging specific issues, but they're terrible for answering questions like:
How many requests per second are we handling?
What's our 99th percentile latency?
How many errors happened in the last hour?
For those questions, you need metrics. And in the Go world, Prometheus is the de facto standard.
In this post, I'll show you how to instrument your Go services with Prometheus. We'll cover the core metric types, how to instrument HTTP handlers, and some real-world patterns I use in production.
If you're running microservices and you're not collecting metrics yet, this is your wake-up call.
Why Prometheus?
There are other metrics systems out there—Datadog, New Relic, CloudWatch—but Prometheus has some unique advantages:
Pull-based: Services expose metrics, Prometheus scrapes them. Simple, stateless, easy to debug.
Open-source: No vendor lock-in, runs anywhere.
Powerful queries: PromQL lets you slice metrics however you want.
Ecosystem: Grafana integration, alerting, tons of exporters.
Plus, it's the industry standard for cloud-native apps. Learn it once, use it everywhere.
Core Metric Types
Prometheus has four metric types. Knowing when to use each one is critical.
Counter
A counter only goes up. Think "total requests", "total errors", "total bytes sent".
var requestsTotal = promauto.NewCounter(prometheus.CounterOpts{
	Name: "http_requests_total",
	Help: "Total HTTP requests",
})
// In your handler:
requestsTotal.Inc()
A counter's raw value is rarely interesting on its own. What you usually query is how fast it grows, using the rate() function:
rate(http_requests_total[5m]) # requests per second over last 5 min
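To make the rate() idea concrete, here is a rough sketch of the arithmetic in plain Go. The sample type and perSecondRate function are names I made up for illustration, and this is a simplification: the real rate() also extrapolates to the edges of the window, which the sketch skips.

```go
package main

import "fmt"

// sample is one scraped counter value (hypothetical type for this sketch).
type sample struct {
	t float64 // scrape timestamp in seconds
	v float64 // counter value at that time
}

// perSecondRate sums the increases between consecutive samples and divides
// by the elapsed time. A drop in value is treated as a counter reset (for
// example a process restart), so the post-reset value counts as the increase.
// Unlike Prometheus, this sketch does not extrapolate to the window edges.
func perSecondRate(samples []sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	var increase float64
	for i := 1; i < len(samples); i++ {
		prev, cur := samples[i-1].v, samples[i].v
		if cur < prev {
			increase += cur // counter restarted from zero
		} else {
			increase += cur - prev
		}
	}
	elapsed := samples[len(samples)-1].t - samples[0].t
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}

func main() {
	// Scrapes every 15s; the service restarts between t=30 and t=45.
	samples := []sample{{0, 100}, {15, 130}, {30, 160}, {45, 20}, {60, 60}}
	fmt.Printf("%.2f requests/second\n", perSecondRate(samples)) // prints "2.00 requests/second"
}
```

The restart in the middle does not produce a negative rate, which is exactly why totals belong in counters.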
Gauge
A gauge can go up or down. Think "active connections", "memory usage", "queue size".
var activeConnections = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "http_active_connections",
	Help: "Current active HTTP connections",
})
// When connection opens:
activeConnections.Inc()
// When connection closes:
activeConnections.Dec()
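One way to keep the Inc/Dec pair balanced is to put both in a middleware and let defer handle the decrement. The sketch below tracks in-flight requests rather than raw connections, and it uses a small gauge interface plus an atomicGauge stand-in (both hypothetical names) so it runs without the Prometheus library. A real prometheus.Gauge has the same Inc and Dec methods, so it should slot in.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// gauge is the subset of prometheus.Gauge this sketch needs.
type gauge interface {
	Inc()
	Dec()
}

// atomicGauge is a hypothetical stand-in so the example runs on the stdlib alone.
type atomicGauge struct{ n int64 }

func (g *atomicGauge) Inc()         { atomic.AddInt64(&g.n, 1) }
func (g *atomicGauge) Dec()         { atomic.AddInt64(&g.n, -1) }
func (g *atomicGauge) Value() int64 { return atomic.LoadInt64(&g.n) }

// trackInFlight increments the gauge when a request starts and uses defer
// so the decrement also runs if the handler returns early or panics.
func trackInFlight(g gauge, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		g.Inc()
		defer g.Dec()
		next.ServeHTTP(w, r)
	})
}

// demo serves one in-memory request and reports the gauge value seen while
// the handler was running and after it returned.
func demo() (during, after int64) {
	g := &atomicGauge{}
	h := trackInFlight(g, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = g.Value()
	}))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return during, g.Value()
}

func main() {
	during, after := demo()
	fmt.Println(during, after) // prints "1 0"
}
```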
Histogram
A histogram samples observations (usually durations or sizes) and counts them in buckets. Think "request duration", "response size".
var requestDuration = promauto.NewHistogram(prometheus.HistogramOpts{
	Name:    "http_request_duration_seconds",
	Help:    "HTTP request duration",
	Buckets: prometheus.DefBuckets, // 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10
})
// In your handler:
start := time.Now()
// ... do work ...
requestDuration.Observe(time.Since(start).Seconds())
Histograms let you calculate percentiles:
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) # p99 latency
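If the bucket mechanics feel abstract, this stdlib-only sketch may help. It keeps cumulative bucket counts the way the _bucket series do and estimates a quantile by linear interpolation inside the matching bucket. The histogram type and its methods are my own illustrative names, and the interpolation is a simplified version of what histogram_quantile does, not a copy of the Prometheus implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// histogram keeps one cumulative count per upper bound, like the
// *_bucket{le="..."} series, plus a total that plays the role of le="+Inf".
type histogram struct {
	bounds []float64 // sorted upper bounds
	counts []uint64  // counts[i] = observations <= bounds[i]
	total  uint64
}

func newHistogram(bounds []float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds))}
}

// observe increments every bucket whose upper bound is >= v.
func (h *histogram) observe(v float64) {
	first := sort.SearchFloat64s(h.bounds, v) // first index with bounds[i] >= v
	for i := first; i < len(h.counts); i++ {
		h.counts[i]++
	}
	h.total++
}

// quantile estimates the q-th quantile by linear interpolation inside the
// first bucket whose cumulative count reaches the target rank.
func (h *histogram) quantile(q float64) float64 {
	if h.total == 0 || len(h.bounds) == 0 {
		return 0
	}
	rank := q * float64(h.total)
	for i, c := range h.counts {
		if float64(c) < rank {
			continue
		}
		lower, below := 0.0, 0.0
		if i > 0 {
			lower, below = h.bounds[i-1], float64(h.counts[i-1])
		}
		inBucket := float64(c) - below
		if inBucket == 0 {
			return lower
		}
		return lower + (h.bounds[i]-lower)*(rank-below)/inBucket
	}
	// The rank falls in the +Inf bucket: report the highest finite bound.
	return h.bounds[len(h.bounds)-1]
}

func main() {
	h := newHistogram([]float64{0.1, 0.5, 1})
	for _, v := range []float64{0.05, 0.05, 0.3, 0.3, 0.3, 0.3, 0.7, 0.7, 0.7, 0.7} {
		h.observe(v)
	}
	fmt.Println("cumulative counts:", h.counts)
	fmt.Printf("p50=%.4f p99=%.4f\n", h.quantile(0.5), h.quantile(0.99))
}
```

Notice that the p99 estimate lands near 0.99 even though the slowest observation was 0.7. The estimate can only be as precise as the bucket it falls into, which is why bucket boundaries matter.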
Summary
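A summary is similar to a histogram but calculates quantiles on the client side. Those precomputed quantiles can't be meaningfully combined across instances, so I rarely use summaries. Histogram buckets can be summed across instances before the quantile is computed, which makes histograms more flexible.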
Instrumenting HTTP Handlers
Here's a complete example of instrumenting an HTTP service:
package main
import (
	"log"
	"net/http"
	"strconv"
	"time"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)
var (
	httpRequestsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total HTTP requests by path and status",
		},
		[]string{"path", "status"},
	)
	httpDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "http_request_duration_seconds",
			Help:    "HTTP request duration by path",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"path"},
	)
)
func main() {
	// Expose /metrics endpoint
	http.Handle("/metrics", promhttp.Handler())
	// Your API handlers
	http.HandleFunc("/", metricsMiddleware(helloHandler))
	http.HandleFunc("/api/orders", metricsMiddleware(ordersHandler))
	// ListenAndServe only returns on failure, so log the error and exit
	log.Fatal(http.ListenAndServe(":8080", nil))
}
func metricsMiddleware(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// Wrap the ResponseWriter to capture status code
		ww := &responseWriter{ResponseWriter: w, statusCode: http.StatusOK}
		// Call the actual handler
		next.ServeHTTP(ww, r)
		// Record metrics after handler completes
		duration := time.Since(start).Seconds()
		// r.URL.Path is only a safe label while the set of paths is small and fixed
		httpDuration.WithLabelValues(r.URL.Path).Observe(duration)
		// Use the numeric code ("200", "500") so selectors like status=~"5.." match
		httpRequestsTotal.WithLabelValues(r.URL.Path, strconv.Itoa(ww.statusCode)).Inc()
	}
}
// Wrapper to capture status code
type responseWriter struct {
	http.ResponseWriter
	statusCode int
}
func (rw *responseWriter) WriteHeader(code int) {
	rw.statusCode = code
	rw.ResponseWriter.WriteHeader(code)
}
func helloHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello!"))
}
func ordersHandler(w http.ResponseWriter, r *http.Request) {
	// simulate some work
	time.Sleep(50 * time.Millisecond)
	w.Write([]byte(`{"orders": []}`))
}
Now if you hit http://localhost:8080/metrics, you'll see:
# HELP http_requests_total Total HTTP requests by path and status
# TYPE http_requests_total counter
http_requests_total{path="/",status="200"} 42
http_requests_total{path="/api/orders",status="200"} 17
# HELP http_request_duration_seconds HTTP request duration by path
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{path="/",le="0.005"} 40
http_request_duration_seconds_bucket{path="/",le="0.01"} 42
http_request_duration_seconds_sum{path="/"} 0.123
http_request_duration_seconds_count{path="/"} 42
That's what Prometheus scrapes every 15 seconds (or whatever interval you configure).
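The status-capturing wrapper is the part of that middleware most worth checking in isolation, because a handler that never calls WriteHeader still sends a 200. Here is a small sketch using net/http/httptest. The statusRecorder type mirrors the responseWriter above, and capturedStatus, okHandler and failingHandler are hypothetical helpers for the demo.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusRecorder mirrors the responseWriter wrapper: it remembers the code
// passed to WriteHeader and starts out at 200.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// capturedStatus runs a handler against an in-memory request and returns
// the status code the wrapper recorded.
func capturedStatus(h http.HandlerFunc) int {
	rec := &statusRecorder{ResponseWriter: httptest.NewRecorder(), status: http.StatusOK}
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.status
}

// okHandler writes a body without calling WriteHeader, so the default applies.
func okHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello!"))
}

// failingHandler sets an explicit 500 through http.Error.
func failingHandler(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "boom", http.StatusInternalServerError)
}

func main() {
	fmt.Println(capturedStatus(okHandler), capturedStatus(failingHandler)) // prints "200 500"
}
```

One limitation to be aware of: a wrapper like this hides optional interfaces such as http.Flusher on the underlying writer, so streaming handlers may need extra care.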
Setting Up Prometheus
Create a prometheus.yml config file:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'my-go-app'
    static_configs:
      - targets: ['localhost:8080']
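Run Prometheus with Docker:
docker run -p 9090:9090 \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus
One catch: inside the container, localhost is the container itself, not your machine. If the Go service runs directly on the host, change the target to host.docker.internal:8080 (Docker Desktop) or add --network host to the command (Linux).
Now open http://localhost:9090 and you can query your metrics.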
Useful PromQL Queries
Here are the queries I use most often:
Requests per second
rate(http_requests_total[5m])
Error rate (assuming 5xx = errors)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
99th percentile latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
50th percentile (median)
histogram_quantile(0.5, rate(http_request_duration_seconds_bucket[5m]))
Top 5 slowest endpoints
topk(5, histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])))
Requests by status code
sum by (status) (rate(http_requests_total[5m]))
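The error-rate query is the one I see misread most often, so here is the same arithmetic as a plain Go sketch. The errorRatio function and the sample rates are made up for illustration. Note that the selector only matches when the status label holds the numeric code, such as "500"; a label value like "Internal Server Error" would never match.

```go
package main

import (
	"fmt"
	"regexp"
)

// serverError matches the same label values as status=~"5.." in PromQL,
// which anchors regular expressions at both ends.
var serverError = regexp.MustCompile(`^5..$`)

// errorRatio takes per-status request rates (requests/second keyed by the
// status label) and returns the fraction that are 5xx responses.
// It returns 0 when there is no traffic; PromQL would yield NaN instead.
func errorRatio(rateByStatus map[string]float64) float64 {
	var errs, total float64
	for status, r := range rateByStatus {
		total += r
		if serverError.MatchString(status) {
			errs += r
		}
	}
	if total == 0 {
		return 0
	}
	return errs / total
}

func main() {
	numeric := map[string]float64{"200": 9.5, "404": 0.3, "500": 0.15, "503": 0.05}
	fmt.Printf("error ratio: %.3f\n", errorRatio(numeric))

	// Text labels never match the 5.. pattern, so errors go unnoticed.
	text := map[string]float64{"OK": 9.8, "Internal Server Error": 0.2}
	fmt.Printf("error ratio with text labels: %.3f\n", errorRatio(text))
}
```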
Best Practices
Use Labels Wisely
Labels are powerful, but they can explode your cardinality if you're not careful.
Good labels:
path (limited set of routes)
status (limited set of HTTP codes)
method (GET, POST, PUT, DELETE)
Bad labels:
user_id (unbounded, could be millions)
request_id (unique per request)
trace_id (unique per request)
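Every unique combination of label values is stored as its own time series, so a single high-cardinality label can multiply your series count until Prometheus runs out of memory.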
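A quick back-of-the-envelope calculation shows how fast this grows. The sketch below is plain arithmetic, with function names and label counts I picked for illustration. It assumes the worst case where every label combination actually occurs, and it counts a histogram as one series per finite bucket plus the le="+Inf" bucket, _sum and _count.

```go
package main

import "fmt"

// seriesPerHistogram is how many time series one label combination produces
// for a histogram: one per finite bucket, plus le="+Inf", _sum and _count.
func seriesPerHistogram(buckets int) int {
	return buckets + 3
}

// labelCombinations multiplies the number of distinct values per label,
// which is the worst case where every combination shows up.
func labelCombinations(valuesPerLabel map[string]int) int {
	n := 1
	for _, v := range valuesPerLabel {
		n *= v
	}
	return n
}

func main() {
	perCombo := seriesPerHistogram(11) // the default buckets have 11 bounds

	bounded := map[string]int{"method": 4, "path": 20}
	fmt.Println("bounded labels:", labelCombinations(bounded)*perCombo) // prints "bounded labels: 1120"

	// Hypothetical: the same histogram with a user_id label for 100,000 users.
	unbounded := map[string]int{"method": 4, "path": 20, "user_id": 100000}
	fmt.Println("with user_id:", labelCombinations(unbounded)*perCombo) // prints "with user_id: 112000000"
}
```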
Choose the Right Histogram Buckets
The default buckets (prometheus.DefBuckets) work for most cases:
[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
But if your service has different characteristics, customize them:
Buckets: []float64{0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1} // for fast APIs
Buckets: []float64{1, 5, 10, 30, 60, 120} // for slow background jobs
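You don't have to type bucket lists by hand. client_golang ships helpers such as prometheus.ExponentialBuckets and prometheus.LinearBuckets for this. The sketch below is my own stdlib-only imitation of the exponential one (exponentialBuckets is a hypothetical name), just to show what kind of slice you end up with.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor each step. It is a simplified imitation of the
// library helper and skips input validation.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, 0, count)
	for i := 0; i < count; i++ {
		buckets = append(buckets, start)
		start *= factor
	}
	return buckets
}

func main() {
	// Ten bounds from 1ms up to 512ms, doubling each time.
	fmt.Println(exponentialBuckets(0.001, 2, 10))
}
```

Exponential spacing gives you fine resolution at the fast end, where most requests land, without needing dozens of buckets to cover the slow tail.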
Don't Instrument Everything
More metrics = more noise. Focus on what matters:
Request rate, latency, errors (the RED method)
Resource usage (CPU, memory, connections)
Business metrics (orders processed, payments completed)
Skip stuff like "function X was called"—that's what logs and tracing are for.
Use Counter, Not Gauge, for Totals
Common mistake: using a gauge for something that should be a counter.
Bad:
totalRequests = promauto.NewGauge(...)
totalRequests.Inc()
Good:
totalRequests = promauto.NewCounter(...)
totalRequests.Inc()
Counters are designed for this. rate() and increase() detect counter resets (for example, after a restart) and compensate for them, which they can't do for a gauge.
Real-World Example
Here's a more complete example from one of my production services:
package metrics
import (
	"time"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)
var (
	RequestsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "api_requests_total",
			Help: "Total API requests",
		},
		[]string{"method", "path", "status"},
	)
	RequestDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "api_request_duration_seconds",
			Help:    "API request duration",
			Buckets: []float64{0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5},
		},
		[]string{"method", "path"},
	)
	DatabaseQueriesTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "db_queries_total",
			Help: "Total database queries",
		},
		[]string{"query_type", "status"},
	)
	DatabaseQueryDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "db_query_duration_seconds",
			Help:    "Database query duration",
			Buckets: []float64{0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1},
		},
		[]string{"query_type"},
	)
	CacheHitsTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "cache_hits_total",
		Help: "Total cache hits",
	})
	CacheMissesTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "cache_misses_total",
		Help: "Total cache misses",
	})
)
// Track database query
func TrackDBQuery(queryType string, fn func() error) error {
	start := time.Now()
	err := fn()
	duration := time.Since(start).Seconds()
	status := "success"
	if err != nil {
		status = "error"
	}
	DatabaseQueriesTotal.WithLabelValues(queryType, status).Inc()
	DatabaseQueryDuration.WithLabelValues(queryType).Observe(duration)
	return err
}
Usage:
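func GetOrder(ctx context.Context, orderID string) (*Order, error) {
	var order Order
	err := metrics.TrackDBQuery("get_order", func() error {
		// Scan needs one destination per selected column. ID and Status are
		// placeholder fields, so adjust them to your Order struct.
		return db.QueryRowContext(ctx, "SELECT id, status FROM orders WHERE id = ?", orderID).
			Scan(&order.ID, &order.Status)
	})
	if err != nil {
		return nil, err
	}
	return &order, nil
}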
This gives you visibility into database performance, cache hit rates, API latency—everything you need to understand how your service is performing.
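The cache counters are declared above but never incremented, so here is a rough sketch of how a lookup could feed them. Everything in it is illustrative: countingCache, the counter interface and fakeCounter are names I invented so the example runs on the standard library. Since a prometheus.Counter has an Inc method, CacheHitsTotal and CacheMissesTotal should be usable in place of the fakes.

```go
package main

import "fmt"

// counter is the one method of prometheus.Counter this sketch needs.
type counter interface{ Inc() }

// fakeCounter is a hypothetical in-memory stand-in. It is not safe for
// concurrent use; the real Prometheus counters are.
type fakeCounter struct{ n int }

func (c *fakeCounter) Inc() { c.n++ }

// countingCache wraps a map-backed cache and records each lookup as a hit
// or a miss.
type countingCache struct {
	data         map[string]string
	hits, misses counter
}

// Get returns the cached value and bumps the matching counter.
func (c *countingCache) Get(key string) (string, bool) {
	v, ok := c.data[key]
	if ok {
		c.hits.Inc()
	} else {
		c.misses.Inc()
	}
	return v, ok
}

// demo performs three hits and one miss and returns the counter values.
func demo() (hits, misses int) {
	h, m := &fakeCounter{}, &fakeCounter{}
	cache := &countingCache{
		data:   map[string]string{"order:1": "pending"},
		hits:   h,
		misses: m,
	}
	for _, key := range []string{"order:1", "order:1", "order:2", "order:1"} {
		cache.Get(key)
	}
	return h.n, m.n
}

func main() {
	hits, misses := demo()
	fmt.Printf("hits=%d misses=%d\n", hits, misses) // prints "hits=3 misses=1"
}
```

With both counters in place, the hit ratio is hits divided by hits plus misses, computed over rate() of each counter.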
Integrating with Grafana
Prometheus is great for querying, but Grafana is better for dashboards.
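Run Grafana:
docker run -d -p 3000:3000 grafana/grafana
Add Prometheus as a data source (http://localhost:9090). If Grafana runs in Docker as shown, localhost refers to the Grafana container, so use an address the container can reach, such as http://host.docker.internal:9090 on Docker Desktop.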
Create a dashboard with panels like:
Request rate (line graph)
Error rate (single stat)
P50/P99 latency (line graph)
Request breakdown by path (pie chart)
Now you have a real-time dashboard showing how your service is performing.
Common Pitfalls
High cardinality: Don't use unbounded labels (user IDs, request IDs).
Too many metrics: Focus on what's actionable.
Wrong metric type: Use counters for totals, histograms for durations.
Forgetting to expose /metrics: Prometheus can't scrape if the endpoint isn't exposed.
Not testing scraping: Use curl http://localhost:8080/metrics to verify.
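For the high-cardinality pitfall, a common mitigation when a path label comes from the raw request URL is to collapse identifiers before using the path as a label value. The sketch below is one simple way to do that. normalizePath and isID are hypothetical helpers, and they only handle purely numeric segments; UUIDs or slugs would need their own rule, and using your router's route pattern is usually the cleaner option when it's available.

```go
package main

import (
	"fmt"
	"strings"
)

// isID reports whether a path segment consists only of ASCII digits.
func isID(seg string) bool {
	if seg == "" {
		return false
	}
	for _, r := range seg {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// normalizePath replaces numeric segments with ":id" so that /api/orders/1
// and /api/orders/2 share a single label value.
func normalizePath(p string) string {
	segs := strings.Split(p, "/")
	for i, s := range segs {
		if isID(s) {
			segs[i] = ":id"
		}
	}
	return strings.Join(segs, "/")
}

func main() {
	for _, p := range []string{"/", "/api/orders", "/api/orders/42", "/api/orders/42/items/7"} {
		fmt.Printf("%-24s -> %s\n", p, normalizePath(p))
	}
}
```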
Wrapping Up
Metrics are essential for running production services. They let you answer questions like "is the service slow?" or "are we seeing more errors?" in seconds instead of hours.
Start simple: instrument your HTTP handlers with request count and duration. Once that's working, add database metrics, cache metrics, business metrics. Build it incrementally.
Next up, I'll cover distributed tracing with OpenTelemetry—because metrics tell you what is slow, but tracing tells you why.
Questions? Drop a comment. Always happy to talk about observability.
Thanks for reading! This is part of my series on building production-ready observability in Go. Follow along for more posts on tracing, logging, and alerting.