Backend · 2025-03-15 · 12 min read

Go + gRPC: Building Microservices That Handle Millions of Requests Efficiently

How to combine Go's concurrency model with gRPC's binary protocol for microservices that are fast, observable, and resilient—with practical patterns for service discovery, circuit breaking, and distributed tracing.

Go · gRPC · Microservices · Performance · Backend

Why Go and gRPC Belong Together

Go was built for network services. Its goroutine model handles 100,000+ concurrent connections on a single process while using a fraction of the memory a comparable Java or Node.js service would need. gRPC provides the communication layer: binary Protocol Buffer serialisation over HTTP/2, generating type-safe clients in any language from a single .proto file.

Together, they address the two biggest costs in high-throughput microservices: CPU time serialising/deserialising data (gRPC's Protobuf is 3–5× more efficient than JSON), and the overhead of managing thousands of simultaneous connections (Go's runtime scheduler does this natively).

Defining Your Service Contract

Everything in gRPC starts with a .proto file. This is your API contract — auto-generate clients in Go, TypeScript, Python, or a dozen other languages from the same source of truth:

syntax = "proto3";
package catalogue.v1;

import "google/protobuf/timestamp.proto";

option go_package = "github.com/org/services/catalogue/v1";

service CatalogueService {
  rpc GetProduct(GetProductRequest) returns (Product);
  rpc ListProducts(ListProductsRequest) returns (stream Product);
  rpc SearchProducts(SearchRequest) returns (SearchResponse);
}

message GetProductRequest {
  string product_id = 1;
}

message ListProductsRequest {
  int32 page_size = 1;
  string page_token = 2;
  string category_id = 3;
}

message Product {
  string id = 1;
  string name = 2;
  string description = 3;
  double price_eur = 4;
  int32 stock_qty = 5;
  repeated string image_urls = 6;
  google.protobuf.Timestamp created_at = 7;
}

// Minimal search messages so the contract compiles; shape these to your needs.
message SearchRequest {
  string query = 1;
  int32 page_size = 2;
}

message SearchResponse {
  repeated Product products = 1;
  int32 total_hits = 2;
}

Note ListProducts uses server-side streaming — the server sends a stream of Product messages, ideal for large paginated datasets without holding everything in memory.
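On the client side, consuming that stream is a receive loop that runs until io.EOF. A sketch against the generated client (assuming the pb package from the proto above):

```go
// fetchCategory drains the ListProducts server stream into a slice.
func fetchCategory(
    ctx context.Context,
    client pb.CatalogueServiceClient,
    categoryID string,
) ([]*pb.Product, error) {
    stream, err := client.ListProducts(ctx, &pb.ListProductsRequest{
        CategoryId: categoryID,
        PageSize:   100,
    })
    if err != nil {
        return nil, err
    }

    var products []*pb.Product
    for {
        p, err := stream.Recv()
        if errors.Is(err, io.EOF) {
            return products, nil // server finished the stream
        }
        if err != nil {
            return nil, err // transport or server error mid-stream
        }
        products = append(products, p)
    }
}
```

For very large result sets you would typically process each message as it arrives instead of accumulating a slice; that is the point of streaming.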

Implementing the Server

package main

import (
    "context"
    "errors"
    "log/slog"
    "net"
    "os"

    "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
    "google.golang.org/grpc"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"

    pb "github.com/org/services/catalogue/v1"
)

type catalogueServer struct {
    pb.UnimplementedCatalogueServiceServer
    repo ProductRepository
}

func (s *catalogueServer) GetProduct(
    ctx context.Context,
    req *pb.GetProductRequest,
) (*pb.Product, error) {
    if req.ProductId == "" {
        return nil, status.Error(codes.InvalidArgument, "product_id is required")
    }

    product, err := s.repo.FindByID(ctx, req.ProductId)
    if err != nil {
        if errors.Is(err, ErrNotFound) {
            return nil, status.Errorf(codes.NotFound, "product %q not found", req.ProductId)
        }
        slog.ErrorContext(ctx, "failed to fetch product", "id", req.ProductId, "err", err)
        return nil, status.Error(codes.Internal, "internal error")
    }

    return toProto(product), nil
}

func (s *catalogueServer) ListProducts(
    req *pb.ListProductsRequest,
    stream pb.CatalogueService_ListProductsServer,
) error {
    products, err := s.repo.List(stream.Context(), req.CategoryId, req.PageSize)
    if err != nil {
        return status.Error(codes.Internal, "failed to list products")
    }

    for _, p := range products {
        if err := stream.Send(toProto(p)); err != nil {
            return err  // Client disconnected
        }
    }
    return nil
}

func main() {
    lis, err := net.Listen("tcp", ":50051")
    if err != nil {
        slog.Error("failed to listen", "err", err)
        os.Exit(1)
    }

    s := grpc.NewServer(
        grpc.ChainUnaryInterceptor(
            otelgrpc.UnaryServerInterceptor(),   // Distributed tracing
            recoveryInterceptor(),               // Panic recovery
            loggingInterceptor(),                // Structured logging
        ),
        grpc.ChainStreamInterceptor(
            otelgrpc.StreamServerInterceptor(),
        ),
    )

    pb.RegisterCatalogueServiceServer(s, &catalogueServer{repo: newPostgresRepo()})

    slog.Info("gRPC server listening", "addr", ":50051")
    if err := s.Serve(lis); err != nil {
        slog.Error("server failed", "err", err)
    }
}

Always embed UnimplementedCatalogueServiceServer — it future-proofs your server against new RPC methods added to the proto without breaking existing deployments.
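main wires in recoveryInterceptor and loggingInterceptor without showing them; they are ordinary grpc.UnaryServerInterceptor values. A minimal sketch of both (the exact log fields are illustrative, not prescriptive):

```go
// recoveryInterceptor converts a panic in a handler into codes.Internal
// instead of tearing down the whole process.
func recoveryInterceptor() grpc.UnaryServerInterceptor {
    return func(
        ctx context.Context,
        req any,
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (resp any, err error) {
        defer func() {
            if r := recover(); r != nil {
                slog.ErrorContext(ctx, "panic in handler",
                    "method", info.FullMethod, "panic", r)
                err = status.Error(codes.Internal, "internal error")
            }
        }()
        return handler(ctx, req)
    }
}

// loggingInterceptor emits one structured line per RPC with method,
// duration, and gRPC status code.
func loggingInterceptor() grpc.UnaryServerInterceptor {
    return func(
        ctx context.Context,
        req any,
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (any, error) {
        start := time.Now()
        resp, err := handler(ctx, req)
        slog.InfoContext(ctx, "rpc handled",
            "method", info.FullMethod,
            "duration", time.Since(start),
            "code", status.Code(err).String(),
        )
        return resp, err
    }
}
```

Interceptor order matters: recovery should sit inside tracing so a recovered panic is still recorded on the span.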

Circuit Breaking for Resilience

A slow upstream service can cascade into a system-wide outage if callers keep queuing requests. A circuit breaker opens when failure rates exceed a threshold, failing fast and giving the upstream time to recover:

import (
    "log/slog"
    "time"

    "github.com/sony/gobreaker/v2"
    "google.golang.org/grpc"

    pb "github.com/org/services/catalogue/v1"
)

type resilientCatalogueClient struct {
    inner pb.CatalogueServiceClient
    cb    *gobreaker.CircuitBreaker[*pb.Product]
}

func newResilientClient(conn *grpc.ClientConn) *resilientCatalogueClient {
    cb := gobreaker.NewCircuitBreaker[*pb.Product](gobreaker.Settings{
        Name: "catalogue-service",
        ReadyToTrip: func(counts gobreaker.Counts) bool {
            // Open after >60% failure rate with at least 5 requests
            return counts.Requests >= 5 &&
                float64(counts.TotalFailures)/float64(counts.Requests) > 0.6
        },
        OnStateChange: func(name string, from, to gobreaker.State) {
            slog.Info("circuit breaker state change",
                "service", name, "from", from, "to", to)
        },
        Timeout: 30 * time.Second,
    })

    return &resilientCatalogueClient{
        inner: pb.NewCatalogueServiceClient(conn),
        cb:    cb,
    }
}

func (c *resilientCatalogueClient) GetProduct(
    ctx context.Context,
    req *pb.GetProductRequest,
) (*pb.Product, error) {
    return c.cb.Execute(func() (*pb.Product, error) {
        return c.inner.GetProduct(ctx, req)
    })
}

Service Discovery with Consul

In a dynamic container environment, service IP addresses change constantly. Register services with Consul on startup and use gRPC's native name resolver to route to healthy instances:

func registerWithConsul(port int) error {
    client, err := api.NewClient(api.DefaultConfig())
    if err != nil {
        return fmt.Errorf("consul client: %w", err)
    }

    check := &api.AgentServiceCheck{
        // Consul probes the standard gRPC health-check service at this address.
        GRPC:                           fmt.Sprintf("host.docker.internal:%d", port),
        Interval:                       "10s",
        Timeout:                        "2s",
        DeregisterCriticalServiceAfter: "30s",
    }

    return client.Agent().ServiceRegister(&api.AgentServiceRegistration{
        ID:    fmt.Sprintf("catalogue-service-%d", port),
        Name:  "catalogue-service",
        Port:  port,
        Tags:  []string{"grpc", "v1"},
        Check: check,
    })
}

On the client side, use consul://catalogue-service as the dial target with a Consul resolver plugin — gRPC's client-side load balancing distributes calls across all healthy replicas automatically.
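Wiring that up on the client requires a resolver that speaks Consul; one widely used third-party option is github.com/mbobakov/grpc-consul-resolver. The target syntax below is that plugin's, so treat the details as an assumption to verify against its documentation:

```go
import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    // Blank import registers the "consul" scheme with gRPC's resolver registry.
    _ "github.com/mbobakov/grpc-consul-resolver"
)

func dialCatalogue() (*grpc.ClientConn, error) {
    return grpc.NewClient(
        // Consul agent address, then the service name; healthy=true filters
        // out instances that are failing their health checks.
        "consul://consul.service.local:8500/catalogue-service?healthy=true",
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        // Spread calls across all resolved replicas instead of pinning to one.
        grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"round_robin"}`),
    )
}
```

Without the round_robin policy, gRPC's default pick_first balancer sends every request to a single backend, which defeats the point of service discovery.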

Observability: Metrics, Tracing, and Logging

Distributed systems are opaque without instrumentation. The Go ecosystem has excellent OpenTelemetry support:

func initTelemetry(ctx context.Context, serviceName string) (func(), error) {
    // OTLP exporter → Grafana Tempo / Jaeger / Azure Monitor
    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithEndpoint("otel-collector:4317"),
        otlptracegrpc.WithInsecure(),
    )
    if err != nil {
        return nil, err
    }

    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),
        sdktrace.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceName(serviceName),
            semconv.ServiceVersion("v2.4.1"),
        )),
    )

    otel.SetTracerProvider(tp)
    otel.SetTextMapPropagator(propagation.TraceContext{})

    // Prometheus metrics via grpc-go interceptors
    grpcmetrics.EnableHandlingTimeHistogram()

    return func() { _ = tp.Shutdown(ctx) }, nil
}

Add structured logging with log/slog (standard library since Go 1.21) and you have a full observability stack without reaching for heavy third-party frameworks.

Benchmarks: What You're Getting

On a single c5.xlarge instance (4 vCPU, 8GB RAM), a Go/gRPC service comfortably handles:

  • 50,000+ requests/second for simple CRUD operations
  • p99 latency under 5ms with database connection pooling
  • ~20MB RSS memory for the service process itself
  • <10ms cold start in a containerised environment

The equivalent Node.js/Express service serving JSON typically achieves 8,000–12,000 requests/second at comparable p99 latency. The throughput difference comes almost entirely from Protobuf's cheaper serialisation and Go's more efficient scheduler.

gRPC's streaming support also unlocks patterns JSON REST simply can't match efficiently — real-time product availability updates via server push, bidirectional chat, and large data exports without buffering entire result sets in memory.
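The bidirectional case is a one-keyword change in the contract: mark both directions as stream. A hypothetical chat service (not part of the catalogue proto above) looks like this:

```protobuf
service ChatService {
  // Both sides send independently over one HTTP/2 connection: the client
  // streams outgoing messages while the server pushes incoming ones.
  rpc Chat(stream ChatMessage) returns (stream ChatMessage);
}

message ChatMessage {
  string sender_id = 1;
  string body = 2;
}
```

The generated Go server handler receives a single stream object with both Send and Recv methods, so each direction can be driven from its own goroutine.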

Build the service once with the right foundation, and you'll rarely need to revisit it for performance reasons.

Want to work together?

We build high-performance web applications and backend systems.

Get in touch