Cloud · 2025-04-20 · 9 min read

Serverless Architecture in 2025: When to Use It and How to Avoid the Pitfalls

A practical guide to serverless—the workloads it's genuinely great for, the situations where it costs more than it saves, and a real case study achieving 62% cost reduction and 99.99% uptime.

Serverless · FaaS · Cloud Architecture · Azure Functions · Cost Optimisation

Serverless: What It Is and What It Isn't

"Serverless" is a misleading name. Servers absolutely exist — you're just not responsible for provisioning, patching, or scaling them. The model is: write a function, define a trigger (HTTP request, queue message, scheduled time, database change), and the cloud provider handles everything else. You pay only for execution time, not idle compute.

The result is a fundamentally different cost structure: a service that handles 1,000 requests per month costs almost nothing. A service that handles 10 million costs proportionally more — without a fixed floor. For the right workloads, this is genuinely transformative.
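That cost curve is easy to sketch. The helper below models pay-per-use pricing with illustrative round numbers (per-GB-second and per-million-request prices are assumptions for the example, not a quote of any provider's current rates):

```typescript
// Illustrative pay-per-use cost model. Prices are round numbers chosen
// for the sketch, not current list prices of any provider.
const PRICE_PER_GB_SECOND = 0.0000166667; // compute price (assumed)
const PRICE_PER_MILLION_REQUESTS = 0.2;   // request price (assumed)

function monthlyFaasCost(
  requests: number,
  avgDurationMs: number,
  memoryGb: number,
): number {
  const gbSeconds = requests * (avgDurationMs / 1000) * memoryGb;
  const computeCost = gbSeconds * PRICE_PER_GB_SECOND;
  const requestCost = (requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS;
  return computeCost + requestCost; // no fixed floor: zero usage costs zero
}
```

At 100ms per invocation and 512MB, a thousand requests a month is a fraction of a cent, while ten million is a few dollars: cost tracks usage linearly instead of paying for idle peak capacity.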

Where Serverless Wins

Variable and Spiky Traffic

Traditional infrastructure is sized for peak load. If your peak is 10× your average, you're running 10× the necessary compute 90% of the time. Serverless scales from zero to peak and back automatically — you pay only for the actual load.

This makes it ideal for:

  • E-commerce: Traffic spikes during promotions and evenings; near-zero on Tuesday mornings
  • Event processing: Batch jobs that run nightly or in response to uploads
  • Webhooks: Low average volume but must handle bursts without queuing delays

New Products and Side Projects

Serverless eliminates the infrastructure bootstrap tax. No server to provision, no auto-scaling group to configure, no load balancer to set up. A new API is live in minutes from a function file and an HTTP trigger.
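The whole service really can be one file. A minimal handler might look like the sketch below; the event shape follows the common HTTP-proxy pattern (a simplified assumption, not tied to a specific provider's full event schema):

```typescript
// Minimal HTTP-triggered function: parse the request body and respond.
// The event type is a simplified sketch of an HTTP-proxy event.
export const handler = async (event: { body?: string }) => {
  const payload = event.body ? JSON.parse(event.body) : {};
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ received: payload }),
  };
};
```

Point an HTTP trigger at this handler and the API is live: no server, load balancer, or scaling group to configure first.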

Functions in an Otherwise Containerised Architecture

Not every microservice needs to be a container. A notification service that sends emails based on order events, a thumbnail generator triggered by image uploads, or a scheduled data export job — these are natural serverless workloads even in a primarily Kubernetes environment.

Where Serverless Struggles

Latency-Sensitive Services

Cold starts are the most-cited serverless limitation, and they're real. When a function hasn't run recently, the runtime initialises from scratch. For .NET functions, this can be 800ms–2s; for Node.js, 200–400ms; for Rust compiled to WebAssembly on isolate-based edge runtimes, under 5ms.

Mitigations:

  • Provisioned concurrency (AWS) or pre-warmed instances (Azure) keep a baseline warm, but that capacity is billed continuously — you give up the scale-to-zero cost benefit
  • Choose runtimes with fast cold starts: Go, Node.js, and Rust consistently outperform Java/.NET for cold-start latency
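The first mitigation is a one-liner in CDK. This sketch assumes an existing `lambda.Function` named `orderFn` (like the one in the stack later in this article) and keeps two instances warm on an alias:

```typescript
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Sketch: provisioned concurrency on an alias keeps a warm baseline.
// Assumes `orderFn` is an existing lambda.Function. Note the warm
// capacity is billed whether or not it serves traffic, so the function
// no longer scales to zero.
declare const orderFn: lambda.Function;

const liveAlias = orderFn.addAlias('live', {
  provisionedConcurrentExecutions: 2, // baseline of always-warm instances
});
```

Route traffic at the `live` alias rather than `$LATEST` so the warm pool actually serves requests.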

For a checkout API where p99 < 100ms is a requirement, a containerised service with predictable warm instances is the better choice.

Long-Running Processes

Most FaaS platforms cap execution at 15 minutes (AWS Lambda) or 10 minutes (Azure Functions consumption plan). File processing, ML inference, and video encoding typically need longer. Use dedicated compute here.

Stateful Workloads

Serverless functions are stateless by design. Any state must live in external storage: Redis, DynamoDB, Cosmos DB. If your workload requires complex in-memory state that's expensive to reload on each invocation, the overhead erodes the performance benefit.
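The standard pattern is to treat any module-scope cache as a warm-path optimisation only — it survives between warm invocations but vanishes on every cold start. A sketch (where `fetchFromStore` is a hypothetical stand-in for a real Redis or DynamoDB read):

```typescript
// Durable state lives in an external store; the module-scope cache is
// only reused across warm invocations and can vanish at any time.
const warmCache = new Map<string, string>();

// Hypothetical stand-in for a network read from Redis/DynamoDB.
async function fetchFromStore(key: string): Promise<string> {
  return `value-for-${key}`;
}

export async function getConfig(key: string): Promise<string> {
  const cached = warmCache.get(key);
  if (cached !== undefined) return cached; // warm invocation: skip the round trip
  const value = await fetchFromStore(key); // cold start or cache miss
  warmCache.set(key, value);
  return value;
}
```

If the state you'd have to reload per cold start is large — a big in-memory index, a loaded model — this reload cost is exactly the overhead that erodes the benefit.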

A Production Serverless Architecture

Here's how we build a complete serverless API using AWS CDK — infrastructure as code that provisions everything in a single command:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as sqs from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';

export class OrderApiStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const ordersTable = new dynamodb.Table(this, 'Orders', {
      partitionKey: { name: 'orderId', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,  // True serverless pricing
      pointInTimeRecovery: true,
    });

    const dlq = new sqs.Queue(this, 'OrderDlq', {
      retentionPeriod: cdk.Duration.days(14),
    });

    const orderFn = new lambda.Function(this, 'OrderFunction', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist'),
      memorySize: 512,
      timeout: cdk.Duration.seconds(10),
      environment: {
        TABLE_NAME: ordersTable.tableName,
        POWERTOOLS_SERVICE_NAME: 'order-api',
        LOG_LEVEL: 'INFO',
      },
      // Failed asynchronous invocations land in the dead-letter queue
      deadLetterQueue: dlq,
    });

    ordersTable.grantReadWriteData(orderFn);

    const api = new apigateway.RestApi(this, 'OrderApi', {
      deployOptions: {
        throttlingBurstLimit: 500,
        throttlingRateLimit: 1000,
      },
    });

    const orders = api.root.addResource('orders');
    orders.addMethod('POST', new apigateway.LambdaIntegration(orderFn));
    orders.addResource('{id}').addMethod('GET', new apigateway.LambdaIntegration(orderFn));
  }
}

A few production patterns worth highlighting: DynamoDB with PAY_PER_REQUEST billing — no capacity planning, true serverless pricing. A dead-letter queue captures every failed invocation for investigation. Memory at 512MB (Lambda CPU allocation scales linearly with memory — often the fastest way to improve cold start and throughput).

Case Study: Order Processing System Redesign

A retail client's order processing monolith was failing during promotions — queuing orders for minutes during flash sales, causing cart abandonment and a flood of customer service contacts.

We decomposed the monolith into five serverless functions:

  1. Order validation — triggered by HTTP POST, validates stock and customer data
  2. Payment processing — triggered by SQS queue from step 1
  3. Fulfilment notification — triggered by SQS queue from step 2
  4. Email confirmation — triggered by EventBridge from step 3
  5. Analytics ingest — triggered by DynamoDB Streams, eventually consistent

Each function scales independently. During Black Friday (40× normal traffic), only the validation and payment functions spiked — the others processed their queues at their own pace without affecting the user-facing flow.
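The queue-consumer functions in that chain share a common shape: process each record independently and report only the failures back, so a retry re-drives just the failed messages instead of the whole batch. A sketch of that pattern (the event types are simplified, and `processOrder` is a hypothetical stand-in for the real payment logic):

```typescript
// Sketch of a queue consumer with partial-batch failure reporting:
// only failed message IDs are returned for retry.
type SqsRecord = { messageId: string; body: string };
type SqsEvent = { Records: SqsRecord[] };

// Hypothetical stand-in for the real per-order work.
async function processOrder(order: unknown): Promise<void> {}

export const handler = async (event: SqsEvent) => {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    try {
      await processOrder(JSON.parse(record.body)); // bad JSON or work failure
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures }; // empty array = entire batch succeeded
};
```

Messages that keep failing eventually age out to the dead-letter queue instead of blocking the flow.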

Results after 90 days of production operation:

| Metric              | Before (Monolith) | After (Serverless) |
| ------------------- | ----------------- | ------------------ |
| Operational cost    | baseline          | −62%               |
| Uptime              | 98.5%             | 99.99%             |
| Peak handling       | ❌ crashes at 5×   | ✅ handles 40×      |
| Mean time to deploy | 2 hours           | 8 minutes          |

The 62% cost reduction came from eliminating idle compute. The previous system ran EC2 instances 24/7 sized for peak. The serverless system costs nothing at 3am when no orders are coming in.

Observability: Don't Ship Serverless Without It

Distributed serverless systems are harder to debug than monoliths. Essential tooling:

  • AWS Lambda Powertools (Node.js/Python) or Azure Application Insights — structured logging, tracing, and metrics with minimal boilerplate
  • Correlate every log with a trace ID — propagate it across queue messages so you can trace a single order through all five functions
  • Alert on dead-letter queue depth — a growing DLQ is always a signal something is failing silently
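Trace-ID propagation is mostly a matter of convention: generate the ID once at the edge, then copy it into every message you publish. A sketch (the `traceId` attribute name and message shape are our assumed convention, not a standard):

```typescript
// Sketch: carry one correlation ID through the whole chain by stamping
// it onto each outgoing queue message. The attribute name `traceId`
// and the message shape are assumptions for this example.
import { randomUUID } from 'node:crypto';

type QueueMessage = {
  body: string;
  attributes: Record<string, string>;
};

export function buildOrderMessage(
  order: object,
  incomingTraceId?: string,
): QueueMessage {
  const traceId = incomingTraceId ?? randomUUID(); // reuse or start a trace
  return {
    body: JSON.stringify(order),
    attributes: { traceId }, // downstream consumers log this same ID
  };
}
```

Each downstream function reads the attribute, includes it in its structured logs, and passes it along — so one order is searchable end to end across all five functions.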

Serverless isn't a simpler operations model — it's a different operations model. Invest in observability before your first production incident, not after.
