Why Event-Driven Architecture Wins at Scale
Traditional REST-first microservices create hidden coupling: Service A calls Service B synchronously, which calls Service C, and suddenly you have a distributed monolith where a slow database query in Service C causes timeouts in Service A's checkout flow.
Event-driven architecture breaks this chain. Services emit events—"something happened"—and other services react asynchronously. No service knows or cares what subscribes to its events. The result is a system that degrades gracefully, scales independently, and evolves without cross-team coordination overhead.
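The decoupling claim can be made concrete with a minimal in-process sketch (a hypothetical EventBus type, not an Azure API): the publisher never references its subscribers, and zero subscribers is a valid state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal bus: publishers emit events by type;
// subscribers register handlers without the publisher ever knowing them.
public sealed class EventBus
{
    private readonly Dictionary<Type, List<Action<object>>> _handlers = new();

    public void Subscribe<T>(Action<T> handler) where T : class
    {
        if (!_handlers.TryGetValue(typeof(T), out var list))
            _handlers[typeof(T)] = list = new List<Action<object>>();
        list.Add(e => handler((T)e));
    }

    public void Publish<T>(T @event) where T : class
    {
        // No subscribers is fine — the publisher neither knows nor cares.
        if (_handlers.TryGetValue(typeof(T), out var list))
            foreach (var handler in list) handler(@event);
    }
}

public sealed record OrderPlaced(Guid OrderId);

// Usage: fulfillment and notifications react independently.
// var bus = new EventBus();
// bus.Subscribe<OrderPlaced>(e => Console.WriteLine($"Allocating stock for {e.OrderId}"));
// bus.Subscribe<OrderPlaced>(e => Console.WriteLine($"Emailing confirmation for {e.OrderId}"));
// bus.Publish(new OrderPlaced(Guid.NewGuid()));
```

Service Bus plays the same role across process boundaries, with durability and retries added.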
Azure's Messaging Trifecta
Azure offers three messaging primitives, each with a distinct role:
| Service | Model | Best For |
|---|---|---|
| Service Bus | Queues + Topics | Reliable business message delivery, FIFO, dead-lettering |
| Event Grid | Pub/Sub routing | Reactive event routing, Azure resource events, webhooks |
| Event Hubs | Streaming | High-throughput telemetry, log aggregation, stream processing |
For most business-process event-driven systems, Service Bus Topics with Subscriptions are the right starting point. Event Hubs becomes relevant when you're ingesting millions of events per second (IoT, clickstream). Event Grid shines for reacting to Azure infrastructure events.
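As a sketch of that starting point, a topic and a per-consumer subscription can be provisioned with ServiceBusAdministrationClient. The names here ("orders.placed", "fulfillment-sub") and the connection-string environment variable are assumptions, and most teams would do this in Bicep or Terraform rather than application code.

```csharp
using System;
using Azure.Messaging.ServiceBus.Administration;

// Sketch only: topic/subscription names and the env var are assumptions.
var connectionString = Environment.GetEnvironmentVariable("SERVICEBUS_CONNECTION")!;
var admin = new ServiceBusAdministrationClient(connectionString);

if (!(await admin.TopicExistsAsync("orders.placed")).Value)
    await admin.CreateTopicAsync("orders.placed");

// Each consuming service gets its own subscription on the topic.
if (!(await admin.SubscriptionExistsAsync("orders.placed", "fulfillment-sub")).Value)
    await admin.CreateSubscriptionAsync(
        new CreateSubscriptionOptions("orders.placed", "fulfillment-sub")
        {
            MaxDeliveryCount = 5,                    // then dead-letter
            DeadLetteringOnMessageExpiration = true
        });
```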
Designing Domain Events
Domain events capture meaningful changes in your business model. Three rules make them effective:
- Named in past tense — OrderPlaced, not PlaceOrder
- Immutable and self-contained — everything a consumer needs to react is in the event itself
- Versioned — add a SchemaVersion field from day one; you will change the schema
public sealed record OrderPlacedEvent
{
    // These have sensible defaults, so they are not marked required
    // (a required property would force callers to overwrite the default).
    public Guid EventId { get; init; } = Guid.NewGuid();
    public int SchemaVersion { get; init; } = 1;
    public DateTimeOffset OccurredAt { get; init; } = DateTimeOffset.UtcNow;

    public required Guid OrderId { get; init; }
    public required string CustomerId { get; init; }
    public required IReadOnlyList<OrderLineItem> Items { get; init; }
    public required decimal TotalAmount { get; init; }
}
Using a sealed record gives you value equality and immutability; the required and init-only modifiers make the compiler enforce that every event is fully populated at construction.
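That SchemaVersion field pays off on the consumer side. Below is a hedged sketch of version-tolerant deserialization; the v2 upgrade rule (defaulting a Currency field) is invented purely for illustration.

```csharp
using System.Text.Json.Nodes;

// Sketch: peek at SchemaVersion before binding to the current contract.
// The v1 -> v2 upgrade rule (defaulting Currency) is hypothetical.
public static class OrderPlacedEventReader
{
    public static JsonNode Upgrade(string payload)
    {
        var node = JsonNode.Parse(payload)!;
        int version = node["SchemaVersion"]?.GetValue<int>() ?? 1;

        if (version < 2)
        {
            node["Currency"] = "USD";  // hypothetical field added in v2
            node["SchemaVersion"] = 2;
        }
        return node; // now safe to deserialize into the current event type
    }
}
```

Upgrading old payloads at the edge keeps version branching out of your business logic.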
Publishing Events: The Outbox Pattern
The most common event-driven bug is a partial failure: the order is saved to the database, but the event publish fails—or worse, the event is published before the database transaction commits.
The outbox pattern solves this by making event publishing part of the database transaction:
public class OutboxEventPublisher(AppDbContext db) : IEventPublisher
{
    public Task PublishAsync<T>(T @event, CancellationToken ct = default)
        where T : class
    {
        // Stored atomically with the business data in the same transaction
        db.OutboxMessages.Add(new OutboxMessage
        {
            Id = Guid.NewGuid(),
            EventType = typeof(T).AssemblyQualifiedName!,
            Payload = JsonSerializer.Serialize(@event),
            CreatedAt = DateTimeOffset.UtcNow,
            Status = OutboxStatus.Pending
        });
        // Caller calls SaveChangesAsync() — outbox row commits with the business data
        return Task.CompletedTask;
    }
}
A background worker (or Azure Function on a timer) reads pending outbox messages and publishes them to Service Bus:
public class OutboxProcessor(IServiceScopeFactory scopeFactory, ServiceBusSender sender)
    : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            // AppDbContext is scoped; a BackgroundService is a singleton,
            // so resolve the context from a fresh scope each iteration.
            using var scope = scopeFactory.CreateScope();
            var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();

            var pending = await db.OutboxMessages
                .Where(m => m.Status == OutboxStatus.Pending)
                .OrderBy(m => m.CreatedAt)
                .Take(20)
                .ToListAsync(ct);

            foreach (var msg in pending)
            {
                await sender.SendMessageAsync(new ServiceBusMessage(msg.Payload)
                {
                    ContentType = "application/json",
                    Subject = msg.EventType,
                    MessageId = msg.Id.ToString()
                }, ct);
                msg.Status = OutboxStatus.Published;
                msg.PublishedAt = DateTimeOffset.UtcNow;
            }
            await db.SaveChangesAsync(ct);
            await Task.Delay(TimeSpan.FromSeconds(5), ct);
        }
    }
}
This pattern guarantees at-least-once delivery even if the application crashes mid-publish; combined with idempotent consumers, that becomes effectively exactly-once processing.
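Idempotency is what upgrades at-least-once delivery into effectively-once processing. A minimal sketch of a deduplicating consumer follows; the in-memory set is illustrative only — a real consumer would key a database table by EventId inside its own transaction.

```csharp
using System;
using System.Collections.Concurrent;

// Sketch: the consumer records each EventId it has processed and skips repeats.
// In production this set lives in the consumer's database, committed in the
// same transaction as its side effects; the dictionary here is illustrative.
public sealed class IdempotentHandler
{
    private readonly ConcurrentDictionary<Guid, bool> _processed = new();
    public int HandledCount { get; private set; }

    public void Handle(Guid eventId)
    {
        // TryAdd is atomic: a redelivered duplicate is silently dropped.
        if (!_processed.TryAdd(eventId, true))
            return;
        HandledCount++;  // the real side effect (allocate stock, etc.) goes here
    }
}
```

This is why the outbox processor sets MessageId from the outbox row: a stable ID gives consumers something to deduplicate on.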
Consuming Events with Azure Functions
Azure Functions with Service Bus triggers are a natural fit for event consumers—serverless, auto-scaling, and with built-in retry and dead-letter handling:
public class OrderFulfillmentHandler(IInventoryService inventory, INotificationService notify)
{
[Function("HandleOrderPlaced")]
public async Task RunAsync(
[ServiceBusTrigger("orders.placed", "fulfillment-sub",
Connection = "ServiceBusConnection")]
ServiceBusReceivedMessage message,
ServiceBusMessageActions actions)
{
        var @event = message.Body.ToObjectFromJson<OrderPlacedEvent>()!;
try
{
await inventory.AllocateStockAsync(@event.Items);
await notify.SendOrderConfirmationAsync(@event.CustomerId, @event.OrderId);
// Explicitly complete — message removed from queue
await actions.CompleteMessageAsync(message);
}
catch (InsufficientStockException ex)
{
// Business-logic failure: dead-letter with reason, don't retry
await actions.DeadLetterMessageAsync(message,
deadLetterReason: "InsufficientStock",
deadLetterErrorDescription: ex.Message);
}
// Transient failures (network, DB unavailable) throw and trigger automatic retry
}
}
The distinction between dead-lettering and throwing matters. Dead-letter for business failures you've handled. Throw for transient failures you want the Service Bus retry policy to handle.
Retry Policies and Resilience
Configure Service Bus retry at the client level and layer Polly for downstream HTTP calls:
// Service Bus client — built-in retry with exponential backoff
services.AddSingleton(_ =>
new ServiceBusClient(
connectionString,
new ServiceBusClientOptions
{
RetryOptions = new ServiceBusRetryOptions
{
Mode = ServiceBusRetryMode.Exponential,
MaxRetries = 5,
Delay = TimeSpan.FromSeconds(1),
MaxDelay = TimeSpan.FromSeconds(60)
}
}));
// Polly pipeline for HTTP calls inside consumers
services.AddHttpClient<IInventoryClient, InventoryClient>()
.AddResilienceHandler("inventory", builder =>
{
builder.AddRetry(new HttpRetryStrategyOptions
{
BackoffType = DelayBackoffType.Exponential,
MaxRetryAttempts = 3
});
builder.AddCircuitBreaker(new HttpCircuitBreakerStrategyOptions
{
FailureRatio = 0.5,
SamplingDuration = TimeSpan.FromSeconds(30),
BreakDuration = TimeSpan.FromSeconds(15)
});
builder.AddTimeout(TimeSpan.FromSeconds(10));
});
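As a sanity check on the retry settings above, the nominal exponential delays double per attempt until the cap; this sketch ignores the jitter real retry policies add.

```csharp
using System;

// Sketch: nominal exponential backoff with a cap; real policies add jitter.
public static class Backoff
{
    public static TimeSpan Delay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        // baseDelay * 2^attempt, clamped to the configured maximum
        var nominal = TimeSpan.FromTicks(baseDelay.Ticks << attempt);
        return nominal < maxDelay ? nominal : maxDelay;
    }
}

// With Delay = 1s and MaxDelay = 60s (the options above), attempts 0..5
// nominally wait 1s, 2s, 4s, 8s, 16s, 32s; attempt 6 would clamp to 60s.
```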
Distributed Tracing with OpenTelemetry
Debugging event-driven systems without tracing is nearly impossible — a failed order could have touched six services before the error surfaces. Wire up OpenTelemetry from the start:
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSource("Azure.Messaging.ServiceBus") // collect the SDK's messaging spans
        .AddAzureMonitorTraceExporter());        // Ships to Application Insights
Every Service Bus message automatically propagates the trace context when you use the Azure.Messaging.ServiceBus SDK — no manual correlation IDs needed. In Azure Monitor, you'll see the complete end-to-end call graph from HTTP request through every event handler.
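Under the hood, the propagated context is a W3C traceparent value carried in each message's Diagnostic-Id application property. A small sketch of its shape — the header values below are illustrative, not real trace IDs:

```csharp
using System;

// W3C traceparent format: version-traceid-spanid-flags.
// The Service Bus SDK stamps this into the Diagnostic-Id application
// property; consumer-side instrumentation reads it back automatically.
public static class Traceparent
{
    public static (string TraceId, string SpanId) Parse(string header)
    {
        var parts = header.Split('-');
        if (parts.Length != 4 || parts[1].Length != 32 || parts[2].Length != 16)
            throw new FormatException("Not a valid traceparent header.");
        return (parts[1], parts[2]);
    }
}

// Example (illustrative values):
// Traceparent.Parse("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
//   yields TraceId "0af7651916cd43dd8448eb211c80319c" and SpanId "b7ad6b7169203331"
```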
When NOT to Use Event-Driven
Event-driven architecture adds operational complexity. Don't reach for it by default:
- Synchronous queries — if you need a response immediately (e.g., "is this username taken?"), use HTTP
- Simple CRUD services — events solve coordination problems between services; within a single service, they add noise
- Small teams / low scale — the overhead of outbox tables, dead-letter queues, and schema versioning isn't worth it until you have multiple services that genuinely need decoupling
The pattern shines when you have independent bounded contexts that change at different rates and need to communicate without tight coupling. Start with REST, extract events when the coupling pain becomes real.