The Problem No One Sees Until It's Too Late
Database performance rarely degrades all at once. It creeps up gradually: a query that took 80ms at launch takes 300ms after six months of data growth, then 900ms a year later. By the time users notice slowness, the problem is already severe.
We see this pattern repeatedly at MediaFront. The good news: most database performance issues stem from the same handful of root causes, and fixing them systematically delivers dramatic improvements without hardware upgrades.
Start With Measurement, Not Guesses
Every optimisation project begins the same way: instrument before you optimise. Guessing which query is slow wastes weeks. Your tools:
- PostgreSQL — the pg_stat_statements extension tracks cumulative query execution time across all calls
- MySQL — the slow_query_log with long_query_time = 0.1 captures everything above 100ms
- SQL Server — Query Store provides a GUI over the same data
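If pg_stat_statements isn't already enabled, setup is a config change plus an extension. A minimal sketch (the preload setting requires a server restart):

-- In postgresql.conf (restart required):
--   shared_preload_libraries = 'pg_stat_statements'
-- Then, in the target database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;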
-- Find your 10 most expensive queries in PostgreSQL
SELECT
    query,
    calls,
    round(total_exec_time::numeric, 2) AS total_ms,
    round(mean_exec_time::numeric, 2) AS avg_ms,
    round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Fix the queries at the top of this list first. A 10× improvement on a query called 10,000 times per day is worth far more than a 100× improvement on one called 20 times.
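Illustrative arithmetic: a 100ms query called 10,000 times a day consumes roughly 17 minutes of database time, so a 10× speedup recovers about 15 of them. A 1-second query called 20 times consumes 20 seconds total, so even a 100× speedup recovers under 20 seconds.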
Query Optimisation: The High-ROI Fixes
Always Run EXPLAIN ANALYZE
Before touching an index or schema, understand how the database executes the query:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id) AS order_count,
    SUM(o.total_amount) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.signup_date > '2024-01-01'
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
LIMIT 100;
Red flags in the output:
- Seq Scan on a large table — almost always an index opportunity
- Hash Join with large Batches — memory pressure; increase work_mem or add an index
- High Buffers: shared read — data not in cache; consider a covering index
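For the memory-pressure case, work_mem can be raised for a single session or transaction before reaching for a global setting change. A minimal sketch:

-- Session-scoped: the global default stays untouched
SET work_mem = '64MB';

-- Or scope it to a single transaction
BEGIN;
SET LOCAL work_mem = '64MB';
-- ... run the hash-join-heavy query here ...
COMMIT;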
Stop Using SELECT *
This single habit kills performance in multiple ways: it fetches columns you'll never use, prevents covering-index optimisation, and bloats network payloads.
-- ❌ Fetches all 22 columns including large TEXT fields
SELECT * FROM products WHERE category_id = 5;
-- ✅ Fetches exactly what the UI renders
SELECT product_id, name, price, stock_qty
FROM products
WHERE category_id = 5;
Rewrite Correlated Subqueries as Joins
Correlated subqueries execute once per row of the outer query. On a table with 500,000 rows, a correlated subquery that takes 0.5ms runs for 250 seconds total.
-- ❌ Correlated subquery — executes once per outer row
SELECT name,
       (SELECT MAX(order_date) FROM orders WHERE customer_id = c.id) AS last_order
FROM customers c;

-- ✅ Single-pass join
SELECT c.name, o.last_order
FROM customers c
LEFT JOIN (
    SELECT customer_id, MAX(order_date) AS last_order
    FROM orders
    GROUP BY customer_id
) o ON c.id = o.customer_id;
Indexing Strategy
Indexes are the single highest-leverage tool in your optimisation arsenal—but only when used intentionally.
The Index Decision Framework
Create an index when:
- The column appears in a WHERE, JOIN ON, or ORDER BY clause
- The column has high cardinality (many distinct values — see the quick check after these lists)
- The table has more than ~10,000 rows
- The read/write ratio for this table favours reads
Avoid indexes when:
- The table is write-heavy (each index slows every INSERT/UPDATE/DELETE)
- The column has very low cardinality (e.g., a boolean is_active)
- The query planner already uses a more selective index
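A quick cardinality sanity check before creating an index (a sketch; products and category_id stand in for your own table and column):

-- Ratio near 1.0 = high cardinality (strong candidate);
-- near 0.0 = low cardinality (weak candidate)
SELECT COUNT(DISTINCT category_id)::numeric
     / NULLIF(COUNT(*), 0) AS selectivity
FROM products;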
Covering Indexes Eliminate Table Lookups
A covering index includes all columns a query needs, so PostgreSQL never touches the main table:
-- Query: find a customer's recent orders by status
SELECT order_id, order_date, total_amount
FROM orders
WHERE customer_id = 12345
AND status = 'pending'
ORDER BY order_date DESC;
-- Covering index — all queried columns live in the index
CREATE INDEX idx_orders_customer_status_date_amount
ON orders (customer_id, status, order_date DESC)
INCLUDE (total_amount);
The INCLUDE clause (PostgreSQL 11+) adds total_amount to leaf pages without affecting sort order, making this a true zero-heap-access query.
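To verify, re-run EXPLAIN and look for an Index Only Scan with zero heap fetches (heap fetches can stay above zero until VACUUM updates the visibility map):

EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, order_date, total_amount
FROM orders
WHERE customer_id = 12345
  AND status = 'pending'
ORDER BY order_date DESC;
-- Expect: Index Only Scan using idx_orders_customer_status_date_amount
--         Heap Fetches: 0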
Partial Indexes for Selective Conditions
When you only ever query a subset of rows, index only that subset:
-- Only active, unfulfilled orders matter for the operations dashboard
CREATE INDEX idx_orders_unfulfilled
ON orders (customer_id, created_at)
WHERE status IN ('pending', 'processing');
This index is a fraction of the size of a full index on those columns, making it faster to build, cheaper to maintain, and more likely to stay in memory.
Prune Dead Indexes
Unused indexes are pure cost: they consume disk, slow writes, and confuse the planner. Find them:
SELECT
    schemaname,
    relname AS tablename,
    indexrelname AS indexname,
    idx_scan AS scans_since_restart
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY schemaname, relname;
Drop any index that shows zero scans, provided it wasn't created recently and the counters cover a representative window: they reset on every server restart, and an index used only by a monthly job will look idle for weeks.
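In PostgreSQL, drop with CONCURRENTLY so the table isn't locked against writes while the index is removed (it cannot run inside a transaction block; the index name here is hypothetical):

DROP INDEX CONCURRENTLY idx_orders_legacy_status;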
Caching: The Force Multiplier
Once queries are fast, caching keeps frequently-read data out of the database entirely.
Application-Level Caching with Redis
public async Task<Product?> GetProductAsync(int id)
{
    var cacheKey = $"product:{id}";

    // Try the cache first
    var cached = await _cache.GetStringAsync(cacheKey);
    if (cached is not null)
        return JsonSerializer.Deserialize<Product>(cached);

    // Cache miss — load from the database without change tracking
    var product = await _db.Products
        .AsNoTracking()
        .Include(p => p.Category)
        .FirstOrDefaultAsync(p => p.Id == id);

    // Only cache real hits; a 15-minute TTL bounds staleness
    if (product is not null)
    {
        await _cache.SetStringAsync(
            cacheKey,
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
            });
    }

    return product;
}
Cache TTL is the hardest decision. Our rule of thumb: start at half the expected staleness tolerance (if product data may be up to 30 minutes stale, start with the 15-minute TTL above), then tune based on cache-hit ratio monitoring.
Materialised Views for Analytics
Complex analytical queries over large tables should never run live against OLTP data. Materialise the result:
CREATE MATERIALIZED VIEW mv_monthly_revenue AS
SELECT
    DATE_TRUNC('month', order_date) AS month,
    product_category,
    SUM(amount) AS revenue,
    COUNT(*) AS orders
FROM orders o
JOIN products p ON o.product_id = p.id
GROUP BY 1, 2;

-- CONCURRENTLY (below) requires a unique index on the view
CREATE UNIQUE INDEX ON mv_monthly_revenue (month, product_category);

-- Refresh nightly (or on-demand after large data loads)
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_monthly_revenue;
CONCURRENTLY lets reads continue during the refresh — zero downtime, at the cost of requiring the unique index created above.
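One way to automate the nightly refresh, assuming the pg_cron extension is installed (an assumption; any external scheduler works equally well):

-- Refresh every night at 02:00
SELECT cron.schedule(
    'refresh-monthly-revenue',   -- job name (our choice)
    '0 2 * * *',                 -- standard cron syntax
    'REFRESH MATERIALIZED VIEW CONCURRENTLY mv_monthly_revenue'
);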
Real-World Results: E-Commerce Platform Optimisation
An e-commerce client came to us handling 50,000+ daily orders with page timeouts during flash sales. Our four-week audit and remediation:
| Action | Impact |
|---|---|
| Rewrote 15 highest-cost queries | −60% DB CPU usage |
| Added 8 strategic covering indexes | Avg query time: 1200ms → 85ms |
| Removed 5 unused indexes | Write throughput +18% |
| Added Redis for product catalogue | Cache hit rate 94% |
| Partitioned orders table by month | Flash sale timeouts eliminated |
The business outcome: a 24% increase in conversion rate—driven entirely by improved responsiveness. No new hardware was purchased.
Schema Design: Getting It Right Upstream
Optimising a poorly designed schema is fighting uphill. Principles that pay off long-term:
- Normalise transactional tables — eliminating update anomalies is worth the occasional join
- Denormalise for reporting — create dedicated summary tables or materialised views rather than forcing complex joins into every report query
- Partition large tables — partition orders and events tables by date range so old data never slows current queries (see the sketch after this list)
- Use appropriate data types — BIGINT for IDs that will exceed 2 billion, TIMESTAMPTZ for timestamps (always store UTC), NUMERIC(10,2) for money (never FLOAT)
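A minimal declarative-partitioning sketch for the orders case (PostgreSQL 10+; column names assumed from the earlier examples):

-- Parent table is partitioned by order date
CREATE TABLE orders (
    order_id     BIGINT NOT NULL,
    customer_id  BIGINT NOT NULL,
    order_date   TIMESTAMPTZ NOT NULL,
    total_amount NUMERIC(10,2) NOT NULL
) PARTITION BY RANGE (order_date);

-- One partition per month; queries filtered on recent dates
-- scan only the matching partitions
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');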
Database performance is a product of schema design, query patterns, and infrastructure working together. Fix all three, and you'll handle 10× the traffic on the same hardware.