The benchmark is bench_mixed_analytical under concurrent throughput
testing: 8 threads, 5 seconds, each thread repeatedly executing the same
CTE→join→aggregate→sort query. Measured as queries per second.
```sql
WITH order_summary AS (
    SELECT o.customer_id,
           SUM(o.quantity * p.price) AS total
    FROM orders o
    JOIN products p ON o.product_id = p.id
    GROUP BY o.customer_id
)
SELECT c.region, COUNT(*), SUM(os.total) AS revenue
FROM order_summary os
JOIN customers c ON os.customer_id = c.id
GROUP BY c.region
ORDER BY revenue DESC;
-- 8 threads, each running this query continuously for 5 seconds
```
Dashboard-style workloads poll the same analytical query repeatedly. Without caching, each poll re-executes the full pipeline: CTE materialization, hash join build, hash aggregation, sort, and wire serialization. Even with a fast columnar executor, throughput is bounded by the latency of a single execution.
Two sources of redundant work:
1. Identical queries re-execute the full pipeline. If the underlying tables have not changed, the result is the same. But the executor has no way to know this without re-running the query.
2. Hash join build-side reconstruction. For queries with joins, the build side (inner table) is scanned, hashed, and inserted into a hash table on every execution. When the inner table has not changed, this is wasted work.
Result cache (pgwire.c: rcache).
After executing a SELECT query and serializing the result to wire bytes, the
serialized bytes are stored in a hash map keyed by the SQL text. On the next
identical query, the cache checks whether any table referenced by the query
has been modified (via a generation counter on each table). If no table has
changed, the cached wire bytes are sent directly to the client—bypassing
parsing, planning, execution, and serialization entirely.
Join cache (table.c: join_cache).
The hash join build-side hash table is retained across queries. Each table
tracks a generation counter that increments on INSERT, UPDATE, or DELETE.
The join cache stores the build-side hash table along with the generation
counter at build time. On the next query, if the generation counter matches,
the cached hash table is reused without rebuilding.
Cache invalidation is exact: any mutation to any referenced table invalidates the result cache entry. There are no TTLs, no stale reads, and no configuration knobs.
Same query, same machine, 8 threads:
| Benchmark | mskql | PostgreSQL | Ratio |
|---|---|---|---|
| mixed_analytical QPS | 97,210 | 508 | 191× |
| p50 latency | 0.082ms | 15.7ms | — |
| wide_aggregate QPS | 97,008 | 1,055 | 92× |
| group_sum QPS | 97,210 | 14,305 | 6.8× |
At 97,210 QPS across 8 threads, each thread completes a query in roughly 82 microseconds (8 / 97,210 s, matching the 0.082 ms p50). The result cache turns any repeated read-only query into a hash lookup plus a socket write. The 191× ratio over PostgreSQL is the largest in the throughput benchmark suite.
The cache is most effective for read-heavy analytical dashboards where the same queries run repeatedly against slowly-changing data. For write-heavy workloads, frequent invalidation reduces the hit rate and the throughput advantage narrows (mixed read/write: 0.94× PostgreSQL).