The benchmark is bench_analytical_cte: a CTE pipeline with
joins, aggregation, and sorting. The query touches three tables through a
multi-stage plan. Measured as wall-clock latency through the PostgreSQL wire
protocol.
WITH order_summary AS (
SELECT o.customer_id,
SUM(o.quantity * p.price) AS total
FROM orders o
JOIN products p ON o.product_id = p.id
GROUP BY o.customer_id
)
SELECT c.region, COUNT(*), SUM(os.total) AS revenue
FROM order_summary os
JOIN customers c ON os.customer_id = c.id
GROUP BY c.region
ORDER BY revenue DESC;
The parser emits a struct query_select with ~40 boolean flags:
has_join, has_group_by, has_order_by,
has_distinct, has_having, has_window,
has_set_op, and so on. The physical plan builder
(plan.c) had to pattern-match this flat flag soup directly,
leading to fragile if/else chains that grew with every new feature. Adding
projection pushdown or predicate pushdown required touching every builder
path.
There was no intermediate representation between parsing and physical planning. The parser produced a flat record of everything the query mentioned. The planner consumed it directly. This meant every optimization had to understand the raw parser output, and rewrite rules (merge two filters, push a filter past a join) had nowhere to operate.
Added logical.c (~660 lines) and logical.h (~180 lines)
implementing a three-stage pipeline:
Stage 1: logical_build() — desugars the
parser AST into a minimal tree of 7 node types:
L_SCAN — table scan, subquery, CTE, generate_series, or set operationL_FILTER — predicate (WHERE or HAVING)L_PROJECT — column selection and expressionsL_JOIN — left + right children with explicit ON conditionL_AGGREGATE — GROUP BY keys + aggregate expressionsL_SORT — ORDER BYL_LIMIT — LIMIT + OFFSET
Sugar is fully lowered: HAVING becomes L_FILTER wrapping
L_AGGREGATE. DISTINCT becomes L_AGGREGATE with
all projected columns as group keys. DISTINCT ON desugars to
L_SORT + L_AGGREGATE(first_row_only=1).
Stage 2: logical_normalize() — applies
semantics-preserving tree rewrites in a single bottom-up pass:
L_FILTER(L_FILTER(x, p1), p2) → L_FILTER(x, p1 AND p2).
Combines adjacent filters into one, reducing plan tree depth.L_FILTER(L_JOIN(a, b), pred_on_a) → L_JOIN(L_FILTER(a, pred), b).
Moves single-column filters inside joins so the physical planner sees
L_FILTER(L_SCAN) on the left child, enabling index scan detection.L_PROJECT(L_PROJECT(x, c1), c2) → L_PROJECT(x, c2).
Removes redundant projections.
Stage 3: logical_to_physical() (in plan.c)
— pattern-matches the normalized logical tree to physical plan nodes
(PLAN_SEQ_SCAN, PLAN_HASH_JOIN,
PLAN_HASH_AGG, PLAN_TOP_N, etc.). This is where
projection pushdown narrows scan width per join table based on which columns
are actually needed.
The logical IR enabled two optimizations that would have been impractical to implement against the raw parser output:
The IR also made EXPLAIN (LOGICAL) possible: a human-readable
dump of the canonical tree before physical planning, useful for debugging
plan decisions.