Top-N sort: fused SORT + LIMIT via binary heap

Binary max-heap of size K avoids full sort when LIMIT is present.

Setup

The benchmark is bench_subquery_complex: 50,000 event rows joined with a subquery filter, sorted, and limited. Measured as wall-clock latency through the PostgreSQL wire protocol, 10 iterations.

CREATE TABLE events (
    id INT, user_id INT, score INT,
    amount NUMERIC(10,2), label TEXT, ts TIMESTAMP
);
CREATE TABLE customers (
    id INT PRIMARY KEY, name TEXT, tier TEXT, region TEXT
);
-- 50,000 events, 2,000 customers

SELECT * FROM events
WHERE user_id IN (
    SELECT id FROM customers WHERE tier = 'premium'
) AND amount > 100
ORDER BY amount DESC
LIMIT 500;

Problem

Without the optimization, the executor sorts all qualifying rows (potentially tens of thousands) to return only 500. A full sort of N rows is O(N log N). When K « N, most of the sort work is wasted.

Cause

The plan executor originally treated ORDER BY and LIMIT as separate pipeline stages: PLAN_SORT sorted all input rows, then PLAN_LIMIT truncated the output. The sort stage had no knowledge of the downstream LIMIT and performed a full radix or comparison sort on the entire input.

Fix

The plan builder detects ORDER BY + LIMIT patterns and emits a single PLAN_TOP_N node that fuses both operations. The implementation in plan.c:

  1. Build phase: Maintain a binary max-heap of size K. For each input row, compare against the heap root (the worst row in the current top-K). If the new row is better, replace the root and sift down. Cost: O(N log K).
  2. Emit phase: Extract the K rows from the heap and sort them with pdqsort for final output order. Cost: O(K log K).

For this workload with K=500 and N=~25,000 qualifying rows: O(N log K) = 25,000 × 9 = 225,000 comparisons, versus O(N log N) = 25,000 × 15 = 375,000 for a full sort. The heap also avoids materializing all rows before sorting.

Result

Same query, same machine:

mskqlPostgreSQLDuckDB
subquery_complex (10 iter, LIMIT 500)7ms82ms12ms
vs PostgreSQL ratio0.09× (mskql 12× faster)
vs DuckDB ratio0.58× (mskql 1.7× faster)

The Top-N optimization matters most when K is small relative to N. As K approaches N, the benefit diminishes and the node degrades to a full sort with heap overhead. The plan builder only emits PLAN_TOP_N when LIMIT is present; unlimited ORDER BY uses PLAN_SORT with radix sort.