The standard benchmarks use 5K–50K rows, which fit comfortably in CPU cache. Real analytical workloads often involve hundreds of thousands or millions of rows. The stress benchmarks push mskql to 500K rows per table to expose scaling bottlenecks in sorting, hashing, joining, and window functions.
All eight stress benchmarks run against mskql, PostgreSQL, and DuckDB. mskql and PostgreSQL are queried over the PostgreSQL wire protocol; DuckDB runs through its in-process CLI. Each benchmark runs a single iteration, with no caching and no warm-up advantage.
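A single-iteration measurement of this kind can be sketched as follows. This is a minimal illustration, not the harness used for these results: the helper name `time_single_run` is hypothetical, and it assumes each engine is driven by an external command whose end-to-end wall-clock time (including client startup) is what gets recorded.

```python
import subprocess
import sys
import time


def time_single_run(cmd):
    """Run `cmd` exactly once and return its wall-clock time in milliseconds.

    There is no warm-up run and no averaging, so the figure reflects one
    cold execution of whatever the command does.
    """
    start = time.perf_counter()
    subprocess.run(
        cmd,
        check=True,                    # raise if the command fails
        stdout=subprocess.DEVNULL,     # discard result rows
        stderr=subprocess.DEVNULL,
    )
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; a real benchmark
    # would pass the client command line for the engine under test.
    placeholder = [sys.executable, "-c", "pass"]
    print(f"{time_single_run(placeholder):.0f}ms")
```

Timing the whole client process keeps the comparison simple, at the cost of including process startup in every measurement.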
| Workload | mskql | PG | Duck | mskql/PG | mskql/Duck |
|---|---|---|---|---|---|
| stress_full_agg (500K rows) | 360ms | 764ms | 74ms | 0.47× | 4.90× |
| stress_high_card_gb (100K groups) | 249ms | 1,229ms | 245ms | 0.20× | 1.01× |
| stress_large_sort (500K rows) | 408ms | 557ms | 87ms | 0.73× | 4.71× |
| stress_join_2way (500K + 10K) | 350ms | 2,284ms | 109ms | 0.15× | 3.20× |
| stress_join_3way (3 large tables) | 478ms | 2,941ms | 128ms | 0.16× | 3.73× |
| stress_filtered_expr (WHERE + computed) | 298ms | 641ms | 134ms | 0.46× | 2.22× |
| stress_window (RANK, 500K rows) | 327ms | 482ms | 115ms | 0.68× | 2.85× |
| stress_nested_cte (3-level CTE) | 495ms | 1,057ms | 185ms | 0.47× | 2.67× |
mskql beats PostgreSQL on all 8 workloads, from 1.4× faster (stress_large_sort) to 6.5× faster (stress_join_2way), with stress_join_3way close behind at 6.2×. The join workloads show the largest gap: PostgreSQL’s hash join with 500K outer rows involves significant memory management overhead that mskql avoids with arena allocation.
DuckDB is faster on all 8 workloads in this configuration, by margins from 1.01× (stress_high_card_gb, essentially a tie) to 4.90× (stress_full_agg). This is expected: DuckDB runs in-process with zero wire overhead, while mskql serializes results through TCP. When both engines run in-process, mskql wins 15 of 16 benchmarks (Section 2).
The stress_high_card_gb result (249ms vs 245ms, 1.01×) is notable: mskql’s hash aggregation matches DuckDB’s SIMD-vectorized implementation at 100K groups, even with wire protocol overhead.
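The ratio columns and the "N× faster" figures quoted above can be rechecked from the table with a few lines of Python. This sketch uses the rounded millisecond values as printed, so the last digit may differ slightly from the table's ratios, which were presumably computed from unrounded timings; the names `TIMINGS_MS` and `ratios` are illustrative only.

```python
# Timings in milliseconds copied from the table: (mskql, PostgreSQL, DuckDB).
TIMINGS_MS = {
    "stress_full_agg": (360, 764, 74),
    "stress_high_card_gb": (249, 1229, 245),
    "stress_large_sort": (408, 557, 87),
    "stress_join_2way": (350, 2284, 109),
    "stress_join_3way": (478, 2941, 128),
    "stress_filtered_expr": (298, 641, 134),
    "stress_window": (327, 482, 115),
    "stress_nested_cte": (495, 1057, 185),
}


def ratios(timings):
    """Return per-workload time ratios and the speedup over PostgreSQL."""
    out = {}
    for name, (mskql, pg, duck) in timings.items():
        out[name] = {
            "mskql_over_pg": mskql / pg,      # below 1 means mskql is faster
            "mskql_over_duck": mskql / duck,  # above 1 means DuckDB is faster
            "speedup_vs_pg": pg / mskql,      # the "N× faster" figure
        }
    return out


if __name__ == "__main__":
    for name, r in ratios(TIMINGS_MS).items():
        print(
            f"{name:22s} "
            f"mskql/PG={r['mskql_over_pg']:.2f}x  "
            f"mskql/Duck={r['mskql_over_duck']:.2f}x  "
            f"speedup vs PG={r['speedup_vs_pg']:.1f}x"
        )
```

The speedup over PostgreSQL is simply the reciprocal of the mskql/PG ratio, which is why a ratio of 0.15× corresponds to roughly 6.5× faster.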