mskql

An analytics SQL engine in C. Faster than PostgreSQL on every cached batch workload. Faster than DuckDB on 61 of 93. ~45.1K lines. Zero dependencies.

Martin S. Kristiansen  ·  github.com/martinsk/mskql  ·  Tutorials


An in-memory SQL database that speaks the PostgreSQL wire protocol. Fast because it is simple: one .c file per subsystem, six memory regions with explicit ownership, no buffer pool, no WAL, no hidden planner. The scan cache converts row-store to flat columnar arrays once per table write; subsequent queries read contiguous memory. The result cache serves repeated queries in microseconds. The wire path serializes columnar blocks directly to TCP—no row materialization. These are not optimizations layered on top of a complex system. They are what happens when you remove the complexity. Connect with psql on port 5433 or try it in the browser via WebAssembly.

Performance: mskql vs PostgreSQL vs DuckDB

93 batch workloads and 15 concurrent throughput workloads, measured on the same machine. Faster than PostgreSQL on every cached batch workload and 13 of 15 throughput tests. Against DuckDB, the engine wins 61 of 93 workloads—aggregates 8×, expression aggregates 10×, correlated subqueries 166× faster than PG. DuckDB wins full scans (4.3×) and sorts (4.7×).

The performance advantage traces to three structural facts: zero per-query allocation after warmup, a scan cache that converts row-store to flat columnar arrays once per table write, and a wire path that batches 64 KB before a single write() syscall.

Batch latency — shorter bars = faster

mskql PostgreSQL DuckDB
Correlated subquery
mskql
22ms
PG
3,665ms
Duck
38ms
Expression aggregate
mskql
33ms
PG
323ms
Duck
249ms
Group + sum
mskql
35ms
PG
284ms
Duck
199ms
Bulk insert
mskql
298ms
PG
851ms
Duck
1,794ms
Analytical CTE
mskql
20ms
PG
78ms
Duck
32ms
Parquet aggregate
mskql
26ms
Duck
73ms
Full table scan
mskql
294ms
PG
439ms
Duck
68ms
Large sort
mskql
234ms
PG
359ms
Duck
50ms

Concurrent throughput — longer bars = more queries/sec (mskql vs PG only)

mskql PostgreSQL
Mixed analytical
mskql
94,525 QPS
PG
496 QPS
Multi-join
mskql
36,171 QPS
PG
356 QPS
Group + sum
mskql
95,020 QPS
PG
10,140 QPS
Indexed point lookup
mskql
111,437 QPS
PG
23,256 QPS

→ All 93 batch + 15 throughput results  ·  Historical charts

PostgreSQL wire-protocol compatible

The server implements PostgreSQL wire protocol v3: Simple Query, Extended Query (prepared statements, $1/$2 parameters), and COPY. Connect with psql, pgAdmin, DBeaver, or any language driver. Default port 5433.

make && ./build/mskql                         # build & start
psql -h 127.0.0.1 -p 5433 -U test -d mskql   # connect

Query Parquet files with standard SQL

CREATE FOREIGN TABLE maps a Parquet file to a SQL table. No ETL, no data loading—query directly with joins, aggregates, and subqueries. The same result cache that accelerates in-memory queries applies to Parquet scans.

CREATE FOREIGN TABLE events
    OPTIONS (filename '/data/events.parquet');

SELECT event_type, COUNT(*), SUM(amount)
FROM events
GROUP BY event_type;

9 Parquet workloads (50K–200K rows). On cached queries, mskql is faster than DuckDB on all 9: aggregates 26ms vs 73ms (2.8×), subqueries 18ms vs 41ms (2.2×), joins 23ms vs 33ms. DuckDB wins no-cache full scans: 31ms vs 485ms (16×).

→ All 18 Parquet benchmark results

Native vector columns

VECTOR(n) is a first-class column type—fixed-dimension float arrays stored as contiguous float[] in columnar blocks (n × 4 bytes per row). Uses pgvector-compatible [1.0, 2.0, 3.0] literal syntax. Insert, scan, filter, and sort tables with vector columns through standard SQL.

CREATE TABLE items (
    id    INT PRIMARY KEY,
    name  TEXT,
    embed VECTOR(3)
);

INSERT INTO items VALUES
    (1, 'cat',    '[0.1, 0.9, 0.3]'),
    (2, 'dog',    '[0.2, 0.8, 0.4]'),
    (3, 'fish',   '[0.9, 0.1, 0.7]'),
    (4, 'parrot', '[0.3, 0.7, 0.5]');

SELECT name, embed FROM items WHERE id > 1 ORDER BY name;

7 vector benchmarks against PostgreSQL + pgvector on the same machine. Wins all 7: insert 3×, scan 1.8×, wide scan 2.7×, filter 3.5× faster. Even without the result cache (_nc variants), faster on every vector workload.

→ All 7 vector benchmark results

Construction methodology

1,375+ SQL test cases. Every one runs under AddressSanitizer and must produce bit-identical output to PostgreSQL. Three agents worked in adversarial rounds: Challenger writes tests designed to break the system, Reviewer flags structural code issues, Writer fixes until the full suite passes. The adversarial model enforces correctness the same way rigorous code review does.

→ Read more about the process

Challenger adversarial tests Writer writes & fixes Reviewer code review iterate until all 1,375+ tests pass

Try it — right here

A fully functional mskql database running entirely in your browser via WebAssembly—no server, no network requests. Choose an example or write your own SQL.

Loading WebAssembly…
Run a query to see results here.

→ Open full playground  (more space, more examples)

Build & connect

Ten queries from CREATE TABLE to recursive CTEs, all through psql.

Getting started →

Read the source

~45,100 lines of C11. Every subsystem in a single file.

GitHub →

Reference

Architecture — wire protocol, parser, vectorized executor, storage. One .c file per subsystem.
Design philosophy — why minimal abstraction leads to faster, more debuggable systems.
SQL reference — DDL, DML, joins, aggregation, window functions, CTEs, subqueries, 30+ built-in functions.  Grammar
Testing — 1,375+ test cases, every one under a memory-safety checker. Parallel execution across all cores.
Benchmarks — mskql vs PostgreSQL and DuckDB on 93 batch workloads and 15 throughput workloads.  Historical charts