Testing

1,514 tests · two layers of verification · zero tolerance for leaks

All 1,514 tests run under AddressSanitizer, so a query that returns the correct result but leaks 16 bytes still fails. Tests run in parallel across all CPU cores, with each worker reusing a persistent server instance.

Two layers of verification

Every `make test` run checks two independent properties for every test case: functional correctness and memory safety. A test passes only if both layers are green.

Layer	What it catches	Mechanism
Functional correctness	Wrong output, missing rows, wrong types, parse errors	SQL output compared to expected results via `diff`
Memory safety	Leaks, use-after-free, buffer overflows, double-free	AddressSanitizer + LeakSanitizer enabled by default (`-fsanitize=address`)

Test runner architecture

The test runner (tests/test.sh, 571 lines) orchestrates everything. No test framework—just shell scripts and diff.

Persistent servers with database reset

Each worker starts one persistent server process and reuses it for all tests in its partition. Between tests, SELECT __reset_db() drops all tables, types, and sequences—returning the database to a clean state without the overhead of process start/stop. If a reset fails (e.g. server crash), the worker automatically restarts its server. This eliminates ~1,300 server start/stop cycles, reducing wall-clock time from ~87s to ~17s.

Batched psql sessions

All setup SQL for a test is piped through a single psql session, and all input SQL through another. Semicolons are normalized (sed 's/;*$/;/') to ensure correct batching. This reduces psql invocations from ~3,300 to ~2,700 per full run.
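As a small illustration of the normalization step, the sketch below applies the same sed expression to a few made-up statements. The function name `normalize_sql` and the sample SQL are assumptions for the example, not taken from the runner.

```shell
# Normalize statement terminators so every line ends with exactly one ';'.
# The sed expression is the one quoted above; normalize_sql is a hypothetical name.
normalize_sql() {
    sed 's/;*$/;/'
}

# One statement per line: unterminated, terminated, and doubly terminated.
printf '%s\n' 'SELECT 1' 'SELECT 2;' 'SELECT 3;;' | normalize_sql
```

Note that the expression works line by line, so it assumes one statement per line; a blank line would come out as a lone `;`.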

ASAN / LeakSanitizer by default

The default build (src/Makefile) compiles with -fsanitize=address -fno-omit-frame-pointer. The test runner sets ASAN_OPTIONS="detect_leaks=1:log_path=..." and LSAN_OPTIONS="suppressions=..." per server instance. Sanitizer logs are checked once per worker at shutdown rather than after each test, because the persistent server accumulates leak data across its whole lifetime. Any LeakSanitizer or ERROR: AddressSanitizer report fails the worker's entire test batch.
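The shutdown check can be pictured roughly as follows. This is a minimal sketch, assuming the sanitizer logs for one worker land in a single directory; the function name, directory layout, and log file name are hypothetical, and the real logic in tests/test.sh may differ.

```shell
# Return non-zero if any log under the given directory contains a sanitizer report.
# check_sanitizer_logs and the per-worker log directory are assumptions for illustration.
check_sanitizer_logs() {
    log_dir=$1
    # -r: recurse, -q: quiet (exit status only), -E: extended regex
    if grep -rqE 'ERROR: AddressSanitizer|LeakSanitizer' "$log_dir" 2>/dev/null; then
        return 1   # at least one report found: the whole batch fails
    fi
    return 0       # no reports (or no log files at all)
}

# Demo with a fabricated log line in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' '==123==ERROR: LeakSanitizer: detected memory leaks' > "$demo_dir/asan.log.123"
check_sanitizer_logs "$demo_dir" || echo "worker batch failed"
rm -rf "$demo_dir"
```

Checking once at shutdown trades attribution for speed: a report identifies the worker batch, not the individual test that caused it.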

Suppressions for system false positives

lsan_suppressions.txt suppresses known leaks reported inside macOS system libraries, which do not originate in mskql code:

# LeakSanitizer suppressions for macOS system libraries (false positives)
leak:_fetchInitializingClassList
leak:_libxpc_initializer
leak:libSystem_initializer
leak:initializeNonMetaClass
leak:dyld::ThreadLocalVariables
leak:_tlv_get_addr
# macOS libc dtoa thread-local caches (allocated by snprintf %g/%f, never freed)
leak:__Balloc_D2A
# macOS libc localtime thread-local buffer (allocated once, never freed)
leak:localtime

Parallel execution across all CPU cores

Tests run N-wide, where N is the auto-detected core count (via nproc / sysctl). Each worker gets a unique port (BASE_PORT + slot_index). Tests are distributed round-robin across worker manifest files; each worker runs its batch sequentially on its persistent server. The main process polls for result files every 100ms. Wall-clock time is bounded by the slowest worker partition, not by the sum of all partitions.
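The distribution scheme above can be sketched in a few lines of shell. The value of BASE_PORT, the manifest file names, and the test file names here are hypothetical; the worker count is fixed at 3 so the output is predictable, where the runner would use the detected core count.

```shell
# Round-robin tests across worker manifests; each worker gets BASE_PORT + slot.
# BASE_PORT's value, manifest naming, and the test names are made up for the example.
BASE_PORT=15400
workers=3          # the runner would auto-detect this, e.g. via nproc or sysctl
work_dir=$(mktemp -d)

i=0
for test_file in a.sql b.sql c.sql d.sql e.sql; do
    slot=$((i % workers))                       # test i goes to worker i mod N
    echo "$test_file" >> "$work_dir/manifest.$slot"
    i=$((i + 1))
done

slot=0
while [ "$slot" -lt "$workers" ]; do
    port=$((BASE_PORT + slot))                  # unique port per worker
    echo "worker $slot port $port: $(paste -sd ' ' "$work_dir/manifest.$slot")"
    slot=$((slot + 1))
done
```

Round-robin keeps partition sizes within one test of each other, though it does not balance by test duration, so the slowest partition still sets the total run time.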

Transaction-aware execution

The runner detects BEGIN / COMMIT / ROLLBACK in setup or input SQL and switches from per-statement execution to single-session piped execution—necessary for transaction tests to work correctly.
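One simple way to detect transaction control statements is a case-insensitive whole-word match. The sketch below is an assumption about how such a check could look, not the runner's actual code; as a heuristic it would also match a keyword that appears inside a string literal or comment.

```shell
# Succeed if the SQL text contains BEGIN, COMMIT, or ROLLBACK as a whole word.
# uses_transaction is a hypothetical helper name.
uses_transaction() {
    # -q: exit status only, -i: ignore case, -w: whole words, -E: extended regex
    printf '%s\n' "$1" | grep -qiwE 'BEGIN|COMMIT|ROLLBACK'
}

if uses_transaction 'BEGIN; INSERT INTO t VALUES (1); ROLLBACK;'; then
    echo "single-session piped execution"
fi
if ! uses_transaction 'SELECT * FROM t;'; then
    echo "per-statement execution"
fi
```

The single session matters because a transaction is scoped to its connection: a BEGIN sent in one psql process has no effect on statements sent from another.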

Declarative test format

Each .sql file is self-contained. No fixtures, no setup files, no test framework:

-- adversarial: ALTER TABLE ADD COLUMN then SELECT
-- setup:
CREATE TABLE t_aac (id INT, name TEXT);
INSERT INTO t_aac VALUES (1, 'alice');
INSERT INTO t_aac VALUES (2, 'bob');
ALTER TABLE t_aac ADD COLUMN age INT;
-- input:
SELECT id, name, age FROM t_aac ORDER BY id;
-- expected output:
1|alice|
2|bob|

The format supports four sections:

Section	Required?	Purpose
`-- <test name>`	Yes	First comment line; used in pass/fail reporting
`-- setup:`	Optional	SQL run before the test; output not checked. Use for CREATE TABLE, INSERT, etc.
`-- input:`	Yes	SQL whose output is checked against expected
`-- expected output:`	Yes	Expected lines, compared with `psql -tA` output
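To make the format concrete, here is a minimal sketch of how a section could be extracted and compared with diff. The `section` helper and the temporary file names are hypothetical, and the "actual" output is fabricated for the demo rather than produced by a server.

```shell
# Print the lines of one section ("setup", "input", or "expected output")
# from a test file in the format above. A sketch, not the runner's own parser.
section() {
    awk -v want="-- $1:" '
        /^-- (setup|input|expected output):$/ { current = $0; next }
        current == want { print }
    ' "$2"
}

test_file=$(mktemp)
cat > "$test_file" <<'EOF'
-- example: single row round trip
-- setup:
CREATE TABLE t_demo (id INT, name TEXT);
INSERT INTO t_demo VALUES (1, 'alice');
-- input:
SELECT id, name FROM t_demo ORDER BY id;
-- expected output:
1|alice
EOF

head -n 1 "$test_file" | sed 's/^-- //'      # test name from the first comment line
section input "$test_file"                   # SQL whose output is checked

# Stand-in for the psql -tA output of the input section.
actual=$(mktemp)
printf '%s\n' '1|alice' > "$actual"
section 'expected output' "$test_file" | diff - "$actual" && echo "PASS"
```

Because the comparison is a plain diff, whitespace and row order are significant, which is why the example query carries an ORDER BY.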

C-level protocol test suites

Two additional test suites go beyond SQL, exercising the wire protocol directly with custom C clients:

Suite	Source	What it tests
Extended Query Protocol	`test_extended.c`	Prepared statements, portals, `$1`/`$2` parameter binding, error state handling, Sync/Flush semantics—speaking raw pgwire binary protocol
Concurrency	`test_concurrent.c`	Multiple simultaneous TCP connections, rapid connect/disconnect, interleaved queries, state isolation between clients

These are built by the Makefiles under tests/cases/*/ and run after the SQL suite. Each reports its own check count (“All N tests passed”).

What is tested

The 1,514 test cases cover DDL (IF NOT EXISTS, CHECK constraints, CREATE TABLE LIKE, composite indexes), DML, joins (NATURAL, USING, multi-table), aggregation (including expression aggregates, positional GROUP BY, STRING_AGG(), and ARRAY_AGG()), window functions (including frames), set operations, CTEs, transactions (including nested BEGIN), NULL handling, type coercion, CAST/:: conversions, constraint enforcement, foreign keys (CASCADE, RESTRICT, SET NULL, SET DEFAULT), sequences, views, SMALLINT type, native temporal types (DATE, TIME, TIMESTAMP, INTERVAL), binary UUID, Parquet foreign tables, EXPLAIN, system catalog queries (information_schema, pg_catalog), SET/SHOW/DISCARD, math functions, string functions, date/time arithmetic, temporal functions, expression evaluation, TRUNCATE TABLE, COPY TO/FROM, IS [NOT] DISTINCT FROM, ORDER BY expressions, INSERT...SELECT with CTEs, generate_series(), upserts (ON CONFLICT with EXCLUDED.*), correlated subqueries, multi-column radix sort, error message propagation, and various edge cases.

Running the tests

make test                          # full suite: build with ASAN, run all 1,514 tests
MSKQL_NO_LEAK_CHECK=1 make test   # skip leak checking (faster, less strict)

Why this matters

Arena allocation rules out whole classes of use-after-free and leak bugs by construction for arena-managed memory. AddressSanitizer and LeakSanitizer catch what remains on every code path the tests exercise. The result: zero known memory bugs across 1,514 adversarial test cases.
