All benchmarks, all workloads. This was not a single-query optimization but a structural change that affected every read and write path in the engine.
The engine had two storage representations living side by side:
t->rows: an array of struct row, where each row held a dynamic array of struct cell values. This was the original storage format, used by the legacy row-by-row executor (query.c).
col_block: contiguous typed arrays (int32_t[], double[], uint8_t[] null bitmaps) used by the columnar plan executor (plan.c). These were populated by scanning the row store.

Every query paid the cost of both representations. Inserts wrote to the row store; the columnar executor then scanned the row store to build column blocks. Deletes had to update both. The row store was the authoritative copy, and the column blocks were ephemeral views rebuilt on each query.
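To make the per-query cost concrete, here is a minimal sketch of what building a column block from the row store might have looked like. The field layouts, the INT-only struct cell, the one-byte-per-row null array, and the helper names col_block_from_rows and col_block_free are all assumptions for illustration, not the engine's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified versions of the old row-store types.
 * The real struct cell would also carry doubles, text, and so on. */
struct cell { int is_null; int32_t i32; };
struct row  { struct cell *cells; size_t ncells; };

/* A column block for one INT column: values plus a null flag per row
 * (assumed here to be one byte per row). */
struct col_block {
    int32_t *values;
    uint8_t *nulls;
    size_t   nrows;
};

/* Build a column block by scanning the row store. This is the work the
 * old design repeated whenever the columnar executor ran.
 * Returns 0 on success, -1 on allocation failure. */
static int col_block_from_rows(struct col_block *out,
                               const struct row *rows, size_t nrows,
                               size_t col)
{
    size_t n = nrows ? nrows : 1;   /* avoid zero-size allocations */
    out->values = malloc(n * sizeof *out->values);
    out->nulls  = malloc(n);
    out->nrows  = nrows;
    if (!out->values || !out->nulls) {
        free(out->values);
        free(out->nulls);
        out->values = NULL;
        out->nulls  = NULL;
        return -1;
    }
    for (size_t r = 0; r < nrows; r++) {
        assert(col < rows[r].ncells);
        const struct cell *c = &rows[r].cells[col];
        out->nulls[r]  = (uint8_t)(c->is_null != 0);
        out->values[r] = c->is_null ? 0 : c->i32;
    }
    return 0;
}

static void col_block_free(struct col_block *b)
{
    free(b->values);
    free(b->nulls);
    b->values = NULL;
    b->nulls  = NULL;
    b->nrows  = 0;
}
```

Every row is touched once per column, so the scan costs time proportional to rows times columns before any query work starts.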
The row store predated the columnar executor. When the plan executor was added, it was built as a layer on top of the existing row store rather than replacing it. This was the right incremental approach—it allowed the new executor to be tested against the old one—but it left the system with two representations that had to be kept in sync.
Removed t->rows entirely. Tables now store data exclusively
in struct flat_table: one contiguous typed array per column,
plus a parallel null bitmap. The key changes:
flat_snap / flat_snap_free: when the legacy row-by-row executor (query.c) needs row-oriented access, flat_snap creates a temporary snapshot of struct row values from the flat arrays. This is the reverse of the old direction: previously, column blocks were built from rows; now, rows are built from columns when needed.
malloc. In the flat store, TEXT columns use a char** array where each string is individually allocated. The flat table’s free path walks the string array and frees each entry.

A single storage representation simplifies every code path.
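The flat layout and the TEXT free path described above can be sketched roughly as follows. Only struct flat_table and the char** representation come from the text; struct flat_col, enum col_type, the one-byte-per-row null array, and the helpers flat_table_init, flat_set_text, and flat_table_free are hypothetical names and simplifications chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum col_type { COL_INT, COL_TEXT };   /* hypothetical type tags */

/* One column: a contiguous typed array plus a parallel null array. */
struct flat_col {
    enum col_type type;
    union {
        int32_t *i32;    /* COL_INT */
        char   **text;   /* COL_TEXT: each string individually allocated */
    } data;
    uint8_t *nulls;      /* 1 = NULL; one byte per row for simplicity */
};

struct flat_table {
    struct flat_col *cols;
    size_t ncols;
    size_t nrows;
};

/* Free path: TEXT columns walk the string array and free each entry. */
static void flat_table_free(struct flat_table *t)
{
    for (size_t c = 0; c < t->ncols; c++) {
        struct flat_col *col = &t->cols[c];
        if (col->type == COL_TEXT) {
            if (col->data.text)
                for (size_t r = 0; r < t->nrows; r++)
                    free(col->data.text[r]);   /* free(NULL) is a no-op */
            free(col->data.text);
        } else {
            free(col->data.i32);
        }
        free(col->nulls);
    }
    free(t->cols);
    t->cols = NULL;
    t->ncols = t->nrows = 0;
}

/* Allocate a table with a fixed row count; every slot starts as NULL.
 * Returns 0 on success, -1 on allocation failure. */
static int flat_table_init(struct flat_table *t, const enum col_type *types,
                           size_t ncols, size_t nrows)
{
    size_t n = nrows ? nrows : 1;   /* avoid zero-size allocations */
    t->ncols = 0;
    t->nrows = nrows;
    t->cols = calloc(ncols ? ncols : 1, sizeof *t->cols);
    if (!t->cols)
        return -1;
    t->ncols = ncols;
    for (size_t c = 0; c < ncols; c++) {
        struct flat_col *col = &t->cols[c];
        void *data;
        col->type = types[c];
        col->nulls = malloc(n);
        if (col->type == COL_TEXT)
            data = col->data.text = calloc(n, sizeof(char *));
        else
            data = col->data.i32 = calloc(n, sizeof(int32_t));
        if (!col->nulls || !data) {
            flat_table_free(t);
            return -1;
        }
        memset(col->nulls, 1, n);
    }
    return 0;
}

/* Store a private copy of s in a TEXT column, replacing any old value. */
static int flat_set_text(struct flat_table *t, size_t c, size_t r,
                         const char *s)
{
    assert(c < t->ncols && r < t->nrows && t->cols[c].type == COL_TEXT);
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, s, len);
    free(t->cols[c].data.text[r]);
    t->cols[c].data.text[r] = copy;
    t->cols[c].nulls[r] = 0;
    return 0;
}
```

Because the table owns every string, a single flat_table_free call is the only place TEXT memory is released in this sketch, which is the property that makes leaks easy to check under AddressSanitizer.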
The legacy row-by-row executor still works via flat_snap,
so queries that fall back from the plan executor to the legacy path
(e.g. complex correlated subqueries) continue to produce correct results.
All 1,514 tests pass under AddressSanitizer.
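
As an illustration of the snapshot direction (rows built from columns on demand), the following is a minimal sketch of what a flat_snap / flat_snap_free pair could look like. The signatures, the INT-only types, and the choice to back all rows with one cell allocation are assumptions made for this example; the engine's real functions may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified, INT-only stand-ins for the engine's types (assumed). */
struct cell { int is_null; int32_t i32; };
struct row  { struct cell *cells; size_t ncells; };

struct flat_table {
    int32_t **cols;    /* cols[c][r]: one contiguous array per column */
    uint8_t **nulls;   /* nulls[c][r]: 1 = NULL */
    size_t    ncols;
    size_t    nrows;
};

/* Build a temporary row-oriented copy of the table for the legacy
 * executor. The caller releases it with flat_snap_free.
 * Returns NULL on allocation failure. */
static struct row *flat_snap(const struct flat_table *t)
{
    assert(t != NULL);
    size_t nr = t->nrows, nc = t->ncols;
    struct row *rows = calloc(nr ? nr : 1, sizeof *rows);
    /* One allocation holds every cell; each row points into it. */
    struct cell *cells = calloc((nr && nc) ? nr * nc : 1, sizeof *cells);
    if (!rows || !cells) {
        free(rows);
        free(cells);
        return NULL;
    }
    rows[0].cells = cells;   /* keeps the block reachable even if nr == 0 */
    for (size_t r = 0; r < nr; r++) {
        rows[r].cells  = cells + r * nc;
        rows[r].ncells = nc;
        for (size_t c = 0; c < nc; c++) {
            int is_null = t->nulls[c][r] != 0;
            rows[r].cells[c].is_null = is_null;
            rows[r].cells[c].i32     = is_null ? 0 : t->cols[c][r];
        }
    }
    return rows;
}

/* Release a snapshot created by flat_snap. */
static void flat_snap_free(struct row *rows)
{
    if (!rows)
        return;
    free(rows[0].cells);   /* the single cell block */
    free(rows);
}
```

The snapshot is a copy, so in this sketch the legacy executor can read it freely, but writes to it would not reach the flat arrays.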