Every SQL statement, clause, and data type supported by mskql. For the formal grammar, see the Grammar Reference.
| Category | Supported | Notes |
|---|---|---|
| DDL | CREATE TABLE (IF NOT EXISTS, LIKE),
DROP TABLE,
CREATE INDEX (IF NOT EXISTS, composite multi-column),
DROP INDEX,
CREATE TYPE … AS ENUM, DROP TYPE,
CREATE SEQUENCE, DROP SEQUENCE,
CREATE VIEW, DROP VIEW,
ALTER TABLE (ADD/DROP/RENAME COLUMN,
ALTER COLUMN TYPE,
SET DEFAULT/DROP DEFAULT,
SET NOT NULL/DROP NOT NULL),
composite PRIMARY KEY(col1, col2) / UNIQUE(col1, col2) |
Schema changes are immediate; no background rewrite. Views are expanded at query time. |
| DML | INSERT (single, multi-row, INSERT…SELECT),
SELECT, UPDATE (incl. UPDATE…FROM),
DELETE,
RETURNING,
ON CONFLICT DO NOTHING,
ON CONFLICT DO UPDATE (with EXCLUDED.* references),
TRUNCATE TABLE,
COPY TO STDOUT (tab/CSV/header),
COPY FROM STDIN (tab/CSV) |
In-place updates—no MVCC tuple versioning. Inserts are 3× faster than PostgreSQL. |
| Filtering | WHERE with AND/OR/NOT,
BETWEEN, IN (values, subqueries, multi-column),
ANY/ALL,
LIKE, ILIKE,
IS [NOT] NULL, IS [NOT] DISTINCT FROM,
= != <>
< > <= >= |
Compound filters (AND/OR trees) run in the plan executor on columnar blocks. |
| Joins | INNER, LEFT, RIGHT,
FULL OUTER, CROSS,
NATURAL, USING,
LATERAL, multi-table,
compound ON conditions (AND/OR) |
Hash join for equi-joins; nested loop for CROSS/LATERAL. Join cache reuses hash tables across queries. |
| Aggregation | COUNT, SUM, AVG,
MIN, MAX,
COUNT(DISTINCT col),
STRING_AGG(), ARRAY_AGG(),
expression aggregates (SUM(price * qty)),
GROUP BY (multi-column, positional: GROUP BY 1, 2),
GROUP BY ROLLUP, GROUP BY CUBE,
HAVING |
Columnar hash aggregation—6–10× faster than PostgreSQL on typical workloads. |
| Window functions | ROW_NUMBER(), RANK(), DENSE_RANK(),
NTILE(), PERCENT_RANK(), CUME_DIST(),
LAG(), LEAD(),
FIRST_VALUE(), LAST_VALUE(), NTH_VALUE(),
SUM(), COUNT(), AVG()
over PARTITION BY … ORDER BY …,
window frames (ROWS BETWEEN …) |
Sort-based window execution with bump-allocated accumulators. ~2× faster than PostgreSQL. |
| Set operations | UNION, UNION ALL,
INTERSECT, EXCEPT |
UNION deduplicates via hash; UNION ALL concatenates without dedup. |
| CTEs | WITH name AS (SELECT …) SELECT …,
WITH RECURSIVE |
Simple CTEs are inlined when possible, avoiding temp-table materialization overhead. |
| Expressions | + - * / %,
unary -, operator precedence,
CASE WHEN…THEN…ELSE…END,
COALESCE(), NULLIF(),
GREATEST(), LEAST() |
Evaluated in columnar batches via PLAN_EXPR_PROJECT—35% faster than PostgreSQL. |
| Result control | DISTINCT, DISTINCT ON,
ORDER BY (multi-column, per-column ASC/DESC, positional),
LIMIT, OFFSET,
FETCH FIRST N ROWS ONLY |
LIMIT short-circuits the plan—stops pulling blocks once satisfied. |
| Constraints | NOT NULL, UNIQUE, PRIMARY KEY,
DEFAULT, CHECK,
REFERENCES (foreign keys with
ON DELETE/UPDATE CASCADE,
RESTRICT, SET NULL,
SET DEFAULT) |
Enforced at write time. Foreign keys support all four referential actions. |
| Transactions | BEGIN, COMMIT, ROLLBACK
(snapshot-based) |
Lazy copy-on-write snapshots. First write to a table triggers O(N) deep copy. |
| String functions | UPPER(), LOWER(), LENGTH(),
TRIM(), CONCAT(), CONCAT_WS(),
||, SUBSTRING(),
REPLACE(), LPAD()/RPAD(),
POSITION(), SPLIT_PART(),
LEFT()/RIGHT(),
REPEAT(), REVERSE(), INITCAP(),
LTRIM(), RTRIM(), SUBSTR() |
19 string functions. All run in the expression projector for batched evaluation. |
| Math functions | ABS(), CEIL()/CEILING(),
FLOOR(), ROUND() (with optional decimal places),
POWER(), SQRT(), MOD(),
SIGN(), RANDOM() |
9 math functions. ROUND(expr, n) supports arbitrary decimal places. |
| Date/time functions | NOW(), CURRENT_TIMESTAMP, CURRENT_DATE,
EXTRACT(), DATE_TRUNC(), DATE_PART(),
AGE(), TO_CHAR(),
date/timestamp ± interval arithmetic,
timestamp − timestamp → interval,
date ± integer days |
Full interval arithmetic. TO_CHAR supports common PostgreSQL format patterns. |
| Type casting | CAST(expr AS type), expr::type postfix syntax,
numeric↔numeric, numeric↔text, text↔boolean,
text→date/timestamp conversions |
Both CAST() and :: syntax. Implicit coercion for compatible types. |
| Table functions | generate_series(start, stop [, step]) — integer,
timestamp, and date sequences with optional step and column aliases |
Supports integer, date, and timestamp series. Max 10M rows safety limit. |
| Sequences & utilities | NEXTVAL(), CURRVAL(),
GEN_RANDOM_UUID() |
SERIAL columns use sequences internally. UUIDs are v4 random. |
| Foreign tables | CREATE FOREIGN TABLE name (col type, …) OPTIONS (filename ’path.parquet’)
— maps a Parquet file to a SQL table. Query with standard SQL:
joins, aggregates, subqueries, CTEs. Result cache applies.
DROP FOREIGN TABLE |
Parquet files queried on demand. Cached queries are 2–14× faster than DuckDB. |
| Subqueries | Scalar subqueries in SELECT list,
correlated subqueries,
EXISTS / NOT EXISTS (correlated),
IN (SELECT…) |
IN subqueries use hash semi-join (O(N+M)). Correlated subqueries: 166× faster than PG. |
| Introspection | EXPLAIN (plan-based and legacy execution paths),
SHOW (server_version, search_path,
client_encoding, etc.),
SET/RESET/DISCARD (accepted, no-op),
information_schema.tables,
information_schema.columns,
pg_catalog.pg_type,
pg_catalog.pg_class,
pg_catalog.pg_namespace,
pg_catalog.pg_attribute,
pg_catalog.pg_index,
pg_catalog.pg_attrdef,
pg_catalog.pg_constraint,
pg_catalog.pg_am,
pg_catalog.pg_roles |
EXPLAIN shows the plan tree. System catalogs enable tool compatibility (e.g. DBeaver, pgAdmin). |
→ Full SQL grammar reference — production rules for every statement, with implemented/not-implemented status for each feature.
| Type | Storage | Description |
|---|---|---|
SMALLINT / INT2 | 2 bytes | 16-bit signed integer (−32,768 to 32,767) |
INT | 4 bytes | 32-bit signed integer. Default for most integer columns. |
BIGINT | 8 bytes | 64-bit signed integer. Use for row counts or large IDs. |
FLOAT | 8 bytes | Double-precision (C double). Use for scientific or approximate values. |
NUMERIC / DECIMAL | 8 bytes | Stored as double. Use for monetary or exact decimal values. |
TEXT | variable | Variable-length string. No length limit. |
VARCHAR(n) | variable | String with max length n. Stored as TEXT internally. |
BOOLEAN | 1 byte | TRUE / FALSE. Accepts t/f, yes/no, 1/0. |
DATE | 4 bytes | Calendar date (YYYY-MM-DD). Stored as int32_t days from PostgreSQL epoch (2000-01-01). |
TIME | 8 bytes | Time of day (HH:MM:SS). Stored as int64_t microseconds since midnight. |
TIMESTAMP | 8 bytes | Date and time (YYYY-MM-DD HH:MM:SS). Stored as int64_t microseconds from PG epoch. Supports arithmetic with intervals. |
TIMESTAMPTZ | 8 bytes | Timestamp with time zone awareness. Same storage as TIMESTAMP. |
INTERVAL | 16 bytes | Time span (months int32 + days int32 + usec int64). Supports '1 year 2 months 3 days' syntax. |
UUID | 16 bytes | 128-bit identifier stored as two uint64. Generated with GEN_RANDOM_UUID(). |
SMALLSERIAL / SERIAL / BIGSERIAL | 2/4/8 bytes | Auto-incrementing integer backed by a sequence. |
VECTOR(n) | n × 4 bytes | Fixed-dimension float array. pgvector-compatible [1.0, 2.0, 3.0] literal syntax. Stored as contiguous float[] in columnar blocks. |
ENUM | 4 bytes | User-defined type with ordinal storage. Compared by declaration order. |