Ibex

Typed table pipelines
that compile to C++23.

Ibex is a statically typed language for columnar DataFrame and TimeFrame manipulation. Write concise bracket-pipeline queries — filter, aggregate, roll, resample, join — and run them in the fast interpreter or compile them to standalone C++ binaries.

1.3× faster than Polars on aggregation
5× faster than Pandas
10–20× faster on rolling windows


Concise bracket syntax

Pipeline operations chain left-to-right inside [ ], in the order they execute. No SQL keywords, no macro magic, no implicit coercions. Backtick-quoted names handle columns with dots or spaces.

TimeFrame-aware

as_timeframe promotes a DataFrame to a time-indexed structure with O(n) sort detection. Rolling windows use a two-pointer O(n) scan with a single result-column allocation per call — no copies, no heap churn.

C++23 codegen

ibex_compile transpiles any .ibex script to idiomatic C++ using the ibex::ops::* library. Compiled and interpreted outputs are behaviour-equivalent; both run near peak throughput.

Language tour

Everything you need, nothing you don’t.

A complete analytical pipeline in a handful of composable clauses.

Load & Filter

External functions bring data in.

extern fn declares a C++ data-source function as a first-class Ibex binding. The compiler resolves it at link time; the REPL loads the corresponding plugin .so at runtime.

Filter expressions support arithmetic, comparisons, and boolean logic (and, or, not). Multiple clauses chain in reading order.

extern fn read_csv(path: String) -> DataFrame
    from "csv.hpp";

let prices = read_csv("prices.csv");

// Keep rows where price exceeds 1.0
let active = prices[filter price > 1.0];

// Chain filter with a column projection
prices[
    filter price > 1.0,
    select { symbol, price, volume }
];

Aggregation

One clause for projection and grouped reduction.

select doubles as a projection and an aggregation clause. Add by to group; omit it for a global aggregate.

Available aggregate functions: first, last, sum, mean, min, max, count.

// Mean sepal length and row count, by species
iris[select {
    mean_sl = mean(`Sepal.Length`),
    n       = count()
}, by Species];

// OHLC per symbol — all in one pass
prices[select {
    open   = first(price),
    high   = max(price),
    low    = min(price),
    close  = last(price),
    traded = sum(volume)
}, by symbol];
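
To make the grouped semantics concrete, here is a minimal Python model of the OHLC query above. It is only a reference sketch of what the result should contain, not Ibex's implementation. The function name `ohlc_by`, the row-of-dicts layout, and the first-seen group order are assumptions made for illustration.

```python
def ohlc_by(rows, key="symbol"):
    """Single pass over rows; one accumulator per group, in first-seen order."""
    groups = {}
    for row in rows:
        price, volume = row["price"], row["volume"]
        acc = groups.get(row[key])
        if acc is None:
            # The first row of a group seeds every aggregate.
            groups[row[key]] = {
                key: row[key],
                "open": price, "high": price, "low": price,
                "close": price, "traded": volume,
            }
        else:
            acc["high"] = max(acc["high"], price)
            acc["low"] = min(acc["low"], price)
            acc["close"] = price      # the last row seen wins
            acc["traded"] += volume
    return list(groups.values())


# Hypothetical sample data, not taken from the benchmarks.
prices = [
    {"symbol": "A", "price": 10.0, "volume": 100},
    {"symbol": "B", "price": 5.0, "volume": 50},
    {"symbol": "A", "price": 12.0, "volume": 30},
    {"symbol": "A", "price": 9.0, "volume": 20},
]
result = ohlc_by(prices)
```

All five aggregates are updated from the same row visit, which is what "one pass" means here: no aggregate needs a second scan of the input.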

Derived Columns

Add or replace columns without losing any data.

update adds named computed columns to every row, replacing any existing column of the same name. All other columns pass through untouched, so the result schema is always a superset of the input.

// Enrich every row with return, range, and notional
ohlcv[update {
    ret      = (close - open) / open,
    range    = (high  - low)  / open,
    notional = close * volume
}];

// Chain an update with an aggregation
let daily = ohlcv[update { ret = (close - open) / open }];
daily[select {
    avg_ret    = mean(ret),
    n_sessions = count()
}, by sector]
    [order { avg_ret desc }];
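
The pass-through behaviour can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a description of the Ibex runtime: the helper `update`, the dict-of-lists column layout, and the choice to evaluate every expression against the input columns only are all hypothetical.

```python
def update(table, **exprs):
    """Return a new column dict: inputs pass through, computed columns are added."""
    n = len(next(iter(table.values())))
    out = dict(table)  # shallow copy: existing columns are shared, not modified
    for name, fn in exprs.items():
        # Assumption: each expression sees the INPUT columns only.
        out[name] = [
            fn({col: vals[i] for col, vals in table.items()})
            for i in range(n)
        ]
    return out


# Hypothetical sample data.
ohlcv = {
    "open": [10.0, 20.0],
    "close": [11.0, 19.0],
    "volume": [100, 200],
}
enriched = update(
    ohlcv,
    ret=lambda r: (r["close"] - r["open"]) / r["open"],
    notional=lambda r: r["close"] * r["volume"],
)
```

The input columns appear first and unchanged in `enriched`, followed by `ret` and `notional`.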

Order & Distinct

Sort and deduplicate with explicit intent.

order accepts a single column, a multi-key block with asc / desc annotations, or no argument at all to sort by every column in schema order.

distinct deduplicates on a single column or a set.

// Sort by a single key (ascending by default)
iris[order `Sepal.Length`];

// Multi-key sort with explicit directions
results[order { avg_ret desc, symbol asc }];

// Sort by all columns in schema order
iris[order];

// Unique species names
iris[distinct Species];

// Unique (species, length) pairs
iris[distinct { Species, `Sepal.Length` }];
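
For readers who want to check the multi-key ordering rules, the following Python sketch reproduces them with a stable sort. The helper `order_by` and the (column, direction) pair encoding are assumptions for illustration; how Ibex sorts internally is not described here.

```python
def order_by(rows, keys=None):
    """Sort rows by (column, direction) pairs, first pair most significant.

    With keys=None, sort ascending by every column in schema order,
    taken here to be the key order of the first row.
    """
    if not rows:
        return []
    if keys is None:
        keys = [(column, "asc") for column in rows[0]]
    out = list(rows)
    # Python's sort is stable, so sorting by the least significant key
    # first and the most significant key last gives a multi-key order.
    for column, direction in reversed(keys):
        out.sort(key=lambda r: r[column], reverse=(direction == "desc"))
    return out


# Hypothetical sample data.
results = [
    {"symbol": "B", "avg_ret": 0.02},
    {"symbol": "A", "avg_ret": 0.02},
    {"symbol": "C", "avg_ret": 0.05},
]
ordered = order_by(results, [("avg_ret", "desc"), ("symbol", "asc")])
by_schema = order_by(results)
```

Here `ordered` lists C first (highest `avg_ret`), then A and B, whose tie on `avg_ret` is broken by `symbol` ascending.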

Joins

Inner, left, and as-of — written as natural prose.

The as-of join attaches the latest right row at-or-before each left timestamp. It is the standard pattern for enriching tick data with bar data, without look-ahead bias.

Join keys are named with on. Both tables must be TimeFrames for an asof join.

// Inner join — drop non-matching rows
let enriched = daily join fund on symbol;

// Left join — preserve all left rows
let with_meta = prices left join metadata on symbol;

// As-of join — each tick gets the latest bar at or before ts
let tf   = as_timeframe(ticks,  "ts");
let bars = as_timeframe(bars_1m, "ts");
tf asof join bars on ts;
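
The at-or-before rule is easy to get subtly wrong, so here is a small Python reference model. It is a sketch, not Ibex's algorithm: it uses binary search per left row for brevity, and it keeps unmatched left rows with the right-hand columns set to `None`, which is an assumption about behaviour the text above does not specify. The name `asof_join` is hypothetical.

```python
from bisect import bisect_right


def asof_join(left, right, on="ts"):
    """Attach to each left row the latest right row with right[on] <= left[on].

    Both inputs must already be sorted by `on`.
    """
    right_keys = [r[on] for r in right]
    right_cols = [c for c in (right[0] if right else {}) if c != on]
    out = []
    for lrow in left:
        # Index of the last right key that is <= the left key, or -1 if none.
        i = bisect_right(right_keys, lrow[on]) - 1
        row = dict(lrow)
        for c in right_cols:
            row[c] = right[i][c] if i >= 0 else None
        out.append(row)
    return out


# Hypothetical sample data: timestamps in seconds.
ticks = [
    {"ts": 5, "price": 1.0},
    {"ts": 60, "price": 1.1},
    {"ts": 61, "price": 1.2},
    {"ts": 130, "price": 1.3},
]
bars = [{"ts": 60, "close": 1.0}, {"ts": 120, "close": 2.0}]
joined = asof_join(ticks, bars)
```

The tick at `ts = 60` matches the bar stamped 60 (at-or-before is inclusive), and no tick ever sees a bar from its future, which is the look-ahead guarantee.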

Rolling Windows

Time-based rolling aggregates in one O(n) pass.

as_timeframe validates sort order in O(n) and records the time-index column. window <dur> sets the lookback; rolling functions (rolling_sum, rolling_mean, rolling_count, lag) use a two-pointer scan with no per-row heap allocation.

Duration literals: 1s, 30s, 1m, 5m, 1h, …

let tf = as_timeframe(ticks, "ts");

// Previous tick's price
tf[update { prev_price = lag(price, 1) }];

// Tick count in the last 60 seconds
tf[window 1m, update { ticks_1m = rolling_count() }];

// Multiple rolling aggregates in one pass
tf[window 5m, update {
    sum_5m  = rolling_sum(price),
    mean_5m = rolling_mean(price)
}];
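
The two-pointer idea can be sketched in Python as follows. This is an illustrative model, not the Ibex source: the names `is_sorted` and `rolling_sum` are hypothetical, timestamps are plain numbers in seconds, and the window is assumed to be half-open, covering (t - window, t].

```python
def is_sorted(ts):
    """O(n) check that timestamps are non-decreasing."""
    return all(ts[i - 1] <= ts[i] for i in range(1, len(ts)))


def rolling_sum(ts, values, window):
    """Sum of values whose timestamp lies in (ts[i] - window, ts[i]]."""
    if not is_sorted(ts):
        raise ValueError("timestamps must be sorted")
    out = [0.0] * len(ts)   # the only allocation: the result column
    lo = 0                  # left edge of the window
    acc = 0.0               # running sum of values[lo..hi]
    for hi in range(len(ts)):
        acc += values[hi]
        # Evict rows that have fallen out of the lookback window.
        while ts[lo] <= ts[hi] - window:
            acc -= values[lo]
            lo += 1
        out[hi] = acc
    return out


# Hypothetical sample data.
ts = [0, 30, 59, 60, 61, 200]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sums = rolling_sum(ts, values, window=60)
```

Each row enters the window once and leaves it at most once, so the total work is O(n) regardless of window length.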

Resample

Aggregate ticks into equal-width time buckets.

resample <dur> floors timestamps into fixed-width intervals and reduces each bucket to one output row. Combine with by for per-symbol bars.

The output TimeFrame carries the bucket start time as its time index, ready for downstream joins or further resampling.

let tf = as_timeframe(ticks, "ts");

// 1-minute OHLC bars
let bars = tf[resample 1m, select {
    open  = first(price),
    high  = max(price),
    low   = min(price),
    close = last(price)
}];

// Per-symbol 1-minute bars
tf[resample 1m, select {
    open  = first(price),
    close = last(price)
}, by symbol];

// Enrich ticks with the latest bar's close
tf asof join bars on ts;
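
Flooring into buckets can be modelled in Python like this. It is a sketch under assumptions that are not confirmed above: integer timestamps in seconds, buckets aligned to multiples of the width, input already sorted, and no output row for empty buckets. The name `resample_ohlc` is hypothetical.

```python
def resample_ohlc(ts, prices, width):
    """Floor each timestamp to a multiple of `width`; one OHLC row per bucket."""
    bars = []
    for t, p in zip(ts, prices):
        start = t - t % width   # bucket start time
        if bars and bars[-1]["ts"] == start:
            bar = bars[-1]      # still inside the current bucket
            bar["high"] = max(bar["high"], p)
            bar["low"] = min(bar["low"], p)
            bar["close"] = p
        else:
            # First tick of a new bucket seeds all four fields.
            bars.append({"ts": start, "open": p, "high": p, "low": p, "close": p})
    return bars


# Hypothetical sample data: five ticks resampled to 60-second bars.
ts = [0, 20, 59, 60, 125]
prices = [10.0, 12.0, 9.0, 11.0, 13.0]
bars = resample_ohlc(ts, prices, width=60)
```

Each output row is stamped with its bucket start (0, 60, 120), matching the description of the output time index above.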

Scalar & Codegen

Extract single values and compile to C++.

scalar pulls one typed value out of a single-cell result table. The extracted value is then available as a binding in subsequent expressions.

ibex_compile transpiles a .ibex file to a self-contained C++ source file. The helper script compiles and links it in one step.

// Pull a single value from an aggregate
let total = scalar(
    prices[select { total = sum(price) }],
    total
);

// Use it in subsequent expressions
prices[update { weight = price / total }];

# Compile and run in one step
scripts/ibex-run.sh examples/quant.ibex

Performance

Fast by design, not by accident.

Release build, clang++, WSL2. Polars and data.table run multi-threaded on all cores; Ibex is single-threaded throughout.

Aggregation — 4 M rows, 252 symbols

Query	Ibex	Polars	Pandas
mean by symbol	28.4 ms	40.1 ms	181 ms
OHLC by symbol	34.9 ms	48.0 ms	249 ms
count by sym×day	12.6 ms	66.2 ms	328 ms
mean by sym×day	14.0 ms	76.8 ms	367 ms
OHLC by sym×day	20.6 ms	73.9 ms	400 ms
filter simple	19.5 ms	8.40 ms	30.7 ms

Geometric mean across all 10 benchmark queries (the table above shows six of them, and only the Polars and Pandas columns): 1.3× faster than Polars, 5× faster than Pandas, 2.1× faster than data.table, 3.5× faster than dplyr. Filter queries favour Polars, which uses parallel SIMD scans.
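
For reference, a geometric-mean speedup is the exponential of the mean log ratio. The Python sketch below applies that formula to the six Ibex and Polars timings listed in the aggregation table. Because the table lists only six queries, the value it computes is not expected to match the published ten-query figure. The function name `geometric_mean_speedup` is hypothetical.

```python
import math

# (Ibex ms, Polars ms) for the six aggregation rows listed in the table.
timings = [
    (28.4, 40.1),   # mean by symbol
    (34.9, 48.0),   # OHLC by symbol
    (12.6, 66.2),   # count by sym×day
    (14.0, 76.8),   # mean by sym×day
    (20.6, 73.9),   # OHLC by sym×day
    (19.5, 8.40),   # filter simple (Polars is faster here)
]


def geometric_mean_speedup(pairs):
    """exp(mean(log(other / ibex))): a ratio above 1 means Ibex is faster."""
    ratios = [other / ibex for ibex, other in pairs]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


speedup = geometric_mean_speedup(timings)
```

A geometric mean is used instead of an arithmetic one so that a 2× win and a 2× loss cancel out exactly.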

TimeFrame — 1 M rows, 1 s spacing

Operation	Ibex	Polars	data.table
as_timeframe (sort)	0.28 ms	4.78 ms	6.2 ms
lag(price, 1)	0.97 ms	4.84 ms	11.0 ms
rolling count 1m	1.12 ms	16.9 ms	12.2 ms
rolling sum 1m	1.43 ms	19.0 ms	10.9 ms
rolling mean 5m	1.65 ms	19.7 ms	9.6 ms
resample 1m OHLC	24.7 ms	14.6 ms	20.0 ms

Rolling operations use a two-pointer O(n) scan with a single result-column allocation. Resample delegates to the aggregation path and is slower than Polars’ parallel group_by_dynamic on this query.

Editor support

Syntax highlighting for VS Code.

A TextMate grammar covering keywords, types, built-in functions, duration literals, backtick-quoted column names, and comments.

Install — WSL

cp -r editors/vscode \
  /mnt/c/Users/<username>/.vscode/extensions/ibex-language-0.1.0

Install — macOS / native Linux

cp -r editors/vscode \
  ~/.vscode/extensions/ibex-language-0.1.0

Fully restart VS Code after copying. .ibex files are highlighted automatically.

Get started

Build, run, explore.

Requirements: Clang 17+, CMake 3.26+, Ninja.

1 — Clone and build

# Debug build (ASan + UBSan)
cmake -B build -G Ninja \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DCMAKE_BUILD_TYPE=Debug \
  -DIBEX_ENABLE_SANITIZERS=ON
cmake --build build

# Release build
cmake -B build-release -G Ninja \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build-release

2 — Run the test suite

ctest --test-dir build --output-on-failure

3 — Start the REPL

./build-release/tools/ibex --plugin-path ./build-release/libraries

Command	Description
`:load examples/quant.ibex`	Load and execute an .ibex script
`:tables`	List all bound table names
`:schema <table>`	Column names and types
`:head <table> [n]`	First n rows (default 10)
`:describe <table>`	Schema + first n rows

4 — Compile a script to C++

# Transpile, compile, and run in one step
scripts/ibex-run.sh examples/quant.ibex

# Or transpile only
scripts/ibex-build.sh examples/quant.ibex -o quant
