Quick terms: a DataFrame is a table; a TimeFrame is a table with a designated time column.
Reference
Ibex function reference
This page lists built-in Ibex functions and standard I/O library functions.
For most operations, prefer the Ibex syntax shown below (for example
df1 join df2 on key) over direct helper function calls.
Ibex syntax
How you usually write these
a join b on key | Inner join (equivalent to inner_join(a, b, key)). |
a left join b on key | Left join (equivalent to left_join(a, b, key)). |
a right join b on key | Right join (equivalent to right_join(a, b, key)). |
a outer join b on key | Outer join (equivalent to outer_join(a, b, key)). |
a semi join b on key | Semi join (equivalent to semi_join(a, b, key)). |
a anti join b on key | Anti join (equivalent to anti_join(a, b, key)). |
df[select { x = sum(col) }, by key] | Standard aggregate usage. |
tf[window 5m, update { x = rolling_mean(price) }] | Typical rolling-window usage. |
df[update { x = fill_forward(x) }] | Typical null-fill usage. |
df[order key] | Standard ordering syntax (instead of order(df, key)). |
import "csv"; read_csv("file.csv") | Typical file I/O usage. |
df[cov] / df[corr] | Covariance or correlation matrix of numeric columns. |
df[transpose] | Swap rows and columns (homogeneous column types required). |
matmul(a, b) | Matrix multiply two DataFrames. |
Core
Table and join functions
Core table functions
as_timeframe(df, "time_col") | Convert a table to a time-indexed table using time_col. |
scalar(df, col) | Extract one value from a one-row table. |
order(df, key1, ...) | Return a table ordered by keys. Usually written as df[order key1, ...]. |
print(value) | Print a human-readable value in scripts and REPL sessions. |
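For instance, a short sketch combining these functions (table and column names such as trades, ts, and price are invented for illustration):

```ibex
tf = as_timeframe(trades, "ts")
avg = scalar(trades[select { m = mean(price) }], m)
print(avg)
```

scalar extracts the single value from the one-row aggregate result, which is convenient when feeding a summary number into a later expression.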
Join functions
inner_join(left, right, key1, ...) | Keep rows with matching keys. Ibex syntax: df1 join df2 on key1. |
left_join(left, right, key1, ...) | Keep all left rows, attach right matches when present. Ibex syntax: df1 left join df2 on key1. |
right_join(left, right, key1, ...) | Keep all right rows, attach left matches when present. Ibex syntax: df1 right join df2 on key1. |
outer_join(left, right, key1, ...) | Keep all rows from both sides. Ibex syntax: df1 outer join df2 on key1. |
semi_join(left, right, key1, ...) | Keep left rows with a right-side match. Ibex syntax: df1 semi join df2 on key1. |
anti_join(left, right, key1, ...) | Keep left rows with no right-side match. Ibex syntax: df1 anti join df2 on key1. |
cross_join(left, right) | Cartesian product. Ibex syntax: df1 cross join df2. |
asof_join(left, right, key1, ...) | Nearest-in-time join for time-indexed tables. Ibex syntax: tf1 asof join tf2 on key1. |
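A hedged end-to-end sketch of the join syntax above (trades, quotes, refdata, and sym are invented names):

```ibex
enriched = trades left join refdata on sym
orphans = trades anti join refdata on sym
pairs = quotes asof join trades on sym
```

The anti join is a common audit step after a left join: it isolates exactly the rows that found no match. For the asof join, both sides would need to be time-indexed tables (TimeFrames).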
Analytics
Aggregates and time functions
Aggregate functions
Use these inside select { ... } (often with by), e.g. df[select { x = sum(col) }, by key].
sum(col) | Sum of non-null values. |
mean(col) | Arithmetic mean of non-null values. |
min(col) | Minimum non-null value. |
max(col) | Maximum non-null value. |
count() | Row count for the current group/window. |
first(col) | First non-null value in order. |
last(col) | Last non-null value in order. |
median(col) | Median of non-null values. |
std(col) | Sample standard deviation (n-1 denominator). |
ewma(col, alpha) | Exponentially weighted moving average. |
quantile(col, p) | p-quantile with linear interpolation. |
skew(col) | Sample skewness. |
kurtosis(col) | Sample excess kurtosis. |
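Aggregates compose inside a single select, one output column per expression. A sketch with invented names:

```ibex
trades[select { total = sum(qty), avg_px = mean(price), p95 = quantile(price, 0.95), n = count() }, by sym]
```

With by sym, each aggregate is computed per group, yielding one row per distinct sym.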
Window and cumulative functions
Rolling functions are usually used with window on a time-indexed table; cumulative functions also work in plain update/select.
rolling_count() | Row count in the active window. |
rolling_sum(col) | Rolling sum over the active window. |
rolling_mean(col) | Rolling mean over the active window. |
rolling_min(col) | Rolling minimum over the active window. |
rolling_max(col) | Rolling maximum over the active window. |
rolling_median(col) | Rolling median over the active window. |
rolling_std(col) | Rolling sample standard deviation. |
rolling_ewma(col, alpha) | Rolling EWMA within each window. |
rolling_quantile(col, p) | Rolling quantile within each window. |
rolling_skew(col) | Rolling sample skewness. |
rolling_kurtosis(col) | Rolling sample excess kurtosis. |
lag(col, n) | Value from n rows earlier (previous values). |
lead(col, n) | Value from n rows later (next values). |
cumsum(col) | Prefix sum, one output per row. |
cumprod(col) | Prefix product, one output per row. |
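A sketch contrasting the two families (ticks, price, and qty are invented names): rolling functions need a window clause on a time-indexed table, while lag/lead and the cumulative functions work in a plain update.

```ibex
ticks[window 5m, update { ma = rolling_mean(price), hi = rolling_max(price) }]
ticks[update { prev = lag(price, 1), run_qty = cumsum(qty) }]
```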
Transforms
Missing data, sequence, and randomness
Null and sequence functions
Most often used in update { ... }, e.g. df[update { x = fill_forward(x) }].
fill_null(col, value) | Replace nulls in col with a constant value. |
fill_forward(col) | Fill missing values using the last earlier non-missing value. |
fill_backward(col) | Fill missing values using the next later non-missing value. |
rep(x, times=1, each=1, length_out=-1) | Repeat or cycle a value/column to build a full output column. |
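A typical gap-filling sketch (quotes, bid, and size are invented names): carry prices forward through gaps, and replace missing sizes with an explicit zero.

```ibex
quotes[update { bid = fill_forward(bid), size = fill_null(size, 0) }]
```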
Vectorized RNG functions
Used in expressions like df[update { noise = rand_normal(0.0, 1.0) }].
rand_uniform(low, high) | Uniform float draws in [low, high). |
rand_normal(mean, stddev) | Normal-distributed float draws. |
rand_student_t(df) | Student-t float draws with df degrees of freedom (here df is the degrees-of-freedom parameter, not a DataFrame). |
rand_gamma(shape, scale) | Gamma-distributed float draws. |
rand_exponential(lambda) | Exponential float draws with rate lambda. |
rand_bernoulli(p) | Bernoulli draws as Int (0/1). |
rand_poisson(lambda) | Poisson draws as Int. |
rand_int(lo, hi) | Uniform integer draws in [lo, hi]. |
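Each RNG call produces one draw per row of the enclosing table, so they slot directly into update expressions. A simulation sketch with invented names:

```ibex
sim = prices[update { noise = rand_normal(0.0, 1.0), jump = rand_bernoulli(0.01) }]
```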
Scalar
Scalar helpers and casts
Scalar/date functions
Use in expressions, filters, and updates, e.g. df[filter year(ts) = 2025].
abs(x) | Absolute value. |
log(x) | Natural logarithm. |
sqrt(x) | Square root. |
year(t) | Year component from Date/Timestamp. |
month(t) | Month component from Date/Timestamp. |
day(t) | Day-of-month component from Date/Timestamp. |
hour(t) | Hour component from Timestamp. |
minute(t) | Minute component from Timestamp. |
second(t) | Second component from Timestamp. |
round(x, mode) | Round Float to Int with mode: nearest, bankers, floor, ceil, or trunc. |
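Scalar and date functions apply element-wise, so they can be mixed freely across filter and update clauses in one expression (prices, ts, and price are invented names; the comma-chained clause form follows the filtered-regression example later on this page):

```ibex
prices[filter year(ts) = 2025, update { logp = log(price), h = hour(ts) }]
```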
Cast constructors
These use regular call syntax in Ibex and appear in expressions and updates.
Int64(x) | Explicit cast to 64-bit integer. |
Int32(x) | Explicit cast to 32-bit integer. |
Int(x) | Alias for Int64(x). |
Float64(x) | Explicit cast to 64-bit float. |
Float32(x) | Explicit cast to 32-bit float. |
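A short sketch (num, den, and id are invented names): casting both operands to Float64 avoids integer division, and Int32 narrows a wider integer column.

```ibex
df[update { ratio = Float64(num) / Float64(den), small_id = Int32(id) }]
```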
Matrix
Matrix operations
These treat a DataFrame as a column-major matrix. Non-numeric columns are silently dropped for cov, corr, and matmul; Int64 columns are widened to Float64. transpose requires all data columns to share the same type.
df[cov] | Sample covariance matrix of all numeric columns. Returns an N×N Float64 table with a leading String label column naming each row. Denominator is n−1. |
df[corr] | Pearson correlation matrix of all numeric columns. Same schema as cov; diagonal values are exactly 1.0. |
df[transpose] | Swap rows and columns. All data columns must share the same type. An optional String or Categorical column is used to name output columns; if absent, columns are named r0, r1, … |
matmul(a, b) | Matrix multiply two DataFrames. Inner dimensions must match. Output column names come from b; row count equals nrow(a). |
Examples
prices[select { open, high, low, close }][cov] | 4×4 covariance matrix of OHLC columns. |
prices[select { open, close }][corr] | 2×2 correlation matrix; off-diagonal is the open/close correlation. |
prices[select { symbol, open, close }][transpose] | Transpose with symbol values as output column names. |
matmul(returns[select { open, close }], weights) | Multiply a returns matrix by a weights column — typical portfolio aggregation. |
Model
Model specification
The model clause fits a regression using R-style formula syntax. Numeric columns pass through to the design matrix; String columns are dummy-encoded (treatment coding). The result is a ModelResult — an opaque type accessed via the functions below.
df[model { y ~ x1 + x2 }] | OLS regression of y on x1 and x2 with intercept. |
df[model { y ~ . }] | Regress y on all other columns (dot notation). |
df[model { y ~ x - 1 }] | No intercept — suppress the constant term. |
df[model { y ~ x1 * x2 }] | Crossing: expands to x1 + x2 + x1:x2. |
df[model { y ~ x1 + x2, method = ridge, lambda = 0.1 }] | Ridge regression with L2 penalty lambda. |
df[model { y ~ x, method = wls, weights = w }] | Weighted least squares using column w as weights. |
Accessor functions
model_coef(m) | Coefficient table with columns term: String and estimate: Float64. |
model_summary(m) | Full summary: term, estimate, std_error, t_stat, p_value. |
model_fitted(m) | Fitted values (ŷ) as a single-column table. |
model_residuals(m) | Residuals (y − ŷ) as a single-column table. |
model_r_squared(m) | R² and adjusted R² as a single-row table. |
Examples
prices[model { close ~ open + volume }] | Simple OLS of closing price on open and volume. |
prices[filter volume > 1000000, model { close ~ open + high + low }] | Filtered regression — only fit on high-volume rows. |
prices[model { close ~ open * volume, method = ridge, lambda = 0.5 }] | Ridge with main effects and interaction term. |
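The accessor functions above take the ModelResult produced by a model clause. A sketch of the typical fit-then-inspect flow, reusing the prices columns from the examples:

```ibex
m = prices[model { close ~ open + volume }]
print(model_summary(m))
resid = model_residuals(m)
```

model_summary gives the full coefficient table with standard errors and p-values; the residuals come back as a single-column table ready for further analysis.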
I/O Libraries
File and stream functions shipped with Ibex
For a focused CSV / Parquet / SQLite / Kafka guide with examples and usage notes, see the I/O page.
CSV and JSON
Typical workflow: import "csv" or import "json", then call these directly.
read_csv(path) | Load a CSV file into a DataFrame. |
write_csv(df, path) | Write a DataFrame as CSV and return row count. |
read_json(path) | Load JSON (array, object, or JSON-lines) into a DataFrame. |
write_json(df, path) | Write a DataFrame to JSON and return row count. |
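A round-trip sketch following the import-then-call workflow described above (file names are invented; the filter clause follows the syntax shown elsewhere on this page):

```ibex
import "csv"
df = read_csv("trades.csv")
n = write_csv(df[filter qty > 0], "clean_trades.csv")
```

write_csv returns the row count, so n here is the number of rows that survived the filter.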
Parquet and stream I/O
Use Parquet for batch files; UDP, WebSocket, and Kafka functions are commonly used in Stream { ... } pipelines.
read_parquet(path) | Load Apache Parquet into a DataFrame. |
write_parquet(df, path) | Write a DataFrame to Parquet and return row count. |
kafka_recv(brokers, topic, group, schema[, options]) | Poll one JSON Kafka message, decode it with an explicit schema, and return a one-row DataFrame or StreamTimeout. |
kafka_send(df, brokers, topic[, options]) | Serialize each DataFrame row to one JSON Kafka message and return sent-row count. |
udp_recv(port) | Read rows from UDP into a DataFrame batch. |
udp_send(df, host, port) | Send a DataFrame batch via UDP and return sent-row count. |
ws_listen(port) | Start a WebSocket listener. |
ws_recv(port) | Receive WebSocket messages as DataFrame batches. |
ws_send(df, port) | Broadcast a DataFrame batch to connected WebSocket clients. |
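A hedged batch-plus-stream sketch using only the signatures listed above (the file name, host, and port are invented; error handling and StreamTimeout checks are omitted):

```ibex
df = read_parquet("ticks.parquet")
udp_send(df, "127.0.0.1", 9000)
batch = udp_recv(9000)
```

In practice the receive side would run in a separate Stream { ... } pipeline rather than the same script as the sender.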