Ibex function reference

This page lists built-in Ibex functions and standard I/O library functions. For most operations, prefer the Ibex syntax shown below (for example df1 join df2 on key) over direct helper function calls.

Quick terms: DataFrame means table, and TimeFrame means a table with a designated time column.

How you usually write these

a join b on key: Inner join (equivalent to inner_join(a, b, key)).
a left join b on key: Left join (equivalent to left_join(a, b, key)).
a right join b on key: Right join (equivalent to right_join(a, b, key)).
a outer join b on key: Outer join (equivalent to outer_join(a, b, key)).
a semi join b on key: Semi join (equivalent to semi_join(a, b, key)).
a anti join b on key: Anti join (equivalent to anti_join(a, b, key)).
df[select { x = sum(col) }, by key]: Standard aggregate usage.
tf[window 5m, update { x = rolling_mean(price) }]: Typical rolling-window usage.
df[update { x = fill_forward(x) }]: Typical null-fill usage.
df[order key]: Standard ordering syntax (instead of order(df, key)).
import "csv"; read_csv("file.csv"): Typical file I/O usage.
df[cov] / df[corr]: Covariance or correlation matrix of numeric columns.
df[transpose]: Swap rows and columns (homogeneous column types required).
matmul(a, b): Matrix multiply two DataFrames.

Table and join functions

Core table functions

as_timeframe(df, "time_col"): Convert a table to a time-indexed table using time_col.
scalar(df, col): Extract one value from a one-row table.
order(df, key1, ...): Return a table ordered by keys. Usually written as df[order key1, ...].
print(value): Print a human-readable value in scripts and REPL sessions.

Join functions

inner_join(left, right, key1, ...): Keep rows with matching keys. Ibex syntax: df1 join df2 on key1.
left_join(left, right, key1, ...): Keep all left rows, attach right matches when present. Ibex syntax: df1 left join df2 on key1.
right_join(left, right, key1, ...): Keep all right rows, attach left matches when present. Ibex syntax: df1 right join df2 on key1.
outer_join(left, right, key1, ...): Keep all rows from both sides. Ibex syntax: df1 outer join df2 on key1.
semi_join(left, right, key1, ...): Keep left rows with a right-side match. Ibex syntax: df1 semi join df2 on key1.
anti_join(left, right, key1, ...): Keep left rows with no right-side match. Ibex syntax: df1 anti join df2 on key1.
cross_join(left, right): Cartesian product. Ibex syntax: df1 cross join df2.
asof_join(left, right, key1, ...): Nearest-in-time join for time-indexed tables. Ibex syntax: tf1 asof join tf2 on key1.
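As a quick illustration, two of these joins back to back (the trades and sectors table names here are hypothetical):

```
trades left join sectors on symbol
trades anti join sectors on symbol
```

The first keeps every trade and attaches sector columns where symbol matches; the second keeps only the trades whose symbol has no entry in sectors.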

Aggregates and time functions

Aggregate functions

Use these inside select { ... } (often with by), e.g. df[select { x = sum(col) }, by key].

sum(col): Sum of non-null values.
mean(col): Arithmetic mean of non-null values.
min(col): Minimum non-null value.
max(col): Maximum non-null value.
count(): Row count for the current group/window.
first(col): First non-null value in order.
last(col): Last non-null value in order.
median(col): Median of non-null values.
std(col): Sample standard deviation (n-1 denominator).
ewma(col, alpha): Exponentially weighted moving average.
quantile(col, p): p-quantile with linear interpolation.
skew(col): Sample skewness.
kurtosis(col): Sample excess kurtosis.
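Several aggregates can be combined in one grouped summary. A sketch, using a hypothetical trades table with qty, price, and symbol columns:

```
trades[select { total = sum(qty), avg_px = mean(price), n = count() }, by symbol]
```

This yields one row per symbol carrying the group's total quantity, mean price, and row count.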

Window and cumulative functions

Rolling functions are usually used with window on a time-indexed table; cumulative functions also work in plain update/select.

rolling_count(): Row count in the active window.
rolling_sum(col): Rolling sum over the active window.
rolling_mean(col): Rolling mean over the active window.
rolling_min(col): Rolling minimum over the active window.
rolling_max(col): Rolling maximum over the active window.
rolling_median(col): Rolling median over the active window.
rolling_std(col): Rolling sample standard deviation.
rolling_ewma(col, alpha): Rolling EWMA within each window.
rolling_quantile(col, p): Rolling quantile within each window.
rolling_skew(col): Rolling sample skewness.
rolling_kurtosis(col): Rolling sample excess kurtosis.
lag(col, n): Shift backward by n rows (previous values).
lead(col, n): Shift forward by n rows (next values).
cumsum(col): Prefix sum, one output per row.
cumprod(col): Prefix product, one output per row.
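A typical pipeline using both kinds of function, sketched with hypothetical price and qty columns on a time-indexed table tf:

```
tf[window 5m, update { avg = rolling_mean(price), hi = rolling_max(price) }]
tf[update { prev = lag(price, 1), run_total = cumsum(qty) }]
```

The first line computes 5-minute rolling statistics inside a window clause; the second uses the row-shift and cumulative functions in a plain update, no window required.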

Missing data, sequence, and randomness

Null and sequence functions

Most often used in update { ... }, e.g. df[update { x = fill_forward(x) }].

fill_null(col, value): Replace nulls in col with a constant value.
fill_forward(col): Fill missing values using the last earlier non-missing value.
fill_backward(col): Fill missing values using the next later non-missing value.
rep(x, times=1, each=1, length_out=-1): Repeat or cycle a value/column to build a full output column.
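For example, cleaning two columns in a single update (column names here are hypothetical):

```
df[update { price = fill_forward(price), qty = fill_null(qty, 0) }]
```

Gaps in price are carried forward from the last observed value, while missing quantities are replaced with a constant zero.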

Vectorized RNG functions

Used in expressions like df[update { noise = rand_normal(0.0, 1.0) }].

rand_uniform(low, high): Uniform float draws in [low, high).
rand_normal(mean, stddev): Normal-distributed float draws.
rand_student_t(df): Student-t float draws with df degrees of freedom.
rand_gamma(shape, scale): Gamma-distributed float draws.
rand_exponential(lambda): Exponential float draws with rate lambda.
rand_bernoulli(p): Bernoulli draws as Int (0/1).
rand_poisson(lambda): Poisson draws as Int.
rand_int(lo, hi): Uniform integer draws in [lo, hi].
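For example, attaching several simulated columns in one update (column names hypothetical):

```
df[update { noise = rand_normal(0.0, 1.0), coin = rand_bernoulli(0.5), arrivals = rand_poisson(3.0) }]
```

Each function draws one value per row, so the new columns line up with the existing ones.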

Scalar helpers and casts

Scalar/date functions

Use in expressions, filters, and updates, e.g. df[filter year(ts) = 2025].

abs(x): Absolute value.
log(x): Natural logarithm.
sqrt(x): Square root.
year(t): Year component from Date/Timestamp.
month(t): Month component from Date/Timestamp.
day(t): Day-of-month component from Date/Timestamp.
hour(t): Hour component from Timestamp.
minute(t): Minute component from Timestamp.
second(t): Second component from Timestamp.
round(x, mode): Round Float to Int with mode: nearest, bankers, floor, ceil, or trunc.
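Combining a date filter with a scalar transform, using a hypothetical trades table with ts and price columns:

```
trades[filter year(ts) = 2025, update { px = round(price, nearest) }]
```

The filter keeps rows whose timestamp falls in 2025, then round converts each float price to the nearest integer.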

Cast constructors

These are regular call syntax in Ibex, used in expressions and updates.

Int64(x): Explicit cast to 64-bit integer.
Int32(x): Explicit cast to 32-bit integer.
Int(x): Alias for Int64(x).
Float64(x): Explicit cast to 64-bit float.
Float32(x): Explicit cast to 32-bit float.
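One common use is forcing float arithmetic on integer columns (column names hypothetical):

```
df[update { ratio = Float64(a) / Float64(b) }]
```

Dividing two integer columns directly might truncate, depending on Ibex's division semantics; the explicit Float64 casts make the intent unambiguous.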

Matrix operations

These treat a DataFrame as a column-major matrix. Non-numeric columns are silently dropped for cov, corr, and matmul; Int64 columns are widened to Float64. transpose requires all data columns to share the same type.

df[cov]: Sample covariance matrix of all numeric columns. Returns an N×N Float64 table with a leading String label column. Denominator is n−1.
df[corr]: Pearson correlation matrix of all numeric columns. Same schema as cov; diagonal values are exactly 1.0.
df[transpose]: Swap rows and columns. All data columns must share the same type. An optional String or Categorical column is used to name output columns; if absent, columns are named r0, r1, …
matmul(a, b): Matrix multiply two DataFrames. Inner dimensions must match. Output column names come from b; row count equals nrow(a).

Examples

prices[select { open, high, low, close }][cov]: 4×4 covariance matrix of OHLC columns.
prices[select { open, close }][corr]: 2×2 correlation matrix; off-diagonal is the open/close correlation.
prices[select { symbol, open, close }][transpose]: Transpose with symbol values as output column names.
matmul(returns[select { open, close }], weights): Multiply a returns matrix by a weights column — typical portfolio aggregation.

Model specification

The model clause fits a regression using R-style formula syntax. Numeric columns pass through to the design matrix; String columns are dummy-encoded (treatment coding). The result is a ModelResult — an opaque type accessed via the functions below.

df[model { y ~ x1 + x2 }]: OLS regression of y on x1 and x2 with intercept.
df[model { y ~ . }]: Regress y on all other columns (dot notation).
df[model { y ~ x - 1 }]: No intercept — suppress the constant term.
df[model { y ~ x1 * x2 }]: Crossing: expands to x1 + x2 + x1:x2.
df[model { y ~ x1 + x2, method = ridge, lambda = 0.1 }]: Ridge regression with L2 penalty lambda.
df[model { y ~ x, method = wls, weights = w }]: Weighted least squares using column w as weights.

Accessor functions

model_coef(m): Coefficient table with columns term: String and estimate: Float64.
model_summary(m): Full summary: term, estimate, std_error, t_stat, p_value.
model_fitted(m): Fitted values (ŷ) as a single-column table.
model_residuals(m): Residuals (y − ŷ) as a single-column table.
model_r_squared(m): R² and adjusted R² as a single-row table.

Examples

prices[model { close ~ open + volume }]: Simple OLS of closing price on open and volume.
prices[filter volume > 1000000, model { close ~ open + high + low }]: Filtered regression — only fit on high-volume rows.
prices[model { close ~ open * volume, method = ridge, lambda = 0.5 }]: Ridge with main effects and interaction term.

File and stream functions shipped with Ibex

For a focused CSV / Parquet / SQLite / Kafka guide with examples and usage notes, see the I/O page.

CSV and JSON

Typical workflow: import "csv" or import "json", then call these directly.

read_csv(path): Load a CSV file into a DataFrame.
write_csv(df, path): Write a DataFrame as CSV and return row count.
read_json(path): Load JSON (array, object, or JSON-lines) into a DataFrame.
write_json(df, path): Write a DataFrame to JSON and return row count.
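A minimal CSV-to-JSON round trip, assuming ordinary assignment syntax; the file names are hypothetical:

```
import "csv"; import "json"
df = read_csv("prices.csv")
write_json(df, "prices.json")
```

write_json returns the number of rows written, which can be checked or printed for a quick sanity test.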

Parquet and stream I/O

Use parquet for batch files; UDP, WebSocket, and Kafka functions are commonly used in Stream { ... } pipelines.

read_parquet(path): Load Apache Parquet into a DataFrame.
write_parquet(df, path): Write a DataFrame to Parquet and return row count.
kafka_recv(brokers, topic, group, schema[, options]): Poll one JSON Kafka message, decode it with an explicit schema, and return a one-row DataFrame or StreamTimeout.
kafka_send(df, brokers, topic[, options]): Serialize each DataFrame row to one JSON Kafka message and return sent-row count.
udp_recv(port): Read rows from UDP into a DataFrame batch.
udp_send(df, host, port): Send a DataFrame batch via UDP and return sent-row count.
ws_listen(port): Start a WebSocket listener.
ws_recv(port): Receive WebSocket messages as DataFrame batches.
ws_send(df, port): Broadcast a DataFrame batch to connected WebSocket clients.
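A simple batch conversion sketch, assuming the Parquet functions live in a "parquet" module and ordinary assignment syntax (the module name and file names here are assumptions):

```
import "csv"; import "parquet"
df = read_csv("ticks.csv")
write_parquet(df, "ticks.parquet")
```

This converts a CSV batch file to Parquet in one pass; write_parquet returns the row count, which should match the number of rows read.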