Metrics

Metric nodes represent an aggregation of a measure defined as a single expression in a query that selects from a single source, transform, or dimension node.

AttributeDescriptionType
nameUnique name used by other nodes to select from this nodestring
display_nameA human readable name for the nodestring
descriptionA human readable description of the nodestring
modepublished or draft (see Node Mode)string
queryA SQL query that selects a single expression from a single nodestring

Creating Metric Nodes

curl -X POST http://localhost:8000/nodes/metric/ \
-H 'Content-Type: application/json' \
-d '{
    "name": "default.num_repair_orders",
    "description": "Number of repair orders",
    "mode": "published",
    "query": "SELECT count(repair_order_id) as num_repair_orders FROM default.repair_orders"
}'
from datajunction import DJBuilder, NodeMode
dj = DJBuilder(DJ_URL)

metric = dj.create_metric(
    name="default.num_repair_orders",
    description="Number of repair orders",
    query="SELECT count(repair_order_id) FROM repair_orders",
    mode=NodeMode.PUBLISHED,  # for draft nodes, use `mode=NodeMode.DRAFT`
)
print(metric.name)
dj.metrics.create(
    {
        name: "default.num_repair_orders",
        description: "Number of repair orders",
        mode: "published",
        query: `
            SELECT
            count(repair_order_id) as num_repair_orders
            FROM default.repair_orders
        `
    }
).then(data => console.log(data))

Supported Aggregation Functions

DJ supports a wide range of SQL aggregation functions for defining metrics. These functions are categorized by their aggregability - whether they can be pre-aggregated for efficient cube materialization.

Fully Aggregatable Functions

These functions can be pre-aggregated at any grain level, enabling efficient cube materializations.

SUM

Calculates the sum of values.

SELECT SUM(total_repair_cost) FROM default.repair_orders

COUNT

Counts the number of rows.

SELECT COUNT(repair_order_id) FROM default.repair_orders

COUNT_IF

Counts rows that match a condition.

SELECT COUNT_IF(status = 'completed') FROM default.repair_orders

MIN

Returns the minimum value.

SELECT MIN(order_date) FROM default.repair_orders

MAX

Returns the maximum value.

SELECT MAX(total_repair_cost) FROM default.repair_orders

AVG

Calculates the average. Internally decomposed into SUM and COUNT for pre-aggregation.

SELECT AVG(total_repair_cost) FROM default.repair_orders

ANY_VALUE

Returns any value from the group.

SELECT ANY_VALUE(category) FROM default.repair_orders

APPROX_COUNT_DISTINCT

Approximate distinct count using HyperLogLog sketches. Useful for large datasets where exact distinct counts are expensive.

SELECT APPROX_COUNT_DISTINCT(customer_id) FROM default.repair_orders

Statistical Functions

These functions compute statistical measures and are fully decomposable for pre-aggregation.

VAR_POP / VAR_SAMP / VARIANCE

Population variance, sample variance (with Bessel’s correction), and VARIANCE (alias for VAR_SAMP).

SELECT VAR_POP(total_repair_cost) FROM default.repair_orders
SELECT VAR_SAMP(total_repair_cost) FROM default.repair_orders
SELECT VARIANCE(total_repair_cost) FROM default.repair_orders

STDDEV_POP / STDDEV_SAMP / STDDEV

Population standard deviation, sample standard deviation, and STDDEV (alias for STDDEV_SAMP).

SELECT STDDEV_POP(total_repair_cost) FROM default.repair_orders
SELECT STDDEV_SAMP(total_repair_cost) FROM default.repair_orders
SELECT STDDEV(total_repair_cost) FROM default.repair_orders

COVAR_POP / COVAR_SAMP

Population and sample covariance between two variables.

SELECT COVAR_POP(labor_cost, parts_cost) FROM default.repair_orders
SELECT COVAR_SAMP(labor_cost, parts_cost) FROM default.repair_orders

CORR

Pearson correlation coefficient between two variables.

SELECT CORR(labor_cost, parts_cost) FROM default.repair_orders

Non-Aggregatable Functions

These functions require access to the full dataset and cannot be pre-aggregated in cubes.

MAX_BY / MIN_BY

Returns a value from the row with the maximum/minimum value of another column.

SELECT MAX_BY(technician_name, repair_count) FROM default.repair_orders
SELECT MIN_BY(technician_name, repair_count) FROM default.repair_orders

DISTINCT Aggregations

Adding DISTINCT to any aggregation function makes it limited aggregatable - it can only be pre-aggregated at or below the grain of the distinct columns.

SELECT COUNT(DISTINCT customer_id) FROM default.repair_orders
SELECT SUM(DISTINCT order_amount) FROM default.repair_orders

Derived Metrics

Derived metrics reference other metrics, allowing you to build complex calculations from simpler components. They automatically inherit the components from their base metrics, enabling proper decomposition for cube materialization.

Simple Derived Metrics

The simplest derived metrics combine existing metrics with arithmetic operations.

-- Average order value from two base metrics
SELECT default.total_revenue / default.total_orders AS average_order_value

Cross-Fact Ratio Metrics

These metrics have numerator and denominator from different fact tables. This is useful for metrics like conversion rates or efficiency ratios.

Example: Order Conversion Rate

-- Metric 1: Count of completed orders (from orders fact)
SELECT COUNT(order_id) FROM default.completed_orders

-- Metric 2: Count of page visits (from visits fact)
SELECT COUNT(visit_id) FROM default.page_visits

-- Derived metric: Conversion rate
SELECT default.completed_order_count / default.page_visit_count AS conversion_rate

Period-Over-Period Metrics

Period-over-period metrics compare values across time periods, such as week-over-week or year-over-year growth. Since DJ metrics cannot contain WHERE clauses, use conditional aggregation or window functions.

Example: Week-Over-Week Revenue Growth (using conditional aggregation)

Create metrics that filter time periods within the SELECT clause:

-- Current week revenue
SELECT SUM(CASE WHEN order_date >= DATE_SUB(CURRENT_DATE, 7) THEN revenue ELSE 0 END)
FROM default.sales

-- Previous week revenue
SELECT SUM(CASE WHEN order_date >= DATE_SUB(CURRENT_DATE, 14)
                AND order_date < DATE_SUB(CURRENT_DATE, 7) THEN revenue ELSE 0 END)
FROM default.sales

Then create the growth metric:

-- Week-over-week growth rate
SELECT (default.current_week_revenue - default.previous_week_revenue)
       / NULLIF(default.previous_week_revenue, 0) AS wow_revenue_growth

Example: Year-Over-Year Using Window Functions

For YoY comparisons with a date dimension, use LAG:

SELECT (revenue - LAG(revenue, 1) OVER (ORDER BY year))
       / NULLIF(LAG(revenue, 1) OVER (ORDER BY year), 0) AS yoy_revenue_growth
FROM default.yearly_revenue

Trailing N-Day Metrics

Trailing metrics aggregate data over a rolling window, such as trailing 7-day or 30-day averages. These are useful for smoothing out daily fluctuations.

Example: Trailing 7-Day Average Revenue

-- Daily revenue metric
SELECT SUM(revenue) FROM default.sales

-- Trailing 7-day average (as a derived metric with window logic)
SELECT AVG(default.daily_revenue) OVER (
  ORDER BY date_dim
  ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS trailing_7d_avg_revenue

Example: Trailing 30-Day Total Orders

SELECT SUM(default.daily_order_count) OVER (
  ORDER BY date_dim
  ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
) AS trailing_30d_orders

Combining Derived Metric Patterns

You can combine these patterns to create sophisticated metrics:

-- Trailing 7-day conversion rate (combines trailing window + cross-fact ratio)
SELECT default.trailing_7d_orders / default.trailing_7d_visits AS trailing_7d_conversion_rate

-- YoY growth of category share (combines period-over-period + share of total)
SELECT (default.current_year_category_share - default.previous_year_category_share)
AS category_share_yoy_change

Share of Total Metrics (Limited Support)

Share of total metrics calculate a proportion where the numerator is grouped by a dimension while the denominator represents the ungrouped total. This pattern requires special handling because DJ computes all metrics at the same grain.

Option 1: Conditional Aggregation with Fixed Filter

Create a metric for a specific category’s share:

-- Electronics share of total revenue (hardcoded category)
SELECT SUM(CASE WHEN category = 'Electronics' THEN revenue ELSE 0 END) / SUM(revenue)
FROM default.sales

This approach requires creating a separate metric for each category you want to track.

Option 2: Application Layer

Create separate metrics and compute the ratio at the application layer:

-- Metric 1: Revenue (will be grouped by category when queried)
SELECT SUM(revenue) FROM default.sales

-- Metric 2: Total revenue (query separately without category dimension)
SELECT SUM(revenue) FROM default.sales

Then divide in your application code after querying each metric at its appropriate grain.

Option 2: Pre-computed Totals

If you have a transform node that pre-computes totals, you can reference it:

-- Assuming default.sales_with_totals has both line-level and total columns
SELECT SUM(revenue) / MAX(total_revenue) FROM default.sales_with_totals

Conditional Aggregations

You can use CASE WHEN expressions inside aggregation functions to create metrics that only aggregate values meeting certain conditions.

Example: Revenue from Completed Orders Only

SELECT SUM(CASE WHEN status = 'completed' THEN revenue ELSE 0 END)
FROM default.orders

Example: Completion Rate

SELECT SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) / COUNT(*)
FROM default.orders

Example: Average for High-Value Orders

SELECT AVG(CASE WHEN order_value > 100 THEN order_value ELSE NULL END)
FROM default.orders

Metric Metadata

Beyond the SQL query, metrics support additional metadata to improve discoverability and usability.

Required Dimensions

Some metrics only make sense when grouped by specific dimensions. You can specify required dimensions that must be included when querying the metric.

{
  "name": "default.market_share",
  "query": "SELECT SUM(revenue) / SUM(total_market_revenue) FROM default.sales",
  "required_dimensions": ["default.product.category", "default.date.quarter"]
}

When a metric has required dimensions, DJ will enforce that these dimensions are included in any query using this metric.

Metric Direction

Indicate whether higher or lower values are “better” for the metric. This helps consumers understand how to interpret changes.

DirectionDescriptionExample Metrics
higher_is_betterIncreasing values are positiveRevenue, Conversion Rate, NPS
lower_is_betterDecreasing values are positiveChurn Rate, Error Rate, Latency
neutralDirection doesn’t indicate good/badCount of Users, Average Order Size
{
  "name": "default.customer_churn_rate",
  "query": "SELECT ...",
  "direction": "lower_is_better"
}

Metric Units

Specify the unit of measure for the metric to help consumers interpret values correctly.

UnitAbbreviationDescription
percentage%Values from 0 to 100
proportionValues from 0 to 1
dollar$Currency (USD)
secondsTime duration
minutemTime duration
hourhTime duration
daydTime duration
weekwTime duration
unitlessNo specific unit
{
  "name": "default.avg_response_time",
  "query": "SELECT AVG(response_time_ms) / 1000 FROM default.requests",
  "unit": "second"
}