Bar Types and Aggregation Methods for Algorithmic Trading
Every candlestick chart you've ever seen on Binance, TradingView, or any exchange UI is built the same way: aggregate trades within a fixed time window (1 minute, 5 minutes, 1 hour) and produce an OHLCV bar. This is so ubiquitous that most traders never question it. But for algorithmic trading, choosing a bar type and choosing an aggregation method are two independent decisions, and most systems conflate them.
This article separates the two axes of candle construction: what kind of bar you build (17 types) and how you aggregate them into higher timeframes (3 methods). The combination gives 51 possible configurations, each with different properties for backtesting, live trading, and signal generation.
For an introduction to how raw trades become standard candles, see Trading Candles Demystified.
TL;DR
- Candle construction has two independent axes: bar type and aggregation method
- 17 base bar types: time, tick, volume, dollar, Renko, range, volatility, Heikin-Ashi, Kagi, Line Break, P&F, tick imbalance (TIB), volume imbalance (VIB), run, CUSUM, entropy, delta
- 3 aggregation methods: calendar-aligned, rolling window, adaptive rolling
- 17 × 3 = 51 possible combinations, each with different properties
- Most systems use only one combination: calendar-aligned time bars. The other 50 are untapped.
- Practical recommendation: use multiple combinations in layers — rolling time bars for signals, calendar time bars for market structure, information-driven bars for microstructure
Two Axes of Candle Construction
The traditional view puts all bar types on a flat list: time bars, tick bars, volume bars, Renko, etc. This is misleading. There are actually two orthogonal choices:
Axis 1 — Base Bar Type (17 types): How do you decide when a new bar closes? After a fixed time interval? After N trades? After a price movement? When information content changes? This determines what "one bar" means.
Axis 2 — Aggregation Method (3 methods): How do you compose base bars into higher-timeframe candles? Align to calendar boundaries (00:00, 01:00, ...)? Use a rolling window of the last N bars? Adapt the window size to volatility?
These two axes are independent. You can have:
- Calendar-aligned tick bars — aggregate tick bars that closed between 14:00 and 14:59 into a single hourly candle
- Rolling volume bars — take the last 24 volume bars regardless of when they closed
- Adaptive delta bars — use a volatility-driven window over delta bars
The standard "1-hour candle" is just one point in this 17×3 matrix: time bars + calendar alignment. Every other combination is an alternative worth considering.
1. Time Bars (Standard)
Uneven information density: rigid time boundaries treat 200-trade quiet hours the same as 50,000-trade announcement hours.
The default. A new bar forms after a fixed time interval: 1 minute, 5 minutes, 1 hour. Every exchange provides these natively.
Properties:
- During the Asian session (00:00–08:00 UTC), a 1-hour candle might contain 200 trades. During a Binance listing announcement, the same window could contain 50,000 trades. Time bars treat both as equivalent. Detecting such activity spikes is critical for bot protection — see Anomaly Detection for Trading Bots.
- All market participants see the same candle boundaries — a Schelling point. This makes time bars essential for analyzing crowd behavior.
- Indicators computed on partial candles (after restart) produce garbage values.
from datetime import datetime, timezone

def time_until_valid_hourly_candle() -> int:
    """Seconds until the first complete hourly candle after a restart."""
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time.
    now = datetime.now(timezone.utc)
    # Seconds left until the next hour boundary (the current candle is partial).
    wait_seconds = (60 - now.minute) * 60 - now.second
    # Plus one full hour until the first complete candle closes.
    wait_seconds += 3600
    return wait_seconds
2–4. Activity-Based Bars
Tick, volume, and dollar bars: three ways to let market participation — not the clock — determine bar boundaries.
Instead of sampling at fixed time intervals, sample after a fixed amount of market activity. This produces bars with roughly equal "information content" regardless of time of day.
2. Tick Bars
A new bar forms after every N trades (ticks). During high activity, bars form rapidly. During quiet periods, a single bar might span hours.
from dataclasses import dataclass

@dataclass
class OHLCV:
    timestamp: int
    open: float
    high: float
    low: float
    close: float
    volume: float

class TickBarGenerator:
    """
    Generates a new bar every `threshold` trades.
    Each bar contains an equal number of market "opinions".
    """
    def __init__(self, threshold: int = 1000):
        self.threshold = threshold
        self.trades: list[tuple[float, float]] = []  # (price, qty)
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float) -> OHLCV | None:
        """Returns the completed bar if this trade closed one, else None."""
        self.trades.append((price, qty))
        if len(self.trades) >= self.threshold:
            return self._close_bar(timestamp)
        return None

    def _close_bar(self, timestamp: int) -> OHLCV:
        prices = [t[0] for t in self.trades]
        volumes = [t[1] for t in self.trades]
        bar = OHLCV(
            timestamp=timestamp,  # time of the trade that closed the bar
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        self.trades = []
        return bar
Pros: Adapts to market activity naturally. Returns from tick bars tend to be closer to normally distributed than time-bar returns, which matters because many statistical models assume roughly normal inputs.
Cons: Requires raw trade stream (not available from all data providers for historical data). Bar timing is unpredictable — you can't say "the next bar will close at X."
3. Volume Bars
A new bar forms after N contracts (or coins, in crypto) have traded. Similar to tick bars, but weighted by trade size — a single 100-BTC trade contributes 100x more than a 1-BTC trade.
class VolumeBarGenerator:
    """
    Generates a new bar every `threshold` units of volume.
    Normalizes for trade size: one large order ≠ one small order.
    """
    def __init__(self, threshold: float = 100.0):
        self.threshold = threshold
        self.accumulated_volume = 0.0
        self.trades: list[tuple[int, float, float]] = []  # (ts, price, qty)
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float) -> OHLCV | None:
        """Returns the completed bar if this trade closed one, else None."""
        self.trades.append((timestamp, price, qty))
        self.accumulated_volume += qty
        if self.accumulated_volume >= self.threshold:
            return self._close_bar()
        return None

    def _close_bar(self) -> OHLCV:
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        self.accumulated_volume = 0.0
        self.trades = []
        return bar
4. Dollar Bars
A new bar forms after a fixed notional value (in USD/USDT) has been exchanged. The most robust of the activity-based bars because it normalizes for both trade count and price level.
Consider: if ETH goes from $1,000 to $4,000, selling $10,000 worth takes 2.5 ETH at $4,000 but 10 ETH at $1,000. Volume bars would treat these differently; dollar bars treat them the same.
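The price-level effect can be checked with a little arithmetic. The $10,000 order size and the two ETH prices below are illustrative assumptions, and `units_for_notional` is a hypothetical helper introduced only for this sketch.

```python
def units_for_notional(notional_usd: float, price: float) -> float:
    """Quantity of the asset that a fixed dollar amount corresponds to at `price`."""
    return notional_usd / price


order_usd = 10_000.0
qty_at_1000 = units_for_notional(order_usd, 1_000.0)  # 10.0 ETH
qty_at_4000 = units_for_notional(order_usd, 4_000.0)  # 2.5 ETH

# Progress toward a volume-bar threshold differs by 4x between the two regimes...
volume_ratio = qty_at_1000 / qty_at_4000

# ...while progress toward a dollar-bar threshold is identical.
dollars_at_1000 = qty_at_1000 * 1_000.0
dollars_at_4000 = qty_at_4000 * 4_000.0
```

A volume threshold calibrated at one price level therefore drifts out of tune as price moves, while a dollar threshold does not.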
class DollarBarGenerator:
    """
    Generates a new bar every `threshold` dollars (USDT) of notional volume.
    Most robust normalization: independent of price level.
    Lopez de Prado (2018) recommends dollar bars as the default
    for most quantitative applications.
    """
    def __init__(self, threshold: float = 1_000_000.0):
        self.threshold = threshold
        self.accumulated_dollars = 0.0
        self.trades: list[tuple[int, float, float]] = []  # (ts, price, qty)
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float) -> OHLCV | None:
        """Returns the completed bar if this trade closed one, else None."""
        self.trades.append((timestamp, price, qty))
        self.accumulated_dollars += price * qty  # notional value of this trade
        if self.accumulated_dollars >= self.threshold:
            return self._close_bar()
        return None

    def _close_bar(self) -> OHLCV:
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),  # base-asset volume, not dollars
        )
        self.bars.append(bar)
        self.accumulated_dollars = 0.0
        self.trades = []
        return bar
Choosing the Threshold
The threshold for activity-based bars should produce roughly the same number of bars per day as the time bars you're replacing. For BTCUSDT on Binance:
| Bar Type | Typical Threshold | ~Bars/Day | Equivalent TF |
|---|---|---|---|
| Tick | 1,000 trades | ~1,400 | ~1m |
| Tick | 50,000 trades | ~28 | ~1h |
| Volume | 100 BTC | ~600 | ~2-3m |
| Volume | 2,400 BTC | ~25 | ~1h |
| Dollar | $1M | ~1,400 | ~1m |
| Dollar | $50M | ~28 | ~1h |
These numbers are approximate and shift dramatically with market regime. During a rally or crash, activity-based bars will produce 5-10x more bars than usual — which is exactly the point.
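One way to arrive at such thresholds is to divide the activity of a representative period by the number of bars you want from it. The sketch below assumes you have that period's trades as `(price, qty)` tuples; `calibrate_thresholds` is a hypothetical helper, and using a single day as the sample is an assumption that will not hold across regimes.

```python
def calibrate_thresholds(
    trades: list[tuple[float, float]],
    target_bars: int,
) -> dict[str, float]:
    """
    Derive tick, volume, and dollar thresholds from a sample of trades
    so that the sample would have produced about `target_bars` bars.

    trades: (price, qty) tuples for a representative period, e.g. one day.
    """
    if target_bars <= 0 or not trades:
        raise ValueError("need at least one trade and a positive target_bars")

    n_trades = len(trades)
    total_volume = sum(qty for _, qty in trades)
    total_dollars = sum(price * qty for price, qty in trades)

    return {
        "tick": n_trades / target_bars,
        "volume": total_volume / target_bars,
        "dollar": total_dollars / target_bars,
    }


# Toy sample: 1,000 identical trades of 1 unit at a price of 100.
sample = [(100.0, 1.0)] * 1000
thresholds = calibrate_thresholds(sample, target_bars=10)
```

In practice it seems reasonable to recalibrate on a rolling sample (for example the last several days) rather than once, since the paragraph above notes how strongly activity shifts with market regime.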
5–7. Price-Based Bars
Renko bricks, range bars, and volatility bars: sampling only when price moves enough to matter.
Price-based bars ignore both time and activity. A new bar forms only when price moves by a specified amount. This naturally filters sideways noise and highlights trends.
5. Renko Bars
A new Renko "brick" forms when the closing price moves by at least N units from the previous brick's close. Bricks are always the same size, creating a clean visual representation of trend direction.
class RenkoBarGenerator:
    """
    Generates Renko bricks based on price movement.
    Key property: during sideways movement, no new bricks form.
    During strong trends, bricks form rapidly.
    """
    def __init__(self, brick_size: float = 10.0):
        self.brick_size = brick_size
        self.bricks: list[dict] = []
        self.last_close: float | None = None

    def on_price(self, timestamp: int, price: float, volume: float = 0.0):
        if self.last_close is None:
            self.last_close = price
            return []
        new_bricks = []
        diff = price - self.last_close
        # Number of whole bricks covered by the move since the last brick close
        num_bricks = int(abs(diff) / self.brick_size)
        if num_bricks == 0:
            return []
        direction = 1 if diff > 0 else -1
        for _ in range(num_bricks):
            brick_open = self.last_close
            brick_close = self.last_close + direction * self.brick_size
            brick = {
                'timestamp': timestamp,
                'open': brick_open,
                'high': max(brick_open, brick_close),
                'low': min(brick_open, brick_close),
                'close': brick_close,
                'volume': volume / num_bricks,  # split evenly across bricks
                'direction': direction,
            }
            new_bricks.append(brick)
            self.last_close = brick_close
        self.bricks.extend(new_bricks)
        return new_bricks
Dynamic Renko uses ATR (Average True Range) instead of a fixed brick size, adapting to volatility automatically.
6. Range Bars
Each bar has a fixed high-low range. When the range is exceeded, the bar closes and a new one begins. Unlike Renko, range bars include wicks and can show intra-bar volatility.
class RangeBarGenerator:
    """
    Generates bars with a fixed high-low range.
    Difference from Renko: range bars show the full OHLC within
    the range, not just brick direction. More information-rich.
    """
    def __init__(self, range_size: float = 20.0):
        self.range_size = range_size
        self.current_high: float | None = None
        self.current_low: float | None = None
        self.current_open: float | None = None
        self.current_volume: float = 0.0
        self.current_start_ts: int = 0
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float):
        # First trade ever: initialize the in-progress bar
        if self.current_open is None:
            self.current_open = price
            self.current_high = price
            self.current_low = price
            self.current_start_ts = timestamp
        self.current_high = max(self.current_high, price)
        self.current_low = min(self.current_low, price)
        self.current_volume += qty
        if self.current_high - self.current_low >= self.range_size:
            bar = OHLCV(
                timestamp=timestamp,
                open=self.current_open,
                high=self.current_high,
                low=self.current_low,
                close=price,
                volume=self.current_volume,
            )
            self.bars.append(bar)
            # The next bar opens at the price that closed this one
            self.current_open = price
            self.current_high = price
            self.current_low = price
            self.current_volume = 0.0
            self.current_start_ts = timestamp
            return bar
        return None
Key difference between Renko and Range bars: Renko tracks only closing prices and shows direction; range bars track the full price range and show structure within the bar. Range bars are generally more useful for algorithmic trading because they preserve high-low information needed for stop-loss and take-profit simulation.
7. Volatility Bars
A new bar forms when the intra-bar volatility reaches a dynamic threshold — for example, a multiple of recent ATR. Unlike range bars (fixed threshold), volatility bars adapt to market conditions.
class VolatilityBarGenerator:
    """
    Generates bars when intra-bar volatility reaches a threshold.
    Similar to range bars, but the threshold adapts to market conditions
    using a rolling ATR measure. In calm markets, bars need less
    absolute movement to close; in volatile markets, more.
    NOTE: this sketch averages the ranges of its own completed bars,
    and each of those is by construction close to the threshold that
    closed it. With atr_multiplier = 1.0 the threshold therefore adapts
    only weakly. For real adaptation, feed an ATR measured on an
    independent series (e.g. 1m time bars) into `self.threshold`.
    """
    def __init__(
        self,
        atr_period: int = 14,
        atr_multiplier: float = 1.0,
        initial_threshold: float = 20.0,
    ):
        self.atr_period = atr_period
        self.atr_multiplier = atr_multiplier
        self.threshold = initial_threshold
        self.recent_ranges: list[float] = []
        self.current_open: float | None = None
        self.current_high: float | None = None
        self.current_low: float | None = None
        self.current_volume: float = 0.0
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float):
        if self.current_open is None:
            self.current_open = price
            self.current_high = price
            self.current_low = price
        self.current_high = max(self.current_high, price)
        self.current_low = min(self.current_low, price)
        self.current_volume += qty
        intra_bar_range = self.current_high - self.current_low
        if intra_bar_range >= self.threshold:
            bar = OHLCV(
                timestamp=timestamp,
                open=self.current_open,
                high=self.current_high,
                low=self.current_low,
                close=price,
                volume=self.current_volume,
            )
            self.bars.append(bar)
            # Keep only the last `atr_period` bar ranges
            self.recent_ranges.append(intra_bar_range)
            if len(self.recent_ranges) > self.atr_period:
                self.recent_ranges = self.recent_ranges[-self.atr_period:]
            # Update the threshold once enough history exists
            if len(self.recent_ranges) >= self.atr_period:
                avg_range = sum(self.recent_ranges) / len(self.recent_ranges)
                self.threshold = avg_range * self.atr_multiplier
            self.current_open = price
            self.current_high = price
            self.current_low = price
            self.current_volume = 0.0
            return bar
        return None
8. Heikin-Ashi (Smoothed Transformation)
Heikin-Ashi: averaging transforms noisy candles into smooth trend signals — but at the cost of exact price information.
Heikin-Ashi (Japanese for "average bar") is not a bar type — it's a transformation that can be applied on top of any base bar type. It smooths candles by averaging current and previous bar values:
- HA Close = (Open + High + Low + Close) / 4
- HA Open = (Previous HA Open + Previous HA Close) / 2
- HA High = max(High, HA Open, HA Close)
- HA Low = min(Low, HA Open, HA Close)
Trends appear as sequences of same-colored candles with no lower wicks (uptrend) or no upper wicks (downtrend).
class HeikinAshiTransformer:
    """
    Transforms standard OHLCV candles into Heikin-Ashi candles.
    Can be applied on top of ANY bar type: time bars, volume bars,
    rolling bars, etc. It's a transformation, not a sampling method.
    WARNING: HA prices are synthetic: they don't represent real
    traded prices. Never use HA close for order placement or
    PnL calculation. Use HA only for signal generation, then
    execute at real prices.
    """
    def __init__(self):
        self.prev_ha_open: float | None = None
        self.prev_ha_close: float | None = None

    def transform(self, candle: OHLCV) -> OHLCV:
        ha_close = (candle.open + candle.high + candle.low + candle.close) / 4
        if self.prev_ha_open is None:
            # First candle: seed HA open from the real open and close
            ha_open = (candle.open + candle.close) / 2
        else:
            ha_open = (self.prev_ha_open + self.prev_ha_close) / 2
        ha_high = max(candle.high, ha_open, ha_close)
        ha_low = min(candle.low, ha_open, ha_close)
        self.prev_ha_open = ha_open
        self.prev_ha_close = ha_close
        return OHLCV(
            timestamp=candle.timestamp,
            open=ha_open,
            high=ha_high,
            low=ha_low,
            close=ha_close,
            volume=candle.volume,
        )

    def transform_series(self, candles: list[OHLCV]) -> list[OHLCV]:
        """Transform an entire series. Resets state first."""
        self.prev_ha_open = None
        self.prev_ha_close = None
        return [self.transform(c) for c in candles]

def ha_trend_signal(ha_candles: list[OHLCV], lookback: int = 3) -> int:
    """
    Simple HA trend signal.
    Returns:
        +1: bullish (N consecutive green HA candles with no lower wick)
        -1: bearish (N consecutive red HA candles with no upper wick)
         0: no clear trend
    """
    if len(ha_candles) < lookback:
        return 0
    recent = ha_candles[-lookback:]
    all_bullish = all(
        c.close > c.open and abs(c.low - min(c.open, c.close)) < 1e-10
        for c in recent
    )
    all_bearish = all(
        c.close < c.open and abs(c.high - max(c.open, c.close)) < 1e-10
        for c in recent
    )
    if all_bullish:
        return 1
    elif all_bearish:
        return -1
    return 0
Critical caveat for backtesting: Heikin-Ashi prices are synthetic. If your backtest uses HA close as the entry price, results will be wrong. Always use HA for signal generation only and execute at real OHLC prices.
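The separation between signal and execution can be sketched as follows. The entry rule (a red-to-green HA colour flip) and the fill convention (real open of the next bar) are assumptions chosen for illustration, and the helpers work on plain `(open, high, low, close)` tuples to keep the example self-contained.

```python
Candle = tuple[float, float, float, float]  # (open, high, low, close)


def heikin_ashi(candles: list[Candle]) -> list[Candle]:
    """Heikin-Ashi transform of real candles, using the formulas above."""
    out: list[Candle] = []
    for o, h, l, c in candles:
        ha_close = (o + h + l + c) / 4
        if not out:
            ha_open = (o + c) / 2  # seed for the first candle
        else:
            ha_open = (out[-1][0] + out[-1][3]) / 2
        out.append((ha_open, max(h, ha_open, ha_close), min(l, ha_open, ha_close), ha_close))
    return out


def long_entries(candles: list[Candle]) -> list[tuple[int, float]]:
    """
    Signal on HA data, fill on real data.

    A signal fires at bar i when the HA candle flips from red to green.
    The fill is recorded at the REAL open of bar i + 1, never at an HA price.
    Returns (fill_bar_index, fill_price) pairs.
    """
    ha = heikin_ashi(candles)
    entries = []
    for i in range(1, len(candles) - 1):
        was_red = ha[i - 1][3] < ha[i - 1][0]
        is_green = ha[i][3] > ha[i][0]
        if was_red and is_green:
            entries.append((i + 1, candles[i + 1][0]))
    return entries


real = [
    (100.0, 100.0, 95.0, 96.0),
    (96.0, 97.0, 90.0, 91.0),
    (91.0, 104.0, 91.0, 103.0),
    (103.5, 106.0, 103.0, 105.0),
]
fills = long_entries(real)
ha_close_at_signal = heikin_ashi(real)[2][3]
```

In this toy series the HA close on the signal bar is 97.25, while the real market opens the next bar at 103.5. A backtest that filled at the HA close would book more than six points of profit that was never available.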
When HA is useful: Trend-following strategies that need clean "stay in" signals. Apply HA over any base bar type — time bars, volume bars, dollar bars — to filter false crossovers.
When HA is harmful: Any strategy that needs precise price levels — support/resistance, order book analysis, PIQ (Position In Queue). The averaging destroys exact price information.
9–11. Japanese Reversal Charts
Kagi, Line Break, and Point & Figure: time-free charting methods that focus purely on price structure.
Kagi and Line Break are traditional Japanese charting methods (alongside Renko); Point & Figure is a Western method that works on the same principle. All three discard time entirely and focus on price structure.
9. Kagi Charts
Kagi charts consist of vertical lines that change direction when price reverses by a specified amount. Lines change thickness when price breaks a previous high (thick = "yang" = demand) or previous low (thin = "yin" = supply).
class KagiChartGenerator:
    """
    Generates Kagi chart lines based on price reversals.
    Unlike Renko (fixed brick size), Kagi tracks the actual magnitude
    of each move and changes line thickness at breakout points.
    Useful for identifying support/resistance breaks and
    supply/demand shifts without time noise.
    """
    def __init__(self, reversal_amount: float = 10.0):
        self.reversal_amount = reversal_amount
        self.lines: list[dict] = []
        self.current_direction: int = 0  # 1=up, -1=down
        self.current_price: float | None = None
        self.extreme_price: float | None = None
        self.prev_high: float | None = None
        self.prev_low: float | None = None
        self.line_type: str = 'yang'  # 'yang' (thick) or 'yin' (thin)

    def on_price(self, timestamp: int, price: float):
        if self.current_price is None:
            self.current_price = price
            self.extreme_price = price
            return None
        # No direction yet: wait for the first move of reversal_amount
        if self.current_direction == 0:
            if price - self.current_price >= self.reversal_amount:
                self.current_direction = 1
                self.extreme_price = price
            elif self.current_price - price >= self.reversal_amount:
                self.current_direction = -1
                self.extreme_price = price
            return None
        if self.current_direction == 1:
            if price > self.extreme_price:
                # Up-line extends; thicken on a break of the previous high
                self.extreme_price = price
                if self.prev_high is not None and price > self.prev_high:
                    self.line_type = 'yang'
            elif self.extreme_price - price >= self.reversal_amount:
                # Reversal: finalize the up-line and start a down-line
                line = {
                    'timestamp': timestamp,
                    'start': self.current_price,
                    'end': self.extreme_price,
                    'direction': 'up',
                    'type': self.line_type,
                }
                self.lines.append(line)
                self.prev_high = self.extreme_price
                self.current_price = self.extreme_price
                self.extreme_price = price
                self.current_direction = -1
                if self.prev_low is not None and price < self.prev_low:
                    self.line_type = 'yin'
                return line
        else:
            if price < self.extreme_price:
                # Down-line extends; thin on a break of the previous low
                self.extreme_price = price
                if self.prev_low is not None and price < self.prev_low:
                    self.line_type = 'yin'
            elif price - self.extreme_price >= self.reversal_amount:
                # Reversal: finalize the down-line and start an up-line
                line = {
                    'timestamp': timestamp,
                    'start': self.current_price,
                    'end': self.extreme_price,
                    'direction': 'down',
                    'type': self.line_type,
                }
                self.lines.append(line)
                self.prev_low = self.extreme_price
                self.current_price = self.extreme_price
                self.extreme_price = price
                self.current_direction = 1
                if self.prev_high is not None and price > self.prev_high:
                    self.line_type = 'yang'
                return line
        return None
10. Line Break Charts
Line break charts draw a new line (box) only when the closing price exceeds the high or low of the previous N lines (typically 3). No new line is drawn if the price stays within the range.
class LineBreakGenerator:
    """
    Generates Line Break bars (Three Line Break by default).
    A new bar is drawn only when the close exceeds the high or low
    of the last N bars. Filters out minor noise by requiring price
    to break through a multi-bar range.
    The 'N' parameter (line_count) controls sensitivity:
    - N=2: more sensitive, more bars, more noise
    - N=3: standard (Three Line Break)
    - N=4+: less sensitive, fewer bars, stronger signals
    """
    def __init__(self, line_count: int = 3):
        self.line_count = line_count
        self.lines: list[dict] = []

    def on_close(self, timestamp: int, close: float) -> dict | None:
        # The first close seeds the chart with a zero-height line
        if not self.lines:
            self.lines.append({
                'timestamp': timestamp,
                'open': close,
                'close': close,
                'high': close,
                'low': close,
                'direction': 0,
            })
            return None
        lookback = self.lines[-self.line_count:] if len(self.lines) >= self.line_count else self.lines
        highest = max(ln['high'] for ln in lookback)
        lowest = min(ln['low'] for ln in lookback)
        last = self.lines[-1]
        new_line = None
        if close > highest:
            new_line = {
                'timestamp': timestamp,
                'open': last['close'],
                'close': close,
                'high': close,
                'low': last['close'],
                'direction': 1,
            }
        elif close < lowest:
            new_line = {
                'timestamp': timestamp,
                'open': last['close'],
                'close': close,
                'high': last['close'],
                'low': close,
                'direction': -1,
            }
        if new_line:
            self.lines.append(new_line)
            return new_line
        return None
11. Point & Figure Charts
Point & Figure (P&F) charts use columns of X's (rising prices) and O's (falling prices). Column switches require a reversal of typically 3 box sizes. One of the oldest methods of filtering noise and identifying support/resistance.
class PointAndFigureGenerator:
    """
    Generates Point & Figure chart data.
    X column: price rising by box_size increments.
    O column: price falling by box_size increments.
    Column switch: requires reversal_boxes * box_size movement
    in the opposite direction.
    Classic setting: box_size based on ATR, reversal_boxes = 3.
    """
    def __init__(self, box_size: float = 10.0, reversal_boxes: int = 3):
        self.box_size = box_size
        self.reversal_boxes = reversal_boxes
        self.reversal_amount = box_size * reversal_boxes
        self.columns: list[dict] = []
        self.current_direction: int = 0
        self.current_top: float | None = None
        self.current_bottom: float | None = None

    def on_price(self, timestamp: int, price: float):
        if self.current_top is None:
            box_price = self._round_to_box(price)
            self.current_top = box_price
            self.current_bottom = box_price
            self.current_direction = 1
            return None
        events = []
        if self.current_direction == 1:
            # Extend the X column one box at a time
            while price >= self.current_top + self.box_size:
                self.current_top += self.box_size
                events.append(('X', self.current_top, timestamp))
            # Reversal: close the X column and start an O column
            if price <= self.current_top - self.reversal_amount:
                col = {
                    'type': 'X',
                    'top': self.current_top,
                    'bottom': self.current_bottom,
                    'boxes': int((self.current_top - self.current_bottom) / self.box_size) + 1,
                    'timestamp': timestamp,
                }
                self.columns.append(col)
                self.current_direction = -1
                self.current_top = self.current_top - self.box_size
                self.current_bottom = self._round_to_box(price)
                events.append(('new_column', 'O', timestamp))
        else:
            # Extend the O column one box at a time
            while price <= self.current_bottom - self.box_size:
                self.current_bottom -= self.box_size
                events.append(('O', self.current_bottom, timestamp))
            # Reversal: close the O column and start an X column
            if price >= self.current_bottom + self.reversal_amount:
                col = {
                    'type': 'O',
                    'top': self.current_top,
                    'bottom': self.current_bottom,
                    'boxes': int((self.current_top - self.current_bottom) / self.box_size) + 1,
                    'timestamp': timestamp,
                }
                self.columns.append(col)
                self.current_direction = 1
                self.current_bottom = self.current_bottom + self.box_size
                self.current_top = self._round_to_box(price)
                events.append(('new_column', 'X', timestamp))
        return events if events else None

    def _round_to_box(self, price: float) -> float:
        return round(price / self.box_size) * self.box_size
Kagi, Line Break, and P&F in algorithmic trading: Primarily used for long-term trend detection and support/resistance identification. As a filter layer — "don't take long signals when the Kagi chart is in yin mode" — they add value by aligning trades with the macro structure.
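The filter-layer idea reduces to a small veto function. In this sketch the signal encoding (+1 long, -1 short, 0 flat) and the helper name `filter_signal` are assumptions; the text above only states the long-side rule, so the symmetric veto on shorts during yang mode is an extension added here for illustration.

```python
def filter_signal(raw_signal: int, kagi_line_type: str) -> int:
    """
    Veto signals that fight the Kagi regime.

    raw_signal: +1 long, -1 short, 0 flat (assumed encoding).
    kagi_line_type: 'yang' (thick, demand) or 'yin' (thin, supply).
    """
    if raw_signal > 0 and kagi_line_type == "yin":
        return 0  # don't take longs while supply dominates
    if raw_signal < 0 and kagi_line_type == "yang":
        return 0  # assumed symmetric rule: no shorts while demand dominates
    return raw_signal


long_in_yin = filter_signal(+1, "yin")
long_in_yang = filter_signal(+1, "yang")
short_in_yin = filter_signal(-1, "yin")
```

With the `KagiChartGenerator` above, the second argument would come from its `line_type` attribute, read at the moment the faster signal fires.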
12–14. Information-Driven Bars
Imbalance bars, run bars, CUSUM filters, and entropy bars: sampling when the market tells us something has changed.
The most sophisticated approach, from Marcos Lopez de Prado's Advances in Financial Machine Learning (2018). The core insight: sample when new information arrives to the market, not at fixed intervals.
12. Tick Imbalance Bars (TIB)
If the market is in equilibrium, buyer-initiated and seller-initiated trades should roughly balance. When the imbalance exceeds our expectation, something has changed. Sample a bar at that moment.
Each trade is classified as buyer-initiated (+1) or seller-initiated (-1) using the tick rule. We track cumulative imbalance θ and sample when |θ| exceeds a dynamic threshold.
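Written out, the tick rule and the sampling condition look roughly like this. The notation follows Lopez de Prado (2018): $p_t$ is the price of trade $t$, $b_t$ its sign, and $T$ the number of trades in the current bar. The mapping to the code's variable names in the note below is an interpretation, not something the book specifies.

```latex
b_t =
\begin{cases}
  b_{t-1} & \text{if } p_t = p_{t-1} \\[4pt]
  \operatorname{sign}(p_t - p_{t-1}) & \text{otherwise}
\end{cases}
\qquad
\theta_T = \sum_{t=1}^{T} b_t

T^{*} = \arg\min_{T} \Bigl\{\, |\theta_T| \;\ge\; \mathrm{E}_0[T] \cdot \bigl| 2\,\mathrm{P}[b_t = 1] - 1 \bigr| \,\Bigr\}
```

Here $\mathrm{E}_0[T]$ is the expected bar length and $2\,\mathrm{P}[b_t = 1] - 1$ is the expected per-tick imbalance. In the generator that follows these appear to correspond to `expected_ticks` and `expected_imbalance`, both estimated as exponentially weighted averages over previous bars.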
class TickImbalanceBarGenerator:
    """
    Generates bars when the cumulative tick imbalance exceeds
    expected levels, i.e. when "new information" arrives.
    Based on Lopez de Prado (2018), Chapter 2.
    """
    def __init__(
        self,
        expected_ticks_init: int = 1000,
        ewma_window: int = 100,
        min_ticks: int = 100,
        max_ticks: int = 50000,
    ):
        self.expected_ticks_init = expected_ticks_init
        self.ewma_window = ewma_window
        self.min_ticks = min_ticks
        self.max_ticks = max_ticks
        self.theta = 0.0
        self.prev_price: float | None = None
        self.prev_sign = 1
        self.trades: list[tuple[int, float, float]] = []  # (ts, price, qty)
        self.bar_lengths: list[int] = []
        self.imbalances: list[float] = []
        self.expected_ticks = float(expected_ticks_init)
        self.expected_imbalance = 0.0
        self.bars: list[OHLCV] = []

    def _tick_sign(self, price: float) -> int:
        """Classify trade as buy (+1) or sell (-1) using tick rule."""
        if self.prev_price is None:
            self.prev_price = price
            return 1
        if price > self.prev_price:
            sign = 1
        elif price < self.prev_price:
            sign = -1
        else:
            sign = self.prev_sign  # unchanged price: carry the last sign
        self.prev_price = price
        self.prev_sign = sign
        return sign

    def on_trade(self, timestamp: int, price: float, qty: float):
        sign = self._tick_sign(price)
        self.theta += sign
        self.trades.append((timestamp, price, qty))
        threshold = self.expected_ticks * abs(self.expected_imbalance)
        if threshold == 0:
            # No imbalance estimate yet: fall back to a fixed fraction
            threshold = self.expected_ticks_init * 0.5
        if abs(self.theta) >= threshold and len(self.trades) >= self.min_ticks:
            return self._close_bar()
        if len(self.trades) >= self.max_ticks:
            return self._close_bar()
        return None

    def _close_bar(self):
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        self.bar_lengths.append(len(self.trades))
        self.imbalances.append(self.theta / len(self.trades))
        if len(self.bar_lengths) >= 2:
            alpha = 2.0 / (self.ewma_window + 1)
            self.expected_ticks = (
                alpha * self.bar_lengths[-1]
                + (1 - alpha) * self.expected_ticks
            )
            # Clamp to avoid runaway bar lengths
            self.expected_ticks = max(
                self.min_ticks,
                min(self.max_ticks, self.expected_ticks)
            )
            self.expected_imbalance = (
                alpha * self.imbalances[-1]
                + (1 - alpha) * self.expected_imbalance
            )
        self.theta = 0.0
        self.trades = []
        return bar
13. Volume Imbalance Bars (VIB)
Extension of TIBs: instead of counting each trade as ±1, weight by signed volume. A 100-BTC buy contributes +100, a 1-BTC sell contributes -1. Captures large informed orders that might be split into many small trades.
class VolumeImbalanceBarGenerator:
    """
    Like TIBs, but uses signed volume instead of signed ticks.
    Captures the insight that a 100-BTC buy signal is 100x more
    informative than a 1-BTC buy signal.
    """
    def __init__(
        self,
        expected_ticks_init: int = 1000,
        ewma_window: int = 100,
    ):
        self.expected_ticks_init = expected_ticks_init
        self.ewma_window = ewma_window
        self.theta = 0.0
        self.prev_price: float | None = None
        self.prev_sign = 1
        self.trades: list[tuple[int, float, float]] = []  # (ts, price, qty)
        self.bar_lengths: list[int] = []
        self.volume_imbalances: list[float] = []
        self.expected_ticks = float(expected_ticks_init)
        self.expected_vol_imbalance = 0.0
        self.bars: list[OHLCV] = []

    def _tick_sign(self, price: float) -> int:
        if self.prev_price is None:
            self.prev_price = price
            return 1
        if price > self.prev_price:
            sign = 1
        elif price < self.prev_price:
            sign = -1
        else:
            sign = self.prev_sign
        self.prev_price = price
        self.prev_sign = sign
        return sign

    def on_trade(self, timestamp: int, price: float, qty: float):
        sign = self._tick_sign(price)
        self.theta += sign * qty  # signed volume instead of signed tick
        self.trades.append((timestamp, price, qty))
        threshold = self.expected_ticks * abs(self.expected_vol_imbalance)
        if threshold == 0:
            threshold = self.expected_ticks_init * 0.5
        if abs(self.theta) >= threshold and len(self.trades) >= 10:
            return self._close_bar()
        return None

    def _close_bar(self):
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        self.bar_lengths.append(len(self.trades))
        self.volume_imbalances.append(self.theta / len(self.trades))
        alpha = 2.0 / (self.ewma_window + 1)
        if len(self.bar_lengths) >= 2:
            self.expected_ticks = (
                alpha * self.bar_lengths[-1] + (1 - alpha) * self.expected_ticks
            )
        self.expected_vol_imbalance = (
            alpha * self.volume_imbalances[-1]
            + (1 - alpha) * self.expected_vol_imbalance
        )
        self.theta = 0.0
        self.trades = []
        return bar
The Explosion Problem
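A known issue with imbalance bars: the EWMA-based threshold can enter a positive feedback loop. A long bar raises the expected bar length, which raises the threshold, which makes the next bar longer still (and the reverse for short bars). The solution: clamp the expected bar length with min_ticks and max_ticks bounds.
# Fragment from _close_bar(); new_expected_ticks is the freshly updated EWMA
self.expected_ticks = max(
    self.min_ticks,       # Floor: never fewer than 100 ticks
    min(
        self.max_ticks,   # Ceiling: never more than 50000 ticks
        new_expected_ticks,
    ),
)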
14. Run Bars
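Run bars track the longest consecutive sequence of same-side trades (all buys or all sells) within the current bar. When a large informed trader splits an order into many small trades, that sequence becomes unusually long. Run bars detect this. Note that this consecutive-run variant is stricter than the definition in Lopez de Prado (2018), which counts the ticks on each side within the bar and allows sequence breaks.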
class TickRunBarGenerator:
    """
    Generates bars when the length of a directional run exceeds expectations.
    Based on Lopez de Prado (2018), Chapter 2.
    Difference from imbalance bars:
    - Imbalance bars track NET imbalance (buys minus sells)
    - Run bars track the MAXIMUM run length (consecutive buys OR sells)
    """
    def __init__(
        self,
        expected_ticks_init: int = 1000,
        ewma_window: int = 100,
        min_ticks: int = 100,
        max_ticks: int = 50000,
    ):
        self.expected_ticks_init = expected_ticks_init
        self.ewma_window = ewma_window
        self.min_ticks = min_ticks
        self.max_ticks = max_ticks
        self.prev_price: float | None = None
        self.prev_sign = 1
        self.trades: list[tuple[int, float, float]] = []  # (ts, price, qty)
        self.buy_run = 0
        self.sell_run = 0
        self.max_buy_run = 0
        self.max_sell_run = 0
        self.bar_lengths: list[int] = []
        self.max_runs: list[float] = []
        self.expected_ticks = float(expected_ticks_init)
        self.expected_max_run = 0.0
        self.bars: list[OHLCV] = []

    def _tick_sign(self, price: float) -> int:
        if self.prev_price is None:
            self.prev_price = price
            return 1
        if price > self.prev_price:
            sign = 1
        elif price < self.prev_price:
            sign = -1
        else:
            sign = self.prev_sign
        self.prev_price = price
        self.prev_sign = sign
        return sign

    def on_trade(self, timestamp: int, price: float, qty: float):
        sign = self._tick_sign(price)
        self.trades.append((timestamp, price, qty))
        # Extend the run on this side, reset the run on the other side
        if sign == 1:
            self.buy_run += 1
            self.sell_run = 0
        else:
            self.sell_run += 1
            self.buy_run = 0
        self.max_buy_run = max(self.max_buy_run, self.buy_run)
        self.max_sell_run = max(self.max_sell_run, self.sell_run)
        theta = max(self.max_buy_run, self.max_sell_run)
        if self.expected_max_run > 0:
            threshold = self.expected_ticks * self.expected_max_run
        else:
            threshold = self.expected_ticks_init * 0.3
        if theta >= threshold and len(self.trades) >= self.min_ticks:
            return self._close_bar()
        if len(self.trades) >= self.max_ticks:
            return self._close_bar()
        return None

    def _close_bar(self):
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        # Longest run as a fraction of the bar's length
        max_run = max(self.max_buy_run, self.max_sell_run) / len(self.trades)
        self.bar_lengths.append(len(self.trades))
        self.max_runs.append(max_run)
        alpha = 2.0 / (self.ewma_window + 1)
        if len(self.bar_lengths) >= 2:
            self.expected_ticks = alpha * self.bar_lengths[-1] + (1 - alpha) * self.expected_ticks
            self.expected_ticks = max(self.min_ticks, min(self.max_ticks, self.expected_ticks))
        self.expected_max_run = alpha * self.max_runs[-1] + (1 - alpha) * self.expected_max_run
        self.trades = []
        self.buy_run = 0
        self.sell_run = 0
        self.max_buy_run = 0
        self.max_sell_run = 0
        return bar
Run bars can be extended to volume runs and dollar runs.
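As a rough illustration of that extension, the sketch below tracks runs weighted by quantity (volume runs) or by price × quantity (dollar runs) instead of counting trades. The function name `max_weighted_runs` and its `dollar` flag are hypothetical, and the reset-on-sign-change logic simply mirrors the tick-count version above rather than any reference implementation.

```python
def max_weighted_runs(
    trades: list[tuple[float, float]],
    dollar: bool = False,
) -> tuple[float, float]:
    """Return (max_buy_run, max_sell_run) for a list of (price, qty) trades.

    Each trade contributes qty (volume runs) or price * qty (dollar runs).
    Signs come from the tick rule; the first trade is treated as a buy.
    """
    prev_price = None
    sign = 1
    buy_run = sell_run = 0.0
    max_buy = max_sell = 0.0
    for price, qty in trades:
        if prev_price is not None:
            if price > prev_price:
                sign = 1
            elif price < prev_price:
                sign = -1
            # unchanged price: keep the previous sign
        prev_price = price
        weight = price * qty if dollar else qty
        if sign == 1:
            buy_run += weight
            sell_run = 0.0
        else:
            sell_run += weight
            buy_run = 0.0
        max_buy = max(max_buy, buy_run)
        max_sell = max(max_sell, sell_run)
    return max_buy, max_sell


sample = [(100, 1), (101, 2), (102, 3), (101, 4), (101, 5), (102, 1)]
volume_runs = max_weighted_runs(sample)
```

In a bar generator, the larger of the two values would play the role of `theta`, and the EWMA of expected run size would be kept in the same units (volume or notional).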
15. CUSUM Filter Bars
The CUSUM (Cumulative Sum) filter determines when to sample by tracking cumulative returns. Unlike imbalance bars (which work on raw trades), CUSUM can be applied to existing 1m OHLCV data — no tick data required.
import math


class CUSUMFilterBarGenerator:
    """
    Symmetric CUSUM filter for event-based sampling.
    Based on Lopez de Prado (2018), Chapter 2.5.
    Key advantage over Bollinger Bands: CUSUM requires a FULL
    run of threshold magnitude before triggering. Bollinger Bands
    trigger repeatedly when price hovers near the band.
    Can be applied to 1m OHLCV data; no tick data required.
    """

    def __init__(self, threshold: float = 0.01):
        self.threshold = threshold
        self.s_pos = 0.0
        self.s_neg = 0.0
        self.prev_price: float | None = None
        self.buffer: list[OHLCV] = []
        self.bars: list[OHLCV] = []

    def on_candle_1m(self, candle: OHLCV) -> OHLCV | None:
        self.buffer.append(candle)
        if self.prev_price is None:
            self.prev_price = candle.close
            return None
        log_ret = math.log(candle.close / self.prev_price)
        self.prev_price = candle.close
        # Accumulate upward and downward drift separately, floored/capped at zero.
        self.s_pos = max(0.0, self.s_pos + log_ret)
        self.s_neg = min(0.0, self.s_neg + log_ret)
        triggered = False
        if self.s_pos > self.threshold:
            self.s_pos = 0.0
            triggered = True
        if self.s_neg < -self.threshold:
            self.s_neg = 0.0
            triggered = True
        # Note: a trigger on the first candle after a bar close resets the sums
        # but emits nothing, because the buffer holds only one candle.
        if triggered and len(self.buffer) >= 2:
            bars = self.buffer
            bar = OHLCV(
                timestamp=bars[-1].timestamp,
                open=bars[0].open,
                high=max(b.high for b in bars),
                low=min(b.low for b in bars),
                close=bars[-1].close,
                volume=sum(b.volume for b in bars),
            )
            self.bars.append(bar)
            self.buffer = []
            return bar
        return None
CUSUM + Triple Barrier Method: In Lopez de Prado's framework, CUSUM events are used as entry points for the Triple Barrier method — where each event triggers a trade with stop-loss, take-profit, and expiration barriers. For robust validation of such event-driven strategies, see Walk-Forward Optimization and Monte Carlo Bootstrap for Backtesting.
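To make the three barriers concrete, here is a minimal labeling sketch for a long entry at a CUSUM event. The function name `triple_barrier_label` and the fixed percentage barriers are assumptions for illustration; Lopez de Prado scales the horizontal barriers by a volatility estimate, which is omitted here.

```python
def triple_barrier_label(
    closes: list[float],
    entry_idx: int,
    take_profit: float = 0.02,
    stop_loss: float = 0.01,
    max_holding: int = 60,
) -> tuple[int, int]:
    """Label a long entry at closes[entry_idx].

    Returns (label, exit_idx):
      +1 if the take-profit barrier is touched first,
      -1 if the stop-loss barrier is touched first,
       0 if the time barrier (max_holding bars) expires first.
    """
    entry = closes[entry_idx]
    last_idx = min(entry_idx + max_holding, len(closes) - 1)
    for i in range(entry_idx + 1, last_idx + 1):
        ret = closes[i] / entry - 1.0
        if ret >= take_profit:
            return 1, i
        if ret <= -stop_loss:
            return -1, i
    return 0, last_idx


label, exit_idx = triple_barrier_label([100.0, 100.5, 101.0, 102.5, 99.0], entry_idx=0)
```

Because it only looks at closes, this version can miss intrabar touches; checking each bar's high and low against the barriers would be the stricter variant.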
16. Entropy Bars
The most theoretically elegant approach: sample when the information content (Shannon entropy) of the intra-bar price series exceeds a threshold.
import math


class EntropyBarGenerator:
    """
    Generates bars when the entropy of intra-bar returns exceeds
    a threshold.
    Based on Shannon's information theory: bars are sampled when
    "new information" arrives, measured as the entropy of the
    return distribution within the current bar.
    This is the most theoretically "pure" information-driven bar.
    """

    def __init__(
        self,
        entropy_threshold: float = 2.0,
        min_trades: int = 50,
        n_bins: int = 10,
    ):
        self.entropy_threshold = entropy_threshold
        self.min_trades = min_trades
        self.n_bins = n_bins
        self.trades: list[tuple[int, float, float]] = []
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float) -> OHLCV | None:
        self.trades.append((timestamp, price, qty))
        if len(self.trades) < self.min_trades:
            return None
        entropy = self._compute_entropy()
        if entropy >= self.entropy_threshold:
            return self._close_bar()
        return None

    def _compute_entropy(self) -> float:
        prices = [t[1] for t in self.trades]
        if len(prices) < 2:
            return 0.0
        returns = [
            math.log(prices[i] / prices[i - 1])
            for i in range(1, len(prices))
            if prices[i - 1] > 0
        ]
        if not returns:
            return 0.0
        min_r = min(returns)
        max_r = max(returns)
        if max_r == min_r:
            return 0.0
        # Histogram the returns into n_bins equal-width bins.
        bin_width = (max_r - min_r) / self.n_bins
        bins = [0] * self.n_bins
        for r in returns:
            idx = min(int((r - min_r) / bin_width), self.n_bins - 1)
            bins[idx] += 1
        total = sum(bins)
        entropy = 0.0
        for count in bins:
            if count > 0:
                p = count / total
                entropy -= p * math.log2(p)
        return entropy

    def _close_bar(self) -> OHLCV:
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        self.bars.append(bar)
        self.trades = []
        return bar
Practical note: Entropy bars are computationally expensive (the implementation above recomputes the full histogram on every trade once `min_trades` is reached) and primarily of research interest. For ML-based strategies, though, they can produce features with better statistical properties because each bar contains approximately equal "information."
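One detail worth checking when choosing `entropy_threshold`: a histogram with `n_bins` bins can never exceed log2(n_bins) bits, so with 10 bins the ceiling is about 3.32 bits and any threshold above that never fires. The sketch below uses a standalone helper, `binned_entropy` (a hypothetical name that repeats the binning logic of `_compute_entropy`), to show both extremes.

```python
import math


def binned_entropy(values: list[float], n_bins: int = 10) -> float:
    """Shannon entropy (in bits) of an equal-width histogram of values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


# Evenly spread values fill every bin equally: entropy hits the ceiling.
uniform = [float(i) for i in range(100)]
# Almost all mass in one bin: entropy is close to zero.
spiky = [0.0] * 99 + [1.0]

ceiling = math.log2(10)
uniform_bits = binned_entropy(uniform)
spiky_bits = binned_entropy(spiky)
```

The default `entropy_threshold=2.0` sits below the 10-bin ceiling, so it is reachable; changing `n_bins` without revisiting the threshold shifts how often bars close.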
17. Delta Bars (Order Flow)
Cumulative delta: measuring the net force of aggressive buyers vs sellers in real time.
Delta bars sample based on cumulative delta — the running difference between buy volume and sell volume. Unlike imbalance bars (which use tick signs ±1), delta bars use actual volume-weighted order flow.
class DeltaBarGenerator:
    """
    Generates bars based on cumulative order flow delta.
    Delta = Buy Volume - Sell Volume (classified by aggressor side).
    Requires trade-level data with side classification
    (available from Binance aggTrades, Bybit trades, etc.)
    """

    def __init__(self, threshold: float = 500.0):
        self.threshold = threshold
        self.cumulative_delta = 0.0
        self.trades: list[tuple[int, float, float, int]] = []
        self.bars: list[OHLCV] = []

    def on_trade(self, timestamp: int, price: float, qty: float, is_buyer_maker: bool) -> OHLCV | None:
        # If the buyer was the maker, the seller crossed the spread: aggressive sell.
        side = -1 if is_buyer_maker else 1
        signed_qty = side * qty
        self.cumulative_delta += signed_qty
        self.trades.append((timestamp, price, qty, side))
        if abs(self.cumulative_delta) >= self.threshold:
            return self._close_bar()
        return None

    def _close_bar(self) -> OHLCV:
        prices = [t[1] for t in self.trades]
        volumes = [t[2] for t in self.trades]
        bar = OHLCV(
            timestamp=self.trades[-1][0],
            open=prices[0],
            high=max(prices),
            low=min(prices),
            close=prices[-1],
            volume=sum(volumes),
        )
        # Extra order-flow fields attached dynamically to the bar object.
        bar.delta = self.cumulative_delta  # type: ignore
        bar.buy_volume = sum(t[2] for t in self.trades if t[3] == 1)  # type: ignore
        bar.sell_volume = sum(t[2] for t in self.trades if t[3] == -1)  # type: ignore
        self.bars.append(bar)
        self.cumulative_delta = 0.0
        self.trades = []
        return bar
Delta divergence: One of the most powerful signals — price rising while cumulative delta is negative (sellers are aggressive but price still goes up, indicating limit buy absorption). Directly relevant to the behavioral fingerprinting approach described in the Digital Fingerprint: Trader Identification article. For market makers using the Avellaneda-Stoikov model, delta bars provide a real-time view of inventory risk and aggressor pressure.
A circular buffer of base bars: new data enters, old data exits, and the aggregated candle is always valid.
Aggregation methods determine how base bars are composed into higher-timeframe (HTF) candles. They are independent of the bar type — you can apply any aggregation method to any base bar type.
Method A: Calendar-Aligned Aggregation
Aggregate all base bars that fall within a fixed calendar boundary. The "1-hour" candle covers all bars from 14:00:00 to 14:59:59.
Properties:
- All market participants see the same boundaries — essential for market structure analysis, support/resistance, PIQ triggers
- Cold start problem: partial candle after restart
- Natural for time bars (this is what exchanges provide natively)
- Also works for non-time bars: "all volume bars that closed between 14:00 and 15:00" = a calendar-aligned hourly candle from volume bars
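Calendar alignment for arbitrary base bars reduces to integer division of the timestamp by the interval length. The following is a minimal sketch under stated assumptions: `Bar` is a stand-in for the article's OHLCV type, timestamps are in seconds since the epoch, input bars are time-ordered, and the output candle is stamped with the bucket start. The last candle returned may still be forming.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    timestamp: int  # seconds since epoch (assumed)
    open: float
    high: float
    low: float
    close: float
    volume: float


def calendar_aggregate(bars: list[Bar], interval_s: int = 3600) -> list[Bar]:
    """Group time-ordered bars into buckets aligned to multiples of interval_s."""
    out: list[Bar] = []
    current_bucket = None
    for b in bars:
        bucket = b.timestamp // interval_s
        if bucket != current_bucket:
            # First bar of a new calendar bucket opens a new candle.
            out.append(Bar(bucket * interval_s, b.open, b.high, b.low, b.close, b.volume))
            current_bucket = bucket
        else:
            candle = out[-1]
            candle.high = max(candle.high, b.high)
            candle.low = min(candle.low, b.low)
            candle.close = b.close
            candle.volume += b.volume
    return out


# Two volume bars closing inside 14:00-15:00 UTC, one after 15:00.
volume_bars = [
    Bar(50400, 10, 12, 9, 11, 5),
    Bar(52000, 11, 15, 10, 14, 7),
    Bar(54000, 14, 16, 13, 15, 3),
]
hourly = calendar_aggregate(volume_bars)
```

Because the bucket is derived from the bar's own timestamp, the same function works whether the base bars are time, volume, tick, or delta bars.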
Method B: Rolling Window Aggregation
Aggregate the last N closed base bars, recomputed on every new bar. A "1-hour" rolling candle = the last 60 closed 1-minute time bars, updated every minute.
The atomic unit is the closed base bar. This design choice gives:
- No cold start. After N bars, the candle is valid. No partial-candle noise.
- Backtest parity. If live trading uses the same atomic unit as the backtest engine, signals are identical.
- Simple validation. One rule: if the buffer is not full, skip.
from collections import deque


class RollingCandleAggregator:
    """
    Produces rolling higher-timeframe candles from closed base bars.
    Works with ANY bar type: time bars, tick bars, volume bars,
    dollar bars, delta bars: anything that produces OHLCV output.
    Example: RollingCandleAggregator(window=60) with 1m time bars
    produces a "1h" candle updated every minute.
    Example: RollingCandleAggregator(window=24) with volume bars
    produces a candle spanning the last 24 volume bars.
    """

    def __init__(self, window: int):
        self.window = window
        # maxlen makes the deque a circular buffer: the oldest bar drops out.
        self.buffer: deque[OHLCV] = deque(maxlen=window)

    def push(self, bar: OHLCV) -> OHLCV | None:
        """
        Add a closed base bar. Returns aggregated candle
        only when buffer is full (= candle is valid).
        """
        self.buffer.append(bar)
        if len(self.buffer) < self.window:
            return None
        return self._aggregate()

    def _aggregate(self) -> OHLCV:
        bars = list(self.buffer)
        return OHLCV(
            timestamp=bars[-1].timestamp,
            open=bars[0].open,
            high=max(b.high for b in bars),
            low=min(b.low for b in bars),
            close=bars[-1].close,
            volume=sum(b.volume for b in bars),
        )

    @property
    def is_valid(self) -> bool:
        return len(self.buffer) == self.window
Phase shift trade-off: Rolling candles close at :37 if you started at :37, not at :00 like everyone else's. This matters for strategies that depend on crowd-visible levels. The solution: use both — calendar for market structure, rolling for signals.
Method C: Adaptive Rolling Aggregation
Like rolling, but the window size adapts to current volatility. Calm markets → wider window (more smoothing). Volatile markets → narrower window (faster reaction).
from collections import deque


class AdaptiveRollingAggregator:
    """
    Rolling window where the window size adapts to volatility.
    Works with any base bar type. Uses the average high-low range
    of recent bars (a simplified ATR) as the volatility measure.
    Low volatility → wider window (more smoothing, fewer signals)
    High volatility → narrower window (faster reaction)
    """

    def __init__(
        self,
        base_window: int = 60,
        min_window: int = 15,
        max_window: int = 240,
        atr_period: int = 14,
        atr_base: float | None = None,
    ):
        self.base_window = base_window
        self.min_window = min_window
        self.max_window = max_window
        self.atr_period = atr_period
        self.atr_base = atr_base
        self.all_candles: deque[OHLCV] = deque(maxlen=max_window)
        self.atr_values: deque[float] = deque(maxlen=atr_period * 2)
        self.current_window = base_window

    def push(self, bar: OHLCV) -> OHLCV | None:
        self.all_candles.append(bar)
        # Simplified true range: ignores gaps from the previous close.
        tr = bar.high - bar.low
        self.atr_values.append(tr)
        if len(self.atr_values) < self.atr_period:
            return None
        current_atr = sum(list(self.atr_values)[-self.atr_period:]) / self.atr_period
        # Calibrate the baseline once, after 2 * atr_period bars, unless supplied.
        if self.atr_base is None and len(self.atr_values) >= self.atr_period * 2:
            self.atr_base = sum(self.atr_values) / len(self.atr_values)
        if self.atr_base is None or self.atr_base == 0:
            return None
        if current_atr == 0:
            # Perfectly flat recent bars: treat as the calmest regime
            # instead of dividing by a zero volatility ratio.
            self.current_window = self.max_window
        else:
            vol_ratio = current_atr / self.atr_base
            self.current_window = int(self.base_window / vol_ratio)
            self.current_window = max(self.min_window, min(self.max_window, self.current_window))
        if len(self.all_candles) < self.current_window:
            return None
        bars = list(self.all_candles)[-self.current_window:]
        return OHLCV(
            timestamp=bars[-1].timestamp,
            open=bars[0].open,
            high=max(b.high for b in bars),
            low=min(b.low for b in bars),
            close=bars[-1].close,
            volume=sum(b.volume for b in bars),
        )
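The window-sizing rule is easier to reason about in isolation. The sketch below extracts it into a pure function, `adaptive_window` (a hypothetical helper, not part of the class above), using the same defaults: the window scales inversely with the ratio of current to baseline volatility and is then clamped.

```python
def adaptive_window(
    current_atr: float,
    baseline_atr: float,
    base_window: int = 60,
    min_window: int = 15,
    max_window: int = 240,
) -> int:
    """Scale base_window by baseline/current volatility, then clamp."""
    if current_atr <= 0 or baseline_atr <= 0:
        # No usable volatility estimate: fall back to the base window.
        return base_window
    raw = int(base_window * baseline_atr / current_atr)
    return max(min_window, min(max_window, raw))


doubled_vol = adaptive_window(current_atr=2.0, baseline_atr=1.0)  # volatility 2x baseline
halved_vol = adaptive_window(current_atr=0.5, baseline_atr=1.0)   # volatility 0.5x baseline
```

With the defaults, volatility at twice the baseline halves the window to 30 bars, and volatility at half the baseline doubles it to 120; the clamps take over beyond a 4x move in either direction.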
Every base bar type can be combined with every aggregation method. Some combinations are standard (calendar time bars = what exchanges give you), others are exotic but powerful.
Combination Examples
| Base Bar Type | Calendar | Rolling | Adaptive |
|---|---|---|---|
| Time | Standard exchange candles | Always-valid HTF, no cold start | Vol-adaptive timeframe |
| Volume | "All volume bars this hour" | Last 24 volume bars | Wider window in calm markets |
| Dollar | Hourly dollar-bar aggregate | Last N dollar bars | Adaptive dollar windows |
| Tick Imbalance | Hourly imbalance aggregate | Last N imbalance events | Fast reaction in volatile regimes |
| Delta | Hourly net order flow | Rolling delta snapshot | Adaptive flow window |
| Renko | "Bricks this hour" | Last N bricks | Adaptive brick count |
Hybrid Engine: Calendar + Rolling
In practice, you want both calendar and rolling aggregation simultaneously. The memory overhead is negligible — two deque buffers per timeframe per symbol.
from datetime import datetime, timezone


class HybridCandleEngine:
    """
    Maintains both calendar-aligned and rolling candles
    for any base bar type.
    Calendar candles: for market structure, support/resistance, PIQ.
    Rolling candles: for indicators, signal generation, entries/exits.
    """

    def __init__(self):
        self.rolling = {
            '1h': RollingCandleAggregator(60),
            '4h': RollingCandleAggregator(240),
        }
        self.calendar: dict[str, list[OHLCV]] = {
            '1h': [],
            '4h': [],
        }
        self._calendar_buffer: dict[str, list[OHLCV]] = {
            '1h': [],
            '4h': [],
        }

    def on_bar(self, bar: OHLCV):
        """Process any base bar type: time, volume, tick, delta, etc."""
        rolling_results = {}
        for tf, agg in self.rolling.items():
            rolling_results[tf] = agg.push(bar)
        self._update_calendar(bar)
        return rolling_results

    def _update_calendar(self, bar: OHLCV):
        # Assumes bar.timestamp is in seconds (divide by 1000 first for
        # millisecond timestamps) and marks the open of a 1m time bar, so the
        # bar opening at :59 is the last one before the boundary. Other base
        # bar types need a boundary test based on bucket changes instead.
        ts = datetime.fromtimestamp(bar.timestamp, tz=timezone.utc)
        total_minutes = ts.hour * 60 + ts.minute
        for tf, minutes in [('1h', 60), ('4h', 240)]:
            self._calendar_buffer[tf].append(bar)
            if (total_minutes + 1) % minutes == 0:
                bars = self._calendar_buffer[tf]
                if bars:
                    agg = OHLCV(
                        timestamp=bars[-1].timestamp,
                        open=bars[0].open,
                        high=max(b.high for b in bars),
                        low=min(b.low for b in bars),
                        close=bars[-1].close,
                        volume=sum(b.volume for b in bars),
                    )
                    self.calendar[tf].append(agg)
                self._calendar_buffer[tf] = []
Time-Volume Hybrid: Calendar with Volume Splits
A special aggregation variant: calendar-aligned candles that force-close early when volume exceeds a threshold. Maintains time synchronization while adapting to activity spikes.
from datetime import datetime, timezone


class TimeVolumeHybridGenerator:
    """
    Calendar-aligned candles that split when volume spikes.
    Rule: close the candle at the calendar boundary OR when
    accumulated volume exceeds vol_threshold, whichever comes first.
    Works with any base bar type; the volume trigger adds an
    extra split dimension on top of calendar alignment.
    """

    def __init__(
        self,
        interval_minutes: int = 60,
        vol_threshold: float = 5000.0,
    ):
        self.interval_minutes = interval_minutes
        self.vol_threshold = vol_threshold
        self.buffer: list[OHLCV] = []
        self.accumulated_volume = 0.0
        self.bars: list[OHLCV] = []

    def on_bar(self, bar: OHLCV) -> OHLCV | None:
        self.buffer.append(bar)
        self.accumulated_volume += bar.volume
        # Assumes bar.timestamp is in seconds and marks the open of a 1m bar.
        ts = datetime.fromtimestamp(bar.timestamp, tz=timezone.utc)
        total_minutes = ts.hour * 60 + ts.minute
        at_boundary = (total_minutes + 1) % self.interval_minutes == 0
        vol_spike = self.accumulated_volume >= self.vol_threshold
        if at_boundary or vol_spike:
            # If both conditions hold on the same bar, the split is tagged 'volume'.
            return self._close_bar(split_reason='volume' if vol_spike else 'time')
        return None

    def _close_bar(self, split_reason: str) -> OHLCV:
        bars = self.buffer
        bar = OHLCV(
            timestamp=bars[-1].timestamp,
            open=bars[0].open,
            high=max(b.high for b in bars),
            low=min(b.low for b in bars),
            close=bars[-1].close,
            volume=sum(b.volume for b in bars),
        )
        bar.split_reason = split_reason  # type: ignore
        bar.num_bars = len(bars)  # type: ignore
        self.bars.append(bar)
        self.buffer = []
        self.accumulated_volume = 0.0
        return bar
Practical Aggregation: Cascading Preload
Cascading preload: composing daily candles from hourly, and hourly from minute — bypassing API limits.
Exchanges limit how much historical data they serve. Binance gives ~1000 candles per REST request, OKX caps at 300. If you need a rolling 1D candle (1440 minutes), you can't always get enough 1m history. For real-time streaming of trades and order books via WebSocket, see CCXT Pro WebSocket Methods.
The solution: cascading aggregation — build higher timeframes from the highest resolution available at each depth, then stitch them together.
Rolling 1W candle:
├── 6 completed 1D candles ← fetch from REST /klines?interval=1d
├── 1 partial day:
│ ├── 23 completed 1h candles ← fetch from REST /klines?interval=1h
│ └── 1 partial hour:
│ └── N completed 1m candles ← fetch from REST /klines?interval=1m
└── Live: each new closed 1m candle updates the entire chain
This works because OHLCV aggregation is composable: the high of a 1D candle is the max of 24 1h highs, which is the max of 1440 1m highs.
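The composability claim can be checked directly on synthetic data: merging 1440 one-minute candles in one step should give the same candle as merging them into 24 hourly candles first and then merging those. This is a sketch with hypothetical helper names (`merge`, `chunks`); volumes are kept integer-valued so the sums are exact in floating point.

```python
import random

# (open, high, low, close, volume)
Candle = tuple[float, float, float, float, float]


def merge(candles: list[Candle]) -> Candle:
    """Aggregate consecutive candles into one OHLCV candle."""
    return (
        candles[0][0],               # open of the first
        max(c[1] for c in candles),  # highest high
        min(c[2] for c in candles),  # lowest low
        candles[-1][3],              # close of the last
        sum(c[4] for c in candles),  # total volume
    )


def chunks(items: list[Candle], size: int) -> list[list[Candle]]:
    return [items[i:i + size] for i in range(0, len(items), size)]


# One synthetic day of 1m candles from a random walk.
rng = random.Random(42)
minute_candles: list[Candle] = []
price = 100.0
for _ in range(1440):
    o = price
    c = o + rng.uniform(-1, 1)
    h = max(o, c) + rng.uniform(0, 0.5)
    l = min(o, c) - rng.uniform(0, 0.5)
    minute_candles.append((o, h, l, c, float(rng.randint(1, 100))))
    price = c

direct = merge(minute_candles)                              # 1m -> 1D
hourly = [merge(ch) for ch in chunks(minute_candles, 60)]   # 1m -> 1h
cascaded = merge(hourly)                                    # 1h -> 1D
```

With real fractional volumes, the two sums can differ in the last few bits because floating-point addition is not associative, which is one reason to compare with a tolerance rather than exact equality.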
Multi-Exchange Limits
| Exchange | Max 1m Candles | Max 1h Candles | Notable Intervals |
|---|---|---|---|
| Binance | 1,000 | 1,000 | 1m–1M, full range |
| Bybit | 1,000 | 1,000 | 1–720, D/W/M |
| OKX | 300 | 300 | 1m–1M (more restrictive) |
| Gate.io | 1,000 | 1,000 | 10s–30d |
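To see what these caps mean in practice, the arithmetic below counts how many REST requests a plain 1m preload would take. The dictionary simply copies the per-request limits from the table; the key names and the function `requests_needed` are illustrative assumptions, not exchange API identifiers.

```python
import math

# Per-request candle caps, copied from the table above.
MAX_CANDLES_PER_REQUEST = {
    "binance": 1000,
    "bybit": 1000,
    "okx": 300,
    "gateio": 1000,
}


def requests_needed(exchange: str, n_candles: int) -> int:
    """Number of paginated requests needed to fetch n_candles."""
    return math.ceil(n_candles / MAX_CANDLES_PER_REQUEST[exchange])


week_of_minutes = 7 * 24 * 60  # 10,080 one-minute candles
binance_calls = requests_needed("binance", week_of_minutes)
okx_calls = requests_needed("okx", week_of_minutes)
```

By contrast, the cascading layout described earlier needs one request per resolution (1D, 1h, 1m), since each level asks for far fewer candles than any of these caps.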
Aggregation Consistency Check
The 1h candle from a REST API might not match what you'd compute from 60 1m candles. Always validate:
def validate_aggregation(
    candle_htf: OHLCV,
    candles_ltf: list[OHLCV],
    tolerance_pct: float = 0.001,
) -> dict[str, bool]:
    # tolerance_pct is a relative fraction: 0.001 means 0.1%.
    # candles_ltf must be non-empty and ordered oldest to newest.
    agg = OHLCV(
        timestamp=candles_ltf[-1].timestamp,
        open=candles_ltf[0].open,
        high=max(c.high for c in candles_ltf),
        low=min(c.low for c in candles_ltf),
        close=candles_ltf[-1].close,
        volume=sum(c.volume for c in candles_ltf),
    )

    def close_enough(a: float, b: float) -> bool:
        if a == 0 and b == 0:
            return True
        return abs(a - b) / max(abs(a), abs(b)) < tolerance_pct

    return {
        'open': close_enough(candle_htf.open, agg.open),
        'high': close_enough(candle_htf.high, agg.high),
        'low': close_enough(candle_htf.low, agg.low),
        'close': close_enough(candle_htf.close, agg.close),
        'volume': close_enough(candle_htf.volume, agg.volume),
    }
If validation fails consistently, always aggregate from 1m yourself — never trust the exchange's HTF candle for backtesting parity.
Comparison Matrix
Axis 1: Base Bar Types
| # | Bar Type | Trigger | Tick Data Required | Best For |
|---|---|---|---|---|
| 1 | Time | Fixed interval | No | Market structure, crowd behavior |
| 2 | Tick | N trades | Yes | ML features, equal-opinion sampling |
| 3 | Volume | N units traded | Yes | Normalized activity analysis |
| 4 | Dollar | $N notional | Yes | Cross-asset comparison |
| 5 | Renko | Price ± N units | No | Trend following, noise filtering |
| 6 | Range | High-Low ≥ N | Yes | Breakout detection |
| 7 | Volatility | Adaptive range | Yes | Regime-adaptive analysis |
| 8 | Heikin-Ashi | Transformation | No | Trend confirmation (synthetic prices!) |
| 9 | Kagi | Price reversal | No | Supply/demand structure |
| 10 | Line Break | N-line breakout | No | Macro trend filter |
| 11 | Point & Figure | Box + reversal | No | Support/resistance mapping |
| 12 | TIB | Tick imbalance | Yes | Informed flow detection |
| 13 | VIB | Volume imbalance | Yes | Large order detection |
| 14 | Run | Run length | Yes | Order splitting detection |
| 15 | CUSUM | Cumulative return | No (1m closes) | Structural break events |
| 16 | Entropy | Shannon entropy | Yes | ML research, feature purity |
| 17 | Delta | Order flow delta | Yes (aggTrades) | Aggressor flow analysis |
Axis 2: Aggregation Methods
| Method | Alignment | Cold Start | Phase Shift | Best For |
|---|---|---|---|---|
| Calendar | Wall clock | Partial bar risk | None (crowd-aligned) | Market structure, PIQ, S/R |
| Rolling | N bars | None (after warmup) | Yes (shifted from :00) | Indicators, signals |
| Adaptive | Volatility-driven N | After ATR calibration | Yes | Vol-adaptive strategies |
Practical Recommendations
Four-layer candle architecture: rolling signals, calendar structure, microstructure flow, and trend filters.
If your backtest engine runs on 1m OHLCV data:
- Rolling time bars — simplest upgrade. Zero additional data. Eliminates cold start.
- Hybrid (rolling + calendar) time bars — calendar for market structure, rolling for signals.
- CUSUM filter — works on 1m closes, no tick data. "Something moved enough to be interesting."
If you have tick/trade data:
- Dollar bars + rolling — recommended default from quant finance literature.
- Volume imbalance bars + rolling — detects informed flow, samples more during significant events.
- Delta bars + calendar — if you have aggressor-side classification, the most direct view of who is pushing the market.
As filters (apply Heikin-Ashi or Line Break on top of any base+aggregation combination):
- Heikin-Ashi over rolling volume bars — clean trend signals on activity-normalized data.
- Line Break / Kagi over daily calendar bars — macro trend filter.
For Marketmaker.cc specifically — a layered approach:
- Layer 1 (signals): Rolling aggregation of time bars for indicators and entry/exit signals. No cold start, perfect backtest parity.
- Layer 2 (market structure): Calendar-aligned time bars for support/resistance, hourly close analysis, and PIQ triggers.
- Layer 3 (microstructure): Volume imbalance bars + delta bars from the raw trade stream for detecting informed flow, order splitting, and anticipating large moves. See also Digital Fingerprint: Trader Identification for behavioral pattern recognition on order flow data.
- Layer 4 (trend filter): Heikin-Ashi transformation on rolling bars, or Line Break on 4h calendar closes, to keep signals aligned with macro direction.
Conclusion
Candle construction is not a single choice — it's two independent decisions:
- What kind of bar? Time captures clock intervals. Activity (tick, volume, dollar) captures market participation. Price (Renko, range, volatility) captures movements. Information (imbalance, runs, CUSUM, entropy) captures arrivals of new information. Order flow (delta) captures aggressive pressure.
- How to aggregate into higher timeframes? Calendar aligns with the crowd. Rolling eliminates cold start. Adaptive reacts to volatility.
The standard "1-hour candle from Binance" is just one cell in a 17×3 matrix. The other 50 combinations are available to anyone willing to implement them. For a production system, the answer is "pick the right combination for each layer of your decision engine."
The atomic unit — the closed base bar — remains the foundation. Everything else is aggregation.
For more on backtest accuracy with fine-grained data, see Adaptive Drill-Down: Backtest with Variable Granularity. For the impact of indicator precomputation on multi-timeframe strategies, see Aggregated Parquet Cache.
Useful Links
- Lopez de Prado — Advances in Financial Machine Learning (2018)
- Easley, Lopez de Prado, O'Hara — The Volume Clock: Insights into the High Frequency Paradigm (2012)
- mlfinlab — Python library implementing information-driven bars
- Binance — Historical Market Data
- Apache Parquet — columnar storage format
Citation
@article{soloviov2026bartypes,
author = {Soloviov, Eugen},
title = {Bar Types and Aggregation Methods for Algorithmic Trading},
year = {2026},
url = {https://marketmaker.cc/en/blog/post/beyond-time-bars-candle-construction},
description = {Two-axis classification of candle construction: 17 base bar types × 3 aggregation methods = 51 combinations, with implementation code and practical recommendations for crypto algotrading.}
}
MarketMaker.cc Team
Quantitative Research & Strategy