March 18, 2026
5 min read

PnL by Active Time: The Metric That Changes Strategy Rankings

#algotrading
#backtest
#metrics
#PnL
#orchestration
#portfolio
#risk management

You have two strategies. The first: PnL +300%, 418 trades, position open 45% of the time. The second: PnL +27%, 38 trades, position open 5% of the time. Which one is better?

If you chose the first one — you answered incorrectly. Here is why.

The Problem with Raw PnL

Raw PnL — the total return over the entire backtest period — does not account for what fraction of time the strategy was in a position. A strategy with +300% and 45% trading time uses your capital less than half the time. The remaining 55% of the time, capital sits idle.

A strategy with +27% and 5% trading time uses capital only 5% of the time — but the remaining 95% is available for other strategies.

If you run a portfolio of strategies through an orchestrator, one strategy's idle time is filled by others. The key metric then becomes not how much a strategy earned over a year, but how much it earns per unit of active time.

Effective Return Formula

[Figure: PnL per active day strategy ranking comparison]

Basic Calculation

\text{PnL}_{daily} = \frac{\text{Total PnL}}{\text{Active days}}

\text{Annualized}_{raw} = \text{PnL}_{daily} \times 365

\text{Annualized}_{effective} = \text{Annualized}_{raw} \times \text{fill\_efficiency}

where:

  • Active days — total time in positions (in days)
  • fill_efficiency — the fraction of time the orchestrator can fill with signals (0...1)
def pnl_per_active_time(
    total_pnl: float,        # total PnL, %
    test_period_days: int,    # backtest length, days
    trading_time_pct: float,  # fraction of active time, 0..1
    fill_efficiency: float = 0.80,  # slot fill efficiency
) -> dict:
    """
    Calculate effective return per active time.
    """
    active_days = test_period_days * trading_time_pct
    pnl_per_day = total_pnl / active_days

    annualized_raw = pnl_per_day * 365
    annualized_effective = annualized_raw * fill_efficiency

    return {
        "active_days": active_days,
        "pnl_per_day": pnl_per_day,
        "annualized_raw": annualized_raw,
        "annualized_effective": annualized_effective,
    }
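As a sanity check, the first two rows of the table in the next section can be reproduced by hand with the same arithmetic (the 750-day period and fill_efficiency = 0.80 are taken from that example):

```python
# Strategy C: +300% PnL, in position 45% of the time over 750 days
active_days_c = 750 * 0.45                  # 337.5 days
pnl_per_day_c = 300 / active_days_c         # ~0.89 %/day
annualized_c = pnl_per_day_c * 365 * 0.80   # ~259 %

# Strategy B: +27% PnL, in position 5% of the time
active_days_b = 750 * 0.05                  # 37.5 days
pnl_per_day_b = 27 / active_days_b          # 0.72 %/day
annualized_b = pnl_per_day_b * 365 * 0.80   # ~210 %
```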

Recalculating Real Strategies

Period: 750 days (25 months), fill_efficiency = 0.80:

Strategy     PnL     Trading time   Active days   PnL/day    Annualized (×0.8)
Strategy C   +300%   45%            337.5         0.89%/d    259%
Strategy B   +27%    5%             37.5          0.72%/d    210%
Strategy A   +58%    15%            112.5         0.51%/d    150%

By raw PnL: Strategy C (300%) >> Strategy A (58%) >> Strategy B (27%). By effective return: Strategy C (259%) > Strategy B (210%) > Strategy A (150%).

Strategy B with 27% PnL turns out to be comparable to Strategy C with 300% PnL — because it earns the same money in 9 times less active time. The remaining 95% of the time can be filled with other strategies.

Linear vs Compound Extrapolation

The formula above is linear. It is simpler and more conservative. The compound variant accounts for profit reinvestment:

\text{Daily return}_{compound} = (1 + \text{Total PnL})^{1/\text{Active days}} - 1

\text{Annualized}_{compound} = (1 + \text{Daily return})^{365 \times \text{fill\_eff}} - 1

import numpy as np

def compound_annualized(total_pnl_pct, active_days, fill_efficiency=0.80):
    """Compound extrapolation."""
    daily_return = (1 + total_pnl_pct / 100) ** (1 / active_days) - 1
    annualized = (1 + daily_return) ** (365 * fill_efficiency) - 1
    return annualized * 100

b_compound = compound_annualized(27, 37.5)    # Strategy B
c_compound = compound_annualized(300, 337.5)  # Strategy C

With compound extrapolation, Strategy B overtakes Strategy C: roughly 543% vs 232%. The ranking is inverted.

Recommendation: use linear extrapolation for ranking. It is more conservative and less prone to rewarding overfitting on a small number of trades.
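To see the inversion concretely, here is a minimal side-by-side of the two extrapolations for Strategies B and C (numbers from the table above; the helper names are illustrative, not part of any library):

```python
fill_eff = 0.80

def linear_annualized(total_pnl, active_days):
    # linear: PnL per active day, scaled to a year and by fill efficiency
    return total_pnl / active_days * 365 * fill_eff

def compound_annualized(total_pnl, active_days):
    # compound: per-day growth rate, compounded over the effective year
    daily = (1 + total_pnl / 100) ** (1 / active_days) - 1
    return ((1 + daily) ** (365 * fill_eff) - 1) * 100

# Linear keeps C ahead of B (~259% vs ~210%)...
assert linear_annualized(300, 337.5) > linear_annualized(27, 37.5)
# ...compound flips the order in favor of B (~543% vs ~232%)
assert compound_annualized(27, 37.5) > compound_annualized(300, 337.5)
```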

The Trap: Small Number of Trades

Strategy B with 38 trades and PnL/day = 0.72% looks attractive. But 38 trades is a statistically weak sample. A high PnL/day could be the result of a lucky coincidence.

Confidence-adjusted scoring

We use the t-distribution to penalize small samples:

\text{CI}_{lower} = \bar{r} - t_{\alpha/2,\, n-1} \times \frac{s}{\sqrt{n}}

where \bar{r} is the mean return per trade, s is the standard deviation, n is the number of trades, and t_{\alpha/2,\, n-1} is the t-distribution quantile.

import scipy.stats as st
import numpy as np

def confidence_adjusted_score(
    trade_returns: list,
    hold_times: list,               # holding time of each trade, hours
    test_period_days: int,
    fill_efficiency: float = 0.80,
    max_leverage: float = 1.0,      # leverage multiplier (see PnL@MaxLev)
    min_trades: int = 30,
    confidence: float = 0.95,
) -> dict:
    """
    Strategy ranking with sample size adjustment.
    """
    n = len(trade_returns)
    if n < min_trades:
        return {"score": 0, "reason": f"Too few trades ({n} < {min_trades})"}

    returns = np.array(trade_returns)
    mean_ret = np.mean(returns)
    se = np.std(returns, ddof=1) / np.sqrt(n)

    alpha = 1 - confidence
    t_crit = st.t.ppf(1 - alpha / 2, df=n - 1)
    ci_lower = mean_ret - t_crit * se

    if mean_ret <= 0:
        confidence_factor = 0
    else:
        confidence_factor = max(0, ci_lower / mean_ret)

    total_pnl = np.sum(returns)
    active_days = sum(hold_times) / 24
    pnl_per_day = total_pnl / active_days if active_days > 0 else 0
    annualized = pnl_per_day * 365 * fill_efficiency

    score = annualized * max_leverage * confidence_factor

    return {
        "score": score,
        "annualized": annualized,
        "confidence_factor": confidence_factor,
        "ci_lower": ci_lower,
        "n_trades": n,
    }
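The confidence factors in the table below can be reproduced from summary statistics alone (mean return per trade, its standard error, and n); conf_factor here is an illustrative helper, not part of any orchestrator API:

```python
import scipy.stats as st

def conf_factor(mean_ret, se, n, confidence=0.95):
    # CI_lower / mean: the fraction of the mean that survives the penalty
    t_crit = st.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    ci_lower = mean_ret - t_crit * se
    return max(0.0, ci_lower / mean_ret)

f_b = conf_factor(0.71, 0.28, 38)    # wide CI -> heavy penalty (~0.20)
f_c = conf_factor(0.72, 0.05, 418)   # narrow CI -> mild penalty (~0.86)
```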

Impact of confidence adjustment

Strategy     Trades   Mean ret   SE      CI lower   Conf. factor   Adjusted score
Strategy B   38       0.71%      0.28%   0.14%      0.20           210% × 0.20 = 42%
Strategy C   418      0.72%      0.05%   0.62%      0.86           259% × 0.86 = 223%
Strategy A   491      0.12%      0.02%   0.08%      0.67           150% × 0.67 = 100%

After confidence adjustment, Strategy C confidently leads: 418 trades give a narrow CI and high confidence factor. Strategy B with 38 trades is penalized — its "brilliant" performance may be the result of variance.

fill_efficiency: Where to Get It

[Figure: fill efficiency and orchestrator slot allocation]

The fill_efficiency parameter answers the question: "What fraction of time can the orchestrator keep capital working?"

Option 1: Fixed constant

The simplest approach: fill_efficiency = 0.80 for all strategies. Assumes the orchestrator utilizes 80% of idle time with other strategies/pairs.

Pro: identical for all, easy to compare. Con: does not account for correlation between strategies.

Option 2: Analytical estimate

If you have NN pairs, each active p%p\% of the time, the probability that at least one is active:

P(\geq 1\ \text{active}) = 1 - (1 - p)^N

But cryptocurrencies are highly correlated — BTC pulls ETH, SOL, and the rest along with it. The effective number of independent pairs:

N_{eff} = \frac{N}{\text{correlation factor}}

def estimate_fill_efficiency(
    trading_time_pct: float,
    n_pairs: int,
    correlation_factor: float = 3.0,  # crypto: high correlation
    max_slots: int = 10,
) -> float:
    """
    Analytical estimate of fill_efficiency.

    Args:
        trading_time_pct: fraction of active time for one strategy
        n_pairs: number of trading pairs
        correlation_factor: correlation coefficient (1 = independent, 5 = strong)
        max_slots: maximum number of simultaneous positions
    """
    effective_n = n_pairs / correlation_factor
    p_at_least_one = 1 - (1 - trading_time_pct) ** effective_n

    # Capacity constraint: if the expected number of concurrent positions
    # exceeds max_slots, only that fraction of signal time fits into slots.
    expected_active = effective_n * trading_time_pct
    if expected_active == 0:
        return 0.0
    signal_fit = min(expected_active, max_slots) / expected_active

    return p_at_least_one * signal_fit

eff_b = estimate_fill_efficiency(0.05, 10, 3.0)   # ≈ 0.16
eff_c = estimate_fill_efficiency(0.45, 10, 3.0)   # ≈ 0.86

For Strategy B with 5% activity and 10 correlated pairs, fill_efficiency is only ~16%. This dramatically reduces effective return.

Option 3: Simulation from data

The most accurate approach is to run all strategies on all pairs and calculate real slot utilization:

def simulate_fill_efficiency(
    all_signals: dict,  # {(strategy, pair): [(entry_min, exit_min), ...]}, minute offsets
    max_slots: int = 10,
    test_period_minutes: int = 750 * 24 * 60,
) -> float:
    """
    Simulate real orchestrator slot utilization.
    """
    timeline = np.zeros(test_period_minutes)

    for signals in all_signals.values():
        for entry_min, exit_min in signals:
            timeline[entry_min:exit_min] += 1

    capped = np.minimum(timeline, max_slots)
    fill_efficiency = np.mean(capped) / max_slots

    return fill_efficiency
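A toy run makes the mechanics visible. The two overlapping signal windows and max_slots = 2 below are made up for illustration; the timeline arithmetic mirrors the simulation above:

```python
import numpy as np

test_minutes, max_slots = 1000, 2
all_signals = {
    ("strat_1", "BTCUSDT"): [(0, 500)],    # in a position for minutes 0-499
    ("strat_2", "ETHUSDT"): [(250, 750)],  # overlaps during minutes 250-499
}

timeline = np.zeros(test_minutes)
for signals in all_signals.values():
    for entry_min, exit_min in signals:
        timeline[entry_min:exit_min] += 1

capped = np.minimum(timeline, max_slots)
fill_efficiency = np.mean(capped) / max_slots  # 1000 slot-minutes used of 2000
```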

Final Ranking Formula

Combining all components:

def strategy_score(
    trades: list,
    test_period_days: int,
    fill_efficiency: float = 0.80,
    min_trades: int = 30,
    funding_rate: float = 0.0001,
) -> float:
    """
    Final score for strategy ranking.

    Accounts for:
    - PnL per active day (capital usage efficiency)
    - MaxLev (risk-adjusted scaling)
    - Confidence adjustment (penalty for small sample)
    - Funding costs (realistic costs at leverage)
    """
    n = len(trades)
    if n < min_trades:
        return 0

    returns = np.array([t.pnl_pct for t in trades])
    hold_hours = np.array([t.hold_hours for t in trades])

    total_pnl = np.sum(returns)
    active_days = np.sum(hold_hours) / 24
    pnl_per_day = total_pnl / active_days

    # Equity curve -> max drawdown -> leverage cap (see PnL@MaxLev)
    equity = np.cumprod(1 + returns / 100)
    peak = np.maximum.accumulate(equity)
    max_dd = ((equity - peak) / peak).min()
    # Guard: a curve with no drawdown would divide by zero; cap at 1x
    max_lev = max(1, int(50 / abs(max_dd * 100))) if max_dd < 0 else 1

    funding_daily = funding_rate * 3 * max_lev * 100  # in %
    net_pnl_per_day = pnl_per_day - funding_daily

    annualized = net_pnl_per_day * 365 * fill_efficiency

    se = np.std(returns, ddof=1) / np.sqrt(n)
    mean_ret = np.mean(returns)
    if mean_ret <= 0:
        return 0
    t_crit = st.t.ppf(0.975, df=n - 1)
    ci_lower = mean_ret - t_crit * se
    conf_factor = max(0, ci_lower / mean_ret)

    score = annualized * max_lev * conf_factor

    return score
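The drawdown-to-leverage step is the least obvious part of the score, so here it is in isolation. The return series is synthetic; the 50% drawdown budget is the one used in the function above:

```python
import numpy as np

returns = np.array([1.0, -3.0, 1.0, 1.0, -3.0, 1.0])  # % per trade, synthetic
equity = np.cumprod(1 + returns / 100)            # compounded equity curve
peak = np.maximum.accumulate(equity)              # running high-water mark
max_dd = ((equity - peak) / peak).min()           # worst relative drawdown
max_lev = max(1, int(50 / abs(max_dd * 100)))     # ~4.0% dd -> leverage 12
```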

Connection to Other Metrics in the Series

This metric does not replace but complements the tools from previous articles:

  • Loss-Profit Asymmetry: max drawdown determines MaxLev, which feeds into the score formula. The deeper the drawdown, the lower the score — nonlinearly, due to recovery asymmetry.

  • Monte Carlo bootstrap: confidence intervals from bootstrap provide a more accurate estimate of the confidence factor than the t-distribution. You can replace the CI from the t-distribution with the 5th percentile from bootstrap.

  • Funding rates: funding costs are subtracted from PnL per active day. With high leverage and low PnL/day, funding can make the net score negative — the strategy is unprofitable in reality despite a positive raw PnL.

Why This Matters for Orchestration

PnL per active time is the primary metric for ranking strategies in an orchestrator. When multiple strategies compete for the same slot, the one with the highest score (accounting for confidence adjustment) wins.

In practice, this leads to surprising decisions: strategies with "modest" raw PnL but short time in position often get priority over "flashy" strategies with high PnL but long positions. The former use capital more efficiently in a portfolio of dozens of strategies.

The key insight: the only metric that scales is PnL per active day. Raw PnL does not scale: you cannot run the same strategy twice. But you can fill idle time with other strategies — and PnL per active day accurately predicts how much you will earn in a portfolio.

Conclusion

Raw annual PnL is a convenient but deceptive metric. It does not account for the trader's most important resource — the time during which capital is working.

Three takeaways:

  1. Calculate PnL per active day. A strategy with +27% over 37.5 days in position = +0.72%/day. A strategy with +300% over 337.5 days = +0.89%/day. The difference is not 11x, but 1.2x.

  2. Account for fill_efficiency. In a portfolio of correlated crypto pairs, fill_efficiency is lower than it seems. 10 pairs does not equal 10x diversification. With correlation_factor = 3, the effective number of pairs is only ~3.

  3. Penalize small samples. 38 trades with a mean of +0.71% gives a CI from +0.14% to +1.28%. 418 trades with +0.72% gives a CI from +0.62% to +0.82%. The second strategy is more reliable, even though the means are nearly identical.

The PnL per active time metric does not replace PnL@MaxLev — it complements it by adding the dimension of capital usage efficiency. For a single strategy, PnL@ML is sufficient. For a portfolio of strategies, PnL per active time is essential.



Citation

@article{soloviov2026pnlactivetime,
  author = {Soloviov, Eugen},
  title = {PnL by Active Time: The Metric That Changes Strategy Rankings},
  year = {2026},
  url = {https://marketmaker.cc/ru/blog/post/pnl-active-time-metric},
  version = {0.1.0},
  description = {Why raw annual PnL is a poor metric for comparing strategies with different trading time. How to calculate effective return, why you need fill\_efficiency, and why a strategy with 27\% PnL can outperform one with 300\%.}
}

MarketMaker.cc Team

Quantitative Research and Strategy
