Risk metrics (Sharpe, Sortino, Calmar) are computed from per-trade P&L,
not from daily equity returns. This per-trade variant is appropriate
for strategy-level backtests where daily equity curves are unavailable.
Annualization uses √(trades_per_year), where trades_per_year is derived
from the actual date range of the input data.
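
The annualization step can be sketched as follows. This is a minimal
illustration, not the tool's code: the function names, the 365.25-day
year, the zero risk-free rate, and the zero-variance fallback are all
assumptions made for the example.

```python
import math
from datetime import date


def trades_per_year(first: date, last: date, n_trades: int) -> float:
    """Trade frequency implied by the date range of the input data."""
    span_days = (last - first).days
    if span_days <= 0:
        raise ValueError("date range must span at least one day")
    # Assumption: a 365.25-day year.
    return n_trades * 365.25 / span_days


def per_trade_sharpe(pnl, first: date, last: date) -> float:
    """Mean per-trade P&L over its standard deviation, annualized."""
    n = len(pnl)
    mean = sum(pnl) / n
    # Population standard deviation (divisor N), as stated in these notes.
    std = math.sqrt(sum((x - mean) ** 2 for x in pnl) / n)
    if std == 0:
        return 0.0  # assumption: no dispersion is reported as zero
    # Assumption: risk-free rate of zero.
    return mean / std * math.sqrt(trades_per_year(first, last, n))
```

For example, four trades spread over one calendar year give a scaling
factor of roughly √4 = 2 on the raw mean/std ratio.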
Sortino downside deviation uses total sample size N as the denominator,
matching the standard PyFolio/Empyrical implementation.
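
A minimal sketch of that denominator choice, assuming a target return
of zero and leaving out annualization. The function names and the
handling of a series with no losing trades are illustrative
assumptions.

```python
import math


def downside_deviation(pnl, target=0.0):
    """Root mean square of shortfalls below target, divided by ALL N trades."""
    n = len(pnl)
    # Winning trades contribute zero to the sum but still count in n.
    squared_shortfalls = sum(min(x - target, 0.0) ** 2 for x in pnl)
    return math.sqrt(squared_shortfalls / n)


def per_trade_sortino(pnl, target=0.0):
    """Un-annualized Sortino ratio on per-trade P&L."""
    dd = downside_deviation(pnl, target)
    if dd == 0:
        # Assumption: with no shortfalls the ratio is unbounded;
        # a real tool may cap or special-case this.
        return math.inf
    return (sum(pnl) / len(pnl) - target) / dd
```

Dividing by the number of losing trades instead of N would produce a
larger downside deviation and therefore a lower Sortino ratio on the
same data.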
Calmar is the annualized return divided by the maximum drawdown.
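
One way this ratio could be computed from per-trade P&L is sketched
below. Several details are assumptions, since they are not specified
here: a hypothetical starting_equity parameter converts currency P&L
into returns, annualization is simple (linear) rather than compounded,
drawdown is measured as a fraction of peak equity, and a year is 365.25
days.

```python
import math
from datetime import date


def max_drawdown(pnl, starting_equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    equity = peak = starting_equity
    worst = 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def calmar(pnl, starting_equity, first: date, last: date):
    """Annualized return divided by maximum drawdown."""
    years = (last - first).days / 365.25
    # Assumption: simple annualization of the total return.
    annualized = (sum(pnl) / starting_equity) / years
    mdd = max_drawdown(pnl, starting_equity)
    # Assumption: no drawdown means the ratio is unbounded.
    return annualized / mdd if mdd > 0 else math.inf
```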
Statistical significance is tested with a one-sample t-test on per-trade
P&L against H₀: expectancy = 0.
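
The test statistic can be sketched with the standard library alone.
This example uses the textbook form of the one-sample t-test, whose
standard error is based on the N-1 standard deviation; whether the tool
does the same is not stated here, and the function name is
hypothetical.

```python
import math
import statistics


def one_sample_t(pnl, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(pnl)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = statistics.fmean(pnl)
    s = statistics.stdev(pnl)  # sample standard deviation (divisor N-1)
    if s == 0:
        raise ValueError("t statistic is undefined for zero variance")
    standard_error = s / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1
```

Turning the statistic into a p-value requires the Student t
distribution, which the standard library does not provide. Where SciPy
is available, scipy.stats.ttest_1samp(pnl, 0.0) returns both the
statistic and a two-sided p-value.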
Monte Carlo simulations use a Fisher-Yates shuffle (1,000 iterations).
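
The sketch below shows a Fisher-Yates shuffle driving a simple
resampling loop. What each iteration measures is an assumption: here
the trade order is reshuffled and the maximum drawdown of the
cumulative P&L is recorded. The function names and the seed parameter
are illustrative.

```python
import random


def fisher_yates_shuffle(items, rng):
    """Return a uniformly random permutation of items (input untouched)."""
    a = list(items)
    # Walk from the last index down, swapping each slot with a random
    # slot at or below it.
    for i in range(len(a) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        a[i], a[j] = a[j], a[i]
    return a


def max_drawdown_abs(pnl):
    """Largest peak-to-trough drop of cumulative P&L, in P&L units."""
    equity = peak = worst = 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(pnl, iterations=1000, seed=None):
    """Max drawdown for each of `iterations` reshuffled trade orders."""
    rng = random.Random(seed)
    return [max_drawdown_abs(fisher_yates_shuffle(pnl, rng))
            for _ in range(iterations)]
```

Shuffling preserves the total P&L and per-trade statistics, so only
path-dependent quantities such as drawdown vary between iterations.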
Standard deviation uses the population form (divisor N), not the sample
form (divisor N-1). The two differ by a factor of √(N/(N-1)), so for
samples above 100 trades the difference is below 0.5%.
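
The size of that gap follows from the ratio of the two estimators,
which depends only on N: sample std / population std = √(N/(N-1)). A
short check, with a hypothetical helper name:

```python
import math


def relative_gap(n):
    """Fraction by which the N-1 standard deviation exceeds the N form."""
    return math.sqrt(n / (n - 1)) - 1


if __name__ == "__main__":
    for n in (30, 101, 1000):
        print(f"N={n}: {relative_gap(n):.4%}")
```

At N = 101 the gap is about 0.499%, and it keeps shrinking as N grows.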
Grade labels use a two-stage system: a composite score determines the
provisional grade, then label gates check whether critical metrics
(Sharpe, Sortino, Calmar, sample size) support that grade. Strategies
that score high through weighted averaging but fail gates on key metrics
are demoted.
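
The two-stage logic can be illustrated as below. Every number and label
in this sketch is hypothetical: the grade letters, score cut-offs, gate
thresholds, and the rule of demoting one grade at a time until all
gates pass are assumptions made for the example, not the tool's actual
values.

```python
# Hypothetical grade ladder, best to worst.
GRADES = ["A", "B", "C", "D", "F"]

# Hypothetical minimum composite score for each grade.
SCORE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]

# Hypothetical gate minimums; grades without an entry have no gates.
GATES = {
    "A": {"sharpe": 2.0, "sortino": 3.0, "calmar": 3.0, "n_trades": 100},
    "B": {"sharpe": 1.0, "sortino": 1.5, "calmar": 1.0, "n_trades": 50},
    "C": {"sharpe": 0.5, "sortino": 0.75, "calmar": 0.5, "n_trades": 30},
}


def provisional_grade(score):
    """Stage 1: map the composite score to a grade."""
    for cutoff, grade in SCORE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


def passes_gates(grade, metrics):
    """Stage 2 check: every gated metric meets the grade's minimum."""
    return all(metrics[name] >= minimum
               for name, minimum in GATES.get(grade, {}).items())


def final_grade(score, metrics):
    """Demote the provisional grade until its gates are satisfied."""
    idx = GRADES.index(provisional_grade(score))
    while idx < len(GRADES) - 1 and not passes_gates(GRADES[idx], metrics):
        idx += 1
    return GRADES[idx]
```

With these made-up thresholds, a strategy scoring 95 with a Sharpe of
1.2 is provisionally graded A, fails the A gate on Sharpe, and ends at
B.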
This tool does not constitute investment advice. It is a statistical
analysis instrument for backtested strategies.