How every number is computed. This file is the spec; factors.py and
analytics.py implement it. If the code changes, change this file in the
same commit. Transparency is a product feature: Barra and the S&P factor
indices are black boxes; this is not.
S&P 500 constituents, point-in-time. Membership at any past date is reconstructed from FMP's historical add/remove event feed by walking events backward from the current 503-name list. A stock acquired in January 2026 is therefore in the universe for rebalances before its removal date and out afterward — no survivorship bias from using today's list, to the extent the event feed is accurate.
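
The backward walk over add/remove events can be sketched as below. This is a minimal illustration, not the code in ingest.py or factors.py: the function name `members_on` and the `(event_date, added, removed)` tuple layout are assumptions, and the real FMP feed uses its own field names. It also assumes an event dated exactly on the as-of date is already in effect.

```python
from datetime import date

def members_on(as_of, current, events):
    """Reconstruct index membership at `as_of` by undoing later events.

    current: set of tickers in the index today.
    events:  iterable of (event_date, added_ticker, removed_ticker);
             either ticker may be None. Hypothetical layout.
    """
    members = set(current)
    # Walk from the most recent event back toward the target date.
    for event_date, added, removed in sorted(
        events, key=lambda e: e[0], reverse=True
    ):
        if event_date <= as_of:
            break  # assumption: events on or before as_of are already in effect
        if added:
            members.discard(added)  # undo the add
        if removed:
            members.add(removed)    # undo the removal
    return members

# Illustrative tickers only: OLD leaves and NEW joins on 2026-01-15.
events = [(date(2026, 1, 15), "NEW", "OLD")]
today = {"AAPL", "NEW"}
before = members_on(date(2025, 12, 31), today, events)
after = members_on(date(2026, 1, 31), today, events)
```

Here `before` contains OLD but not NEW, and `after` matches today's list, which is the behaviour described for a name removed in January 2026.
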
Secondary share classes (GOOG, FOX, NWS) are dropped: FMP reports the full company market cap on both lines, so keeping both double-counts the company (Alphabet was 2x weighted before this fix). The primary line (GOOGL, FOXA, NWSA) carries the company.
Known limits:
data/raw/meta.json).

All from FMP (Premium tier), pulled by ingest.py:

| Dataset | Endpoint | Notes |
|---|---|---|
| Prices | historical-price-eod/dividend-adjusted | split and dividend adjusted close → returns are total returns. The default full endpoint is unadjusted — do not use it for returns. |
| Income statement | income-statement (quarterly, 20q) | includes filingDate — basis for point-in-time visibility |
| Balance sheet | balance-sheet-statement (quarterly, 20q) | includes filingDate |
| Cash flow | cash-flow-statement (quarterly, 20q) | operating cash flow for the accruals component |
| Market cap, ROE | key-metrics (quarterly, 20q) | no filing date; joined to income-statement quarters |
| Dividends | dividends | per-share cash dividends by ex-date |
| Membership events | historical-sp500-constituent | see §1 |
Point-in-time rule: at a rebalance date t, a quarterly report is
visible only if its filingDate ≤ t. No look-ahead into not-yet-filed
quarters. TTM aggregates use the last 4 visible quarters.
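
A minimal sketch of the point-in-time rule and the TTM aggregate follows. The dictionary keys (`period_end`, `filing_date`) and the function name are hypothetical, and returning `None` when fewer than four quarters are visible is an assumption rather than documented behaviour.

```python
from datetime import date

def ttm_visible(reports, t, field):
    """Sum `field` over the last 4 quarters whose filing date is <= t.

    reports: list of dicts with 'period_end', 'filing_date' and the
             requested field (hypothetical keys).
    Returns None if fewer than 4 quarters are visible at t (assumption).
    """
    visible = [r for r in reports if r["filing_date"] <= t]
    visible.sort(key=lambda r: r["period_end"])
    if len(visible) < 4:
        return None
    return sum(r[field] for r in visible[-4:])

# Made-up quarterly net income; the Q4 2025 report is filed on 2026-02-09.
reports = [
    {"period_end": date(2024, 12, 31), "filing_date": date(2025, 2, 10), "net_income": 10.0},
    {"period_end": date(2025, 3, 31), "filing_date": date(2025, 5, 5), "net_income": 11.0},
    {"period_end": date(2025, 6, 30), "filing_date": date(2025, 8, 4), "net_income": 12.0},
    {"period_end": date(2025, 9, 30), "filing_date": date(2025, 11, 3), "net_income": 13.0},
    {"period_end": date(2025, 12, 31), "filing_date": date(2026, 2, 9), "net_income": 14.0},
]
jan = ttm_visible(reports, date(2026, 1, 30), "net_income")  # Q4 not yet filed
feb = ttm_visible(reports, date(2026, 2, 27), "net_income")  # Q4 now visible
```

At the January rebalance the quarter ending 2025-12-31 already exists economically but is ignored because it has not been filed; it enters the TTM window only from the February rebalance.
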
Market cap at t: last reported quarter's market cap, drifted to t by the adjusted-price ratio. (Drifting with total-return prices slightly overstates cap for high-yield names between reports; immaterial for quintile ranking.)
Scores are computed cross-sectionally over the point-in-time universe at each rebalance. Raw metrics are winsorized at 2.5%/97.5%, then z-scored. Composites average the available component z-scores (a name missing one component is scored on the rest; missing all → excluded from that factor's portfolios).
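
The scoring step can be sketched with the standard library alone. This is an illustration under stated assumptions: the function names are hypothetical, the percentile interpolation (`method="inclusive"`) and the use of the population standard deviation are choices made here and may differ from factors.py, and a cross-section with zero dispersion would raise a division error.

```python
import statistics

def winsorized_z(raw):
    """Map ticker -> z-score after clipping at the 2.5th/97.5th percentiles.

    raw: dict ticker -> float, or None when the metric is missing.
    """
    present = {k: v for k, v in raw.items() if v is not None}
    # n=40 yields cut points every 2.5%; the first and last are the bounds.
    cuts = statistics.quantiles(list(present.values()), n=40, method="inclusive")
    lo, hi = cuts[0], cuts[-1]
    clipped = {k: min(max(v, lo), hi) for k, v in present.items()}
    vals = list(clipped.values())
    mean = statistics.fmean(vals)
    std = statistics.pstdev(vals)  # assumption: population std
    return {k: (v - mean) / std for k, v in clipped.items()}

def composite(component_zs):
    """Average the component z-scores each name has.

    Names absent from every component never appear in the result,
    which matches "missing all -> excluded".
    """
    names = set().union(*(z.keys() for z in component_zs))
    out = {}
    for name in names:
        zs = [z[name] for z in component_zs if name in z]
        out[name] = sum(zs) / len(zs)
    return out

# Made-up metric values; D is an outlier, E is missing.
z = winsorized_z({"A": 1.0, "B": 2.0, "C": 3.0, "D": 100.0, "E": None})
score = composite([{"A": 1.0, "B": -1.0}, {"A": 3.0}])
```

In the composite example, B is missing the second component and is scored on the first alone, while A is the average of both.
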
| Factor | Definition (higher score = stronger membership) |
|---|---|
| Momentum | 12-1 month total return: P(t-21d) / P(t-252d) - 1 (skips the most recent month, standard reversal exclusion) |
| Value | mean z of: E/P (TTM net income / mktcap), B/P (common equity / mktcap, equity>0 only), S/P (TTM revenue / mktcap) |
| Quality | mean z of: ROE (TTM net income / avg of current & year-ago equity), −D/E (total debt / equity), −earnings variability (std of YoY quarterly diluted-EPS growth, last 12 obs, min 8) |
| Size | −log(market cap) — "small within the S&P 500" |
| Low volatility | −std of daily total returns, trailing 252d (min 200 obs) |
| High beta | OLS beta vs SPY, trailing 252d daily returns |
| Dividend yield | TTM dividends per share (by ex-date) / price |
| EPS revisions | not yet computed — requires ≥ ~1 month of daily consensus snapshots from collect_revisions.py; see §8 |
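
As an illustration of the momentum row, the 12-1 month return can be read off a list of dividend-adjusted closes. The function name and the choice to return `None` on short history are assumptions; the indexing treats the last element as the close on the rebalance date t.

```python
def momentum_12_1(adj_close):
    """12-1 month total return: P(t-21d) / P(t-252d) - 1.

    adj_close: dividend-adjusted closes, oldest first, last element = t.
    Returns None when fewer than 253 observations are available (assumption).
    """
    if len(adj_close) < 253:
        return None
    p_recent = adj_close[-22]   # 21 trading days before t
    p_start = adj_close[-253]   # 252 trading days before t
    return p_recent / p_start - 1.0

# Synthetic series 1.0, 2.0, ..., 253.0 just to show which points are used.
prices = [float(i) for i in range(1, 254)]
mom = momentum_12_1(prices)
```

With this synthetic series the function uses the values 232.0 and 1.0, so the most recent 21 days do not affect the score.
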
At each month-end rebalance (last trading day):
1. Rank scored names; split into quintiles (~100 names each).
2. Long-only factor series (<factor>_long): top quintile,
cap-weighted with a 5% single-name cap (excess redistributed
pro-rata, S&P-style). Without the cap, the quality and momentum longs
degenerate into a mega-tech beta bet (6 names ≈ 60% of weight) and stop
measuring the factor — this showed up directly in validation. This is
the comparable to the S&P factor indices / factor ETFs and feeds the
quilt.
3. Spread series (<factor>_spread): top quintile minus bottom
quintile, both equal-weighted (academic convention; cap-weighted
spreads are dominated by megacaps).
4. Benchmark (bench): full universe, cap-weighted.
5. Sectors (sector_*): cap-weighted within each GICS sector
(current-constituent mapping; sector history not reconstructed).
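
The 5% single-name cap in step 2 can be sketched as an iterative clip-and-redistribute loop. This is one common way to implement pro-rata redistribution and is an assumption about the mechanics, not a copy of factors.py; the function name and tickers are hypothetical.

```python
def capped_weights(market_caps, cap=0.05):
    """Cap-weight the names, clip at `cap`, and hand the excess to the
    uncapped names in proportion to their market caps. Repeats until no
    name exceeds the cap.
    """
    if cap * len(market_caps) < 1.0:
        raise ValueError("cap too tight for this many names")
    total = sum(market_caps.values())
    weights = {k: v / total for k, v in market_caps.items()}
    capped = set()
    while True:
        over = [k for k, w in weights.items()
                if k not in capped and w > cap + 1e-12]
        if not over:
            return weights
        capped.update(over)
        free = [k for k in weights if k not in capped]
        free_total = sum(market_caps[k] for k in free)
        budget = 1.0 - cap * len(capped)  # weight left for uncapped names
        for k in capped:
            weights[k] = cap
        for k in free:
            weights[k] = budget * market_caps[k] / free_total

# Made-up caps: one megacap, one mid-size name, 28 small names.
caps = {"MEGA": 100.0, "MID": 4.0}
caps.update({f"S{i}": 1.0 for i in range(28)})
w = capped_weights(caps)
```

In this example MID starts at about 3% and is not capped in the first pass, but the weight redistributed from MEGA pushes it over 5%, so it is capped in the second pass. That cascade is why the loop repeats instead of clipping once.
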
Between rebalances portfolios are buy-and-hold: weights drift with prices; no daily re-ranking (daily re-ranking is the classic way to fake a factor series and leak turnover-free alpha). A name whose prices stop mid-month (delisting/acquisition) is frozen at its last price — economically a cash-out reinvested pro-rata at next rebalance.
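
A minimal sketch of buy-and-hold drift between rebalances: holdings are tracked as values, not weights, so winners grow as a share of the portfolio. The function name and the convention that a missing ticker means "no price that day, frozen at last price" are assumptions made for this illustration.

```python
def buy_and_hold_returns(start_weights, daily_returns):
    """Daily portfolio returns with no trading after the rebalance.

    start_weights: dict ticker -> weight at the rebalance (sums to 1).
    daily_returns: list of dicts ticker -> simple return for each day.
                   A ticker missing from a day is treated as a 0% return,
                   i.e. frozen at its last price (assumption).
    """
    values = dict(start_weights)  # value of each holding; portfolio starts at 1.0
    out = []
    for day in daily_returns:
        before = sum(values.values())
        for ticker in values:
            values[ticker] *= 1.0 + day.get(ticker, 0.0)
        out.append(sum(values.values()) / before - 1.0)
    return out

# A gains 10% on both days; B is flat on day 1 and has no price on day 2.
rets = buy_and_hold_returns(
    {"A": 0.5, "B": 0.5},
    [{"A": 0.10, "B": 0.0}, {"A": 0.10}],
)
```

The second day's portfolio return is larger than the first even though A's return is the same, because A's weight has drifted above 50%.
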
Daily series start 2024-06-30 (needs 12m momentum lookback; price history pulled from 2023-01-01).
For each factor, two relative series:
- rel: long-only minus benchmark (what an ETF-vs-SPY watcher sees)
- spread: the Q5−Q1 series (the cleaner factor signal)

For horizons 1d / 5d / 20d: the current h-day compounded return is compared to the trailing 252 overlapping h-day returns of the same series (current excluded). Report z-score and percentile (returns are fat-tailed; the percentile keeps the z honest). |z| ≥ 2 is the headline flag.
Caveat by design: with overlapping windows the baseline observations are autocorrelated — fine for "is this unusual?", not a t-test.
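
The monitor statistic can be sketched as follows. Function names are hypothetical, and two details are assumptions: the z-score uses the sample standard deviation of the baseline, and the percentile is the share of baseline windows at or below the current one.

```python
import math
import statistics

def hday_returns(daily, h):
    """Compounded h-day returns for every overlapping window, oldest first."""
    out = []
    for end in range(h, len(daily) + 1):
        growth = math.prod(1.0 + r for r in daily[end - h:end])
        out.append(growth - 1.0)
    return out

def unusualness(daily, h, lookback=252):
    """(z, percentile) of the latest h-day return against the previous
    `lookback` overlapping h-day returns, current window excluded.
    Returns None if there is not enough history.
    """
    windows = hday_returns(daily, h)
    if len(windows) < lookback + 1:
        return None
    current = windows[-1]
    baseline = windows[-(lookback + 1):-1]
    z = (current - statistics.fmean(baseline)) / statistics.stdev(baseline)
    pct = 100.0 * sum(1 for x in baseline if x <= current) / len(baseline)
    return z, pct

# Synthetic series: 300 days alternating +1% / -1%, then a +5% day.
daily = [0.01 if i % 2 == 0 else -0.01 for i in range(300)] + [0.05]
z1, pct1 = unusualness(daily, h=1)
flagged = abs(z1) >= 2
```

Because adjacent windows share h-1 days, the baseline values are not independent, which is the caveat stated above: the z-score ranks how unusual a move is but should not be read as a test statistic.
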
On trailing-20d returns of the long-only series:
Calendar-month total returns of the long-only factor series + benchmark, ranked best→worst per month, trailing 13 months. Monthly granularity only — a daily quilt reshuffles too much to read; daily action lives in the spread monitor.
collect_revisions.py snapshots consensus EPS/revenue (avg/high/low,
analyst counts) for all constituents daily. **This series cannot be
backfilled — non-Enterprise vendors don't sell daily revision history —
which is exactly why it compounds into a moat.** Snapshots are committed to
git because they are irreplaceable. Planned outputs once ≥1 month
accumulates: net up/down revision breadth across the index; revision
leaders-vs-laggards spread (the Counterpoint-style edge).
Our own series is too short for seasonality — 2 years gives n=2 per
calendar month, which is astrology. So seasonality.py uses the Ken French
library monthly factor returns (momentum to 1927, HML/SMB/RMW to 1963),
free: mean/median return and hit rate per calendar month, full history and
trailing 30y. The total US market (Mkt-RF + RF; CRSP all-US, labeled "US
market", not strictly the S&P 500) is included as a reference row. The
dashboard renders the full factor × month grid as a mosaic; the digest
uses the current month's baselines. The snapshot carries the current month's baseline next to the
live monitor ("June is historically momentum's strongest month, 70% hit
rate — and it's currently -2.4σ"). Definition mismatch (French universe ≠
S&P 500 quintiles) is acceptable for a seasonal-baseline view and disclosed
in the payload. Mapping: momentum→UMD, value→HML, size→SMB,
quality→RMW(profitability); low vol / dividend yield / high beta have no
French analogue and are omitted.
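
The per-month baseline statistics can be sketched as below. The input layout `(year, month, return)` and the function name are assumptions about how the French monthly returns might be held in memory, and "hit rate" is taken here to mean the share of months with a positive return.

```python
import statistics
from collections import defaultdict

def monthly_seasonality(returns):
    """Group monthly returns by calendar month and summarize each group.

    returns: iterable of (year, month, monthly_return). Hypothetical layout.
    """
    by_month = defaultdict(list)
    for _year, month, r in returns:
        by_month[month].append(r)
    return {
        month: {
            "mean": statistics.fmean(rs),
            "median": statistics.median(rs),
            "hit_rate": sum(r > 0 for r in rs) / len(rs),  # assumption: r > 0
            "n": len(rs),
        }
        for month, rs in sorted(by_month.items())
    }

# Made-up returns, not French library data.
sample = [
    (2001, 6, 0.02), (2002, 6, -0.01), (2003, 6, 0.03), (2004, 6, 0.04),
    (2001, 7, -0.02), (2002, 7, 0.01),
]
stats = monthly_seasonality(sample)
```

Reporting `n` alongside the hit rate makes the sample-size point above visible in the output: a calendar month with two observations should not be read the same way as one with several decades of history.
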
validate.py checks the computed series against two independent references
(report: data/derived/validation.md):
1. Published S&P 500 factor index monthly returns (Invesco dashboard
quilt, Bloomberg-sourced, May 2025–Apr 2026): per-factor correlation,
sign agreement, mean abs difference; per-month quilt rank correlation.
2. Factor ETFs (SPMO, RPV, SPHQ, RSP, SPLV, SPHD, SPHB vs SPY): daily
return correlation, absolute and benchmark-relative.
Exact agreement is not expected (S&P indices are ~100-name score-weighted
baskets; RPV is style-weighted "pure value"). What must hold: high
correlation, consistent sign, same ordering most months. If a change to
factors.py moves these checks materially, that's a regression until
explained.
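
The three per-factor monthly checks can be sketched in a few lines. The function names are hypothetical and the sign-agreement rule (both strictly positive, or both not) is an assumption. The input below is the rounded "Ours" and "Published index" columns from the monthly table in the validation report whose summary line reads 0.98 / 92% / 0.6pp; validate.py presumably works from unrounded returns, so small differences are possible.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def sign_agreement(xs, ys):
    """Share of months where both returns fall on the same side of zero."""
    return sum((x > 0) == (y > 0) for x, y in zip(xs, ys)) / len(xs)

def mean_abs_diff(xs, ys):
    """Mean absolute difference, in the same units as the inputs."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Monthly returns in percent, 2025-05 through 2026-04, as printed in the table.
ours = [1.3, -1.2, -0.5, 3.1, -0.6, -3.4, 3.2, -1.4, 3.7, 5.9, -5.8, 1.5]
published = [1.1, -0.8, -0.3, 1.6, 0.2, -3.7, 3.9, -2.2, 3.3, 5.4, -5.3, 2.0]

corr = pearson(ours, published)
agree = sign_agreement(ours, published)
mad_pp = mean_abs_diff(ours, published)
```

On these rounded inputs the three values agree with that table's summary line after rounding; the single sign disagreement is 2025-09.
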
Current status (2026-06): monthly corr vs published 0.90-1.00 across all factors, MAD 0.6-2.0pp; daily corr vs ETFs 0.87-1.00. Weakest link is quality's benchmark-relative correlation vs SPHQ (~0.5).
Tested and rejected (2026-06): S&P-style accruals in quality. Swapping
earnings variability for NI−OCF/assets accruals regressed every quality
check (monthly corr 0.91→0.76, relative corr vs SPHQ 0.51→0.02); a
4-component mix was also worse (0.79). Probable cause: NI−OCF accruals are
meaningless for financials (~15% of the universe) and FMP quarterly OCF is
noisy. The accruals metric is still computed in factors.py for future
experiments (e.g. excluding financials) but is not in the composite.
Methodology changes must beat the validation harness to ship.
factors.py also writes data/derived/breadth_daily.csv: each day, the
share of point-in-time index members trading above their own 50-day and
200-day moving averages (equal count, secondary share classes excluded,
names without enough history for the MA dropped from that day's
denominator). Used as a participation check on factor moves — a factor
rally with collapsing breadth is a different animal from a broad one.
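
A minimal sketch of the breadth calculation for one day and one window length. The function name is hypothetical, and it assumes a simple moving average that includes the current day's close; factors.py may define the average differently.

```python
def breadth(closes_by_ticker, window):
    """Share of names whose latest close is above their own `window`-day
    simple moving average. Names with fewer than `window` closes are
    dropped from the denominator. Returns None if no name qualifies.
    """
    above = 0
    eligible = 0
    for closes in closes_by_ticker.values():
        if len(closes) < window:
            continue  # not enough history for the MA
        eligible += 1
        moving_avg = sum(closes[-window:]) / window
        if closes[-1] > moving_avg:
            above += 1
    return above / eligible if eligible else None

# Toy data with a 3-day window; C has too little history and is dropped.
share = breadth(
    {"A": [1.0, 2.0, 3.0], "B": [3.0, 2.0, 1.0], "C": [5.0]},
    window=3,
)
```

Each eligible name counts once regardless of market cap, which is what "equal count" means here.
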
Validation guardrail: validate.py exits non-zero (failing the daily
run) if any factor's monthly correlation vs the published indices drops
below 0.75 or its daily ETF correlation below 0.80 — comfortably under
the current 0.90–1.00 / 0.87–1.00 levels, so a trip means a real
regression.
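
The guardrail logic amounts to two threshold checks and a non-zero exit. The sketch below uses hypothetical function names and input dictionaries; only the 0.75 and 0.80 thresholds come from the text above.

```python
import sys

MONTHLY_MIN = 0.75  # monthly corr vs published factor indices
DAILY_MIN = 0.80    # daily corr vs factor ETFs

def breaches(monthly_corr, daily_corr):
    """Return a sorted list of (factor, check, value) below its threshold."""
    out = []
    for factor, value in monthly_corr.items():
        if value < MONTHLY_MIN:
            out.append((factor, "monthly", value))
    for factor, value in daily_corr.items():
        if value < DAILY_MIN:
            out.append((factor, "daily", value))
    return sorted(out)

def enforce(monthly_corr, daily_corr):
    """Print each breach to stderr and exit with status 1 if any exist."""
    failed = breaches(monthly_corr, daily_corr)
    for factor, check, value in failed:
        print(f"FAIL {factor} {check} corr {value:.2f}", file=sys.stderr)
    if failed:
        sys.exit(1)

# Illustrative inputs: the first passes, the second trips the monthly check.
passing = breaches({"quality": 0.91}, {"quality": 0.875})
tripped = breaches({"quality": 0.70}, {"quality": 0.875})
```

Collecting every breach before exiting means a failed run reports all regressed factors at once instead of stopping at the first.
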
Definitions live in baskets/*.json, one file per basket: thesis,
members (each with an added date and a written rationale, plus a
removed date when dropped), and a changelog. Membership is curated by
the maintaining agent; **every add/drop is dated and justified in the
changelog — the audit trail is part of the product.**
Construction (baskets.py): equal-weighted across members active at
each month-end rebalance, buy-and-hold between rebalances — the same
discipline as the factor portfolios (no daily re-ranking, frozen prices
on delisting). Members without price data are excluded from the series
and surfaced in the dashboard payload for review. Series before a
basket's creation date are a **backtest of the membership as of
creation**; live tracking starts at creation. Benchmark-relative figures
compound the basket and the computed cap-weighted benchmark over the
same window.
Computed series: 2024-08-01 to 2026-06-11. References: S&P 500 factor index monthly returns (Invesco dashboard quilt, as of 2026-04-30) and factor ETF total returns (FMP, dividend-adjusted).
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +5.0% | +11.4% | +11.4% |
| 2025-06 | +2.2% | +7.0% | +6.9% |
| 2025-07 | +0.0% | +2.9% | +2.9% |
| 2025-08 | +0.4% | +0.7% | +0.6% |
| 2025-09 | +6.2% | +4.1% | +4.2% |
| 2025-10 | +1.0% | +0.5% | +0.6% |
| 2025-11 | -1.5% | -1.3% | -1.3% |
| 2025-12 | +0.2% | -0.4% | -0.4% |
| 2026-01 | +4.8% | +0.5% | +0.4% |
| 2026-02 | +1.6% | -0.3% | -0.3% |
| 2026-03 | -6.0% | -5.9% | -5.8% |
| 2026-04 | +18.7% | +19.3% | +19.3% |
Monthly corr vs published: 0.90 | sign agreement: 83% | mean abs diff: 2.0pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +4.2% | +2.4% | +2.6% |
| 2025-06 | +5.3% | +4.1% | +4.0% |
| 2025-07 | -1.3% | -1.8% | -1.8% |
| 2025-08 | +7.6% | +6.6% | +6.4% |
| 2025-09 | +1.9% | +2.1% | +2.1% |
| 2025-10 | -1.5% | -0.3% | -0.3% |
| 2025-11 | +3.7% | +3.5% | +3.6% |
| 2025-12 | +1.8% | +1.3% | +1.3% |
| 2026-01 | +1.8% | +3.8% | +3.9% |
| 2026-02 | +3.6% | +4.7% | +4.7% |
| 2026-03 | -3.0% | -3.8% | -3.9% |
| 2026-04 | +6.2% | +3.6% | +3.7% |
Monthly corr vs published: 0.91 | sign agreement: 100% | mean abs diff: 1.1pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +5.2% | +6.3% | +6.2% |
| 2025-06 | +3.7% | +1.6% | +1.7% |
| 2025-07 | -0.1% | +0.2% | +0.1% |
| 2025-08 | +2.5% | +1.4% | +1.3% |
| 2025-09 | +2.1% | +1.5% | +1.6% |
| 2025-10 | +1.0% | +1.0% | +1.1% |
| 2025-11 | +1.9% | +0.9% | +0.9% |
| 2025-12 | -0.1% | +0.7% | +0.7% |
| 2026-01 | +1.3% | +3.1% | +3.1% |
| 2026-02 | +1.1% | +4.6% | +4.7% |
| 2026-03 | -6.5% | -6.7% | -6.8% |
| 2026-04 | +9.2% | +7.8% | +7.8% |
Monthly corr vs published: 0.91 | sign agreement: 83% | mean abs diff: 1.2pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +4.1% | +4.3% | +4.3% |
| 2025-06 | +1.7% | +3.4% | +3.4% |
| 2025-07 | +0.9% | +1.0% | +1.0% |
| 2025-08 | +4.2% | +2.7% | +2.7% |
| 2025-09 | -1.1% | +1.0% | +1.1% |
| 2025-10 | -2.5% | -0.9% | -0.9% |
| 2025-11 | +3.5% | +1.9% | +1.9% |
| 2025-12 | +0.3% | +0.4% | +0.4% |
| 2026-01 | +3.6% | +3.4% | +3.4% |
| 2026-02 | +3.1% | +3.5% | +3.5% |
| 2026-03 | -6.9% | -6.0% | -6.0% |
| 2026-04 | +4.0% | +6.0% | +6.0% |
Monthly corr vs published: 0.93 | sign agreement: 92% | mean abs diff: 1.0pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +1.3% | +1.0% | +1.1% |
| 2025-06 | -1.2% | -0.7% | -0.8% |
| 2025-07 | -0.5% | -0.3% | -0.3% |
| 2025-08 | +3.1% | +1.6% | +1.6% |
| 2025-09 | -0.6% | +0.2% | +0.2% |
| 2025-10 | -3.4% | -3.7% | -3.7% |
| 2025-11 | +3.2% | +3.8% | +3.9% |
| 2025-12 | -1.4% | -2.2% | -2.2% |
| 2026-01 | +3.7% | +3.3% | +3.3% |
| 2026-02 | +5.9% | +5.3% | +5.4% |
| 2026-03 | -5.8% | -5.3% | -5.3% |
| 2026-04 | +1.5% | +2.0% | +2.0% |
Monthly corr vs published: 0.98 | sign agreement: 92% | mean abs diff: 0.6pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +1.2% | +0.4% | +0.4% |
| 2025-06 | +1.8% | +0.4% | +0.4% |
| 2025-07 | +0.6% | +0.6% | +0.6% |
| 2025-08 | +6.0% | +4.1% | +4.2% |
| 2025-09 | -0.9% | +0.3% | +0.4% |
| 2025-10 | -1.9% | -3.9% | -4.0% |
| 2025-11 | +3.8% | +3.2% | +3.2% |
| 2025-12 | +0.5% | -0.9% | -0.9% |
| 2026-01 | +6.9% | +5.2% | +5.2% |
| 2026-02 | +5.2% | +4.7% | +4.7% |
| 2026-03 | -3.1% | -5.0% | -5.0% |
| 2026-04 | +3.5% | +2.0% | +2.1% |
Monthly corr vs published: 0.95 | sign agreement: 83% | mean abs diff: 1.2pp
| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +6.1% | +6.3% | +6.3% |
| 2025-06 | +5.0% | +5.1% | +5.1% |
| 2025-07 | +2.3% | +2.3% | +2.2% |
| 2025-08 | +2.0% | +2.1% | +2.0% |
| 2025-09 | +3.7% | +3.6% | +3.6% |
| 2025-10 | +2.2% | +2.4% | +2.3% |
| 2025-11 | +0.3% | +0.2% | +0.2% |
| 2025-12 | +0.0% | +0.1% | +0.1% |
| 2026-01 | +1.4% | +1.5% | +1.5% |
| 2026-02 | -0.9% | -0.9% | -0.8% |
| 2026-03 | -5.0% | -4.9% | -5.0% |
| 2026-04 | +10.5% | +10.5% | +10.5% |
Monthly corr vs published: 1.00 | sign agreement: 100% | mean abs diff: 0.1pp
| Factor | ETF | Corr (daily) | Corr (relative-to-SPY daily) |
|---|---|---|---|
| momentum | SPMO | 0.944 | 0.824 |
| value | RPV | 0.882 | 0.880 |
| quality | SPHQ | 0.875 | 0.518 |
| size | RSP | 0.894 | 0.909 |
| lowvol | SPLV | 0.937 | 0.970 |
| divyield | SPHD | 0.921 | 0.953 |
| highbeta | SPHB | 0.943 | 0.836 |
| bench | SPY | 0.998 | nan |
| Month | Rank corr |
|---|---|
| 2025-05 | 0.86 |
| 2025-06 | 0.61 |
| 2025-07 | 0.79 |
| 2025-08 | 0.89 |
| 2025-09 | 0.82 |
| 2025-10 | 0.86 |
| 2025-11 | 0.75 |
| 2025-12 | 0.39 |
| 2026-01 | 0.29 |
| 2026-02 | 0.78 |
| 2026-03 | 0.95 |
| 2026-04 | 0.96 |