Factor Watch — Methodology

How every number is computed. This file is the spec; factors.py and analytics.py implement it. If the code changes, change this file in the same commit. Transparency is a product feature: Barra and the S&P factor indices are black boxes; this is not.

1. Universe

S&P 500 constituents, point-in-time. Membership at any past date is reconstructed from FMP's historical add/remove event feed by walking events backward from the current 503-name list. A stock acquired in January 2026 is therefore in the universe for rebalances before its removal date and out afterward — no survivorship bias from using today's list, to the extent the event feed is accurate.
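The backward walk can be sketched as follows. This is a minimal illustration, not the code in factors.py: the `(date, action, ticker)` tuple shape, the function name `members_at`, and the choice that an event dated exactly on the as-of date is already in effect are all assumptions, and the tickers are placeholders.

```python
from datetime import date

def members_at(current, events, as_of):
    """Reconstruct membership at `as_of` by undoing later events.

    current: iterable of today's tickers.
    events:  iterable of (event_date, action, ticker), action is "add" or "remove"
             (hypothetical shape, not the FMP schema).
    """
    members = set(current)
    # Newest first: undo every event that happened after `as_of`.
    for event_date, action, ticker in sorted(events, reverse=True):
        if event_date <= as_of:
            break  # this event and all older ones are already reflected
        if action == "add":
            members.discard(ticker)  # added later, so not yet a member
        else:
            members.add(ticker)      # removed later, so still a member
    return members

# Placeholder data: CCC was removed in January 2026, BBB was added in June 2025.
example_events = [
    (date(2026, 1, 15), "remove", "CCC"),
    (date(2025, 6, 1), "add", "BBB"),
]
example_current = {"AAA", "BBB"}
```

Calling `members_at(example_current, example_events, date(2025, 12, 31))` puts CCC back into the universe, which is the acquired-stock case described above.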

Secondary share classes (GOOG, FOX, NWS) are dropped: FMP reports the full company market cap on both lines, so keeping both double-counts the company (Alphabet was 2x weighted before this fix). The primary line (GOOGL, FOXA, NWSA) carries the company.

Known limits:

2. Data

All from FMP (Premium tier), pulled by ingest.py:

| Dataset | Endpoint | Notes |
|---|---|---|
| Prices | historical-price-eod/dividend-adjusted | split and dividend adjusted close → returns are total returns. The default full endpoint is unadjusted; do not use it for returns. |
| Income statement | income-statement (quarterly, 20q) | includes filingDate (basis for point-in-time visibility) |
| Balance sheet | balance-sheet-statement (quarterly, 20q) | includes filingDate |
| Cash flow | cash-flow-statement (quarterly, 20q) | operating cash flow for the accruals component |
| Market cap, ROE | key-metrics (quarterly, 20q) | no filing date; joined to income-statement quarters |
| Dividends | dividends | per-share cash dividends by ex-date |
| Membership events | historical-sp500-constituent | see §1 |

Point-in-time rule: at a rebalance date t, a quarterly report is visible only if its filingDate ≤ t. No look-ahead into not-yet-filed quarters. TTM aggregates use the last 4 visible quarters.
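A small sketch of the visibility filter and the TTM sum, assuming a report is visible when its filing date is on or before t. Only `filingDate` comes from the data description above; the `period_end` and `revenue` keys, the function name, and returning `None` when fewer than four quarters are visible are illustrative assumptions.

```python
from datetime import date

def ttm(reports, t, field):
    """Sum `field` over the last 4 quarters that were already filed at t."""
    visible = [r for r in reports if r["filingDate"] <= t]
    visible.sort(key=lambda r: r["period_end"])
    last4 = visible[-4:]
    if len(last4) < 4:
        return None  # assumption: no TTM figure without 4 visible quarters
    return sum(r[field] for r in last4)

# Placeholder quarterly reports (values are made up).
example_reports = [
    {"period_end": date(2024, 3, 31), "filingDate": date(2024, 5, 1), "revenue": 10},
    {"period_end": date(2024, 6, 30), "filingDate": date(2024, 8, 1), "revenue": 11},
    {"period_end": date(2024, 9, 30), "filingDate": date(2024, 11, 1), "revenue": 12},
    {"period_end": date(2024, 12, 31), "filingDate": date(2025, 2, 15), "revenue": 13},
    {"period_end": date(2025, 3, 31), "filingDate": date(2025, 5, 1), "revenue": 14},
]
```

At a 2025-04-30 rebalance the Q1 2025 report has not been filed yet, so the TTM sum still uses the four 2024 quarters.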

Market cap at t: last reported quarter's market cap, drifted to t by the adjusted-price ratio. (Drifting with total-return prices slightly overstates cap for high-yield names between reports; immaterial for quintile ranking.)
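As a one-line illustration of the drift, assuming the adjusted close on the report date and on t are both available (the function and argument names are hypothetical):

```python
def drifted_cap(reported_cap, adj_price_at_report, adj_price_at_t):
    """Scale the last reported market cap by the adjusted-price ratio."""
    return reported_cap * adj_price_at_t / adj_price_at_report
```

A cap of 100bn reported when the adjusted price was 50 becomes 110bn once the adjusted price reaches 55.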

3. Factor definitions

Scores are computed cross-sectionally over the point-in-time universe at each rebalance. Raw metrics are winsorized at 2.5%/97.5%, then z-scored. Composites average the available component z-scores (a name missing one component is scored on the rest; missing all → excluded from that factor's portfolios).
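The scoring pipeline in the paragraph above can be sketched with the standard library. The percentile interpolation (`method="inclusive"`), the use of the population standard deviation, and all function names are assumptions; the spec only fixes the 2.5%/97.5% bounds and the "average what is available" rule.

```python
import statistics

def winsorize(values):
    """Clamp values to the 2.5th / 97.5th percentiles of the cross-section."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")  # 2.5% steps
    lo, hi = cuts[0], cuts[-1]
    return [min(max(v, lo), hi) for v in values]

def zscores(raw):
    """raw: {ticker: metric or None}. Names with a missing metric are skipped."""
    present = {tk: v for tk, v in raw.items() if v is not None}
    clipped = winsorize(list(present.values()))
    mean = statistics.fmean(clipped)
    sd = statistics.pstdev(clipped)  # assumption: population std
    return {tk: (c - mean) / sd for tk, c in zip(present, clipped)}

def composite(component_scores):
    """Average whichever component z-scores each name has.

    A name absent from every component never appears in the output,
    which matches "missing all -> excluded".
    """
    tickers = set().union(*component_scores)
    out = {}
    for tk in tickers:
        zs = [comp[tk] for comp in component_scores if tk in comp]
        out[tk] = sum(zs) / len(zs)
    return out
```

For example, `composite([{"A": 1.0, "B": 0.0}, {"A": 3.0}])` scores A on both components and B on the single component it has.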

| Factor | Definition (higher score = stronger membership) |
|---|---|
| Momentum | 12-1 month total return: P(t-21d) / P(t-252d) - 1 (skips the most recent month, standard reversal exclusion) |
| Value | mean z of: E/P (TTM net income / mktcap), B/P (common equity / mktcap, equity>0 only), S/P (TTM revenue / mktcap) |
| Quality | mean z of: ROE (TTM net income / avg of current & year-ago equity), −D/E (total debt / equity), −earnings variability (std of YoY quarterly diluted-EPS growth, last 12 obs, min 8) |
| Size | −log(market cap), i.e. "small within the S&P 500" |
| Low volatility | −std of daily total returns, trailing 252d (min 200 obs) |
| High beta | OLS beta vs SPY, trailing 252d daily returns |
| Dividend yield | TTM dividends per share (by ex-date) / price |
| EPS revisions | not yet computed; requires ≥ ~1 month of daily consensus snapshots from collect_revisions.py; see §8 |
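Two of the price-based definitions are simple enough to sketch directly. The list-of-closes input, the function names, and returning `None` on short history are assumptions; the index arithmetic follows the P(t-21d) / P(t-252d) - 1 formula, with the last element taken as the rebalance date t.

```python
def momentum_12_1(adj_closes):
    """12-1 momentum from adjusted closes in date order; last element is t."""
    if len(adj_closes) < 253:
        return None  # assumption: no score without a full lookback
    # adj_closes[-1] is t, so t-21d is index -22 and t-252d is index -253.
    return adj_closes[-22] / adj_closes[-253] - 1.0

def ols_beta(stock_returns, market_returns):
    """Slope of an OLS regression of stock returns on market returns."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var
```

With a one-regressor OLS the slope reduces to covariance over variance, which is why no regression library is needed here.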

4. Portfolio construction

At each month-end rebalance (last trading day):

1. Rank scored names; split into quintiles (~100 names each).
2. Long-only factor series (<factor>_long): top quintile, cap-weighted with a 5% single-name cap (excess redistributed pro-rata, S&P-style). Without the cap, the quality and momentum longs degenerate into a mega-tech beta bet (6 names ≈ 60% of weight) and stop measuring the factor; this showed up directly in validation. This is the comparable to the S&P factor indices / factor ETFs and feeds the quilt.
3. Spread series (<factor>_spread): top quintile minus bottom quintile, both equal-weighted (academic convention; cap-weighted spreads are dominated by megacaps).
4. Benchmark (bench): full universe, cap-weighted.
5. Sectors (sector_*): cap-weighted within each GICS sector (current-constituent mapping; sector history not reconstructed).
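The single-name cap with pro-rata redistribution can be sketched as an iterative clamp. Whether factors.py iterates this way is an assumption; the loop is needed because redistributing excess can push a previously uncapped name over the limit. Function and argument names are hypothetical.

```python
def cap_weights(market_caps, max_weight=0.05):
    """Cap-weight names, clamp each at max_weight, give the excess to the rest pro-rata."""
    if len(market_caps) * max_weight < 1.0 - 1e-12:
        raise ValueError("too few names to satisfy the cap")
    total = sum(market_caps.values())
    weights = {tk: cap / total for tk, cap in market_caps.items()}
    capped = set()
    while True:
        over = {tk for tk, w in weights.items()
                if tk not in capped and w > max_weight + 1e-12}
        if not over:
            return weights
        capped |= over
        free = [tk for tk in weights if tk not in capped]
        # Capped names sit at the limit; the rest share what is left by market cap.
        remaining = 1.0 - max_weight * len(capped)
        free_total = sum(market_caps[tk] for tk in free)
        for tk in capped:
            weights[tk] = max_weight
        for tk in free:
            weights[tk] = remaining * market_caps[tk] / free_total
```

With three names and a 40% cap (chosen only to keep the example small), caps of 80 / 15 / 5 first clamp the largest name, which lifts the second name above the cap and triggers a second pass.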

Between rebalances portfolios are buy-and-hold: weights drift with prices; no daily re-ranking (daily re-ranking is the classic way to fake a factor series and leak turnover-free alpha). A name whose prices stop mid-month (delisting/acquisition) is frozen at its last price — economically a cash-out reinvested pro-rata at next rebalance.
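Buy-and-hold drift between rebalances can be sketched by tracking each position's value rather than its weight. The input shapes and names are assumptions; a ticker with no return on a given day is treated as frozen at its last price, as described above.

```python
def buy_and_hold_returns(start_weights, daily_returns):
    """Daily portfolio returns with weights drifting from the rebalance weights.

    start_weights: {ticker: weight} set at the rebalance.
    daily_returns: list of {ticker: return}, one dict per trading day.
    """
    values = dict(start_weights)
    out = []
    for day in daily_returns:
        before = sum(values.values())
        for tk in values:
            values[tk] *= 1.0 + day.get(tk, 0.0)  # missing price: frozen
        out.append(sum(values.values()) / before - 1.0)
    return out
```

Starting from 50/50, a 10% gain in one name produces a 5% portfolio return on day one; the same 10% gain in the other name on day two produces slightly less than 5%, because its weight has drifted below half.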

Daily series start 2024-06-30 (needs 12m momentum lookback; price history pulled from 2023-01-01).

5. Spread monitor (z-scores)

For each factor, two relative series:

For horizons 1d / 5d / 20d: current h-day compounded return compared to the trailing 252 overlapping h-day returns of the same series (current excluded). Report z-score and percentile (returns are fat-tailed; the percentile keeps the z honest). |z| ≥ 2 is the headline flag.

Caveat by design: with overlapping windows the baseline observations are autocorrelated — fine for "is this unusual?", not a t-test.
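A sketch of the horizon z-score and percentile, under two assumptions the text does not pin down: the baseline uses the sample standard deviation, and the percentile is the share of baseline windows strictly below the current one. Function names are hypothetical.

```python
import statistics

def h_day_returns(daily, h):
    """All overlapping h-day compounded returns, oldest first."""
    out = []
    for end in range(h, len(daily) + 1):
        growth = 1.0
        for r in daily[end - h:end]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out

def monitor(daily, h, lookback=252):
    """Return (z, percentile) of the latest h-day return vs the trailing baseline."""
    windows = h_day_returns(daily, h)
    current = windows[-1]
    baseline = windows[-1 - lookback:-1]  # current window excluded
    mean = statistics.fmean(baseline)
    sd = statistics.stdev(baseline)
    z = (current - mean) / sd
    pct = 100.0 * sum(b < current for b in baseline) / len(baseline)
    return z, pct
```

Because consecutive windows share h-1 days, the baseline values are not independent, which is the caveat stated above.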

6. Rotation detector

On trailing-20d returns of the long-only series:

7. Performance quilt

Calendar-month total returns of the long-only factor series + benchmark, ranked best→worst per month, trailing 13 months. Monthly granularity only — a daily quilt reshuffles too much to read; daily action lives in the spread monitor.
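The quilt ordering itself is a per-month sort. A minimal sketch, assuming months are keyed as "YYYY-MM" strings (so they sort chronologically); the function name is hypothetical and the sample row uses the "Ours" column for 2026-04 from the validation report.

```python
def quilt(monthly_returns, months=13):
    """monthly_returns: {"YYYY-MM": {series: return}} -> {month: [series best to worst]}."""
    recent = sorted(monthly_returns)[-months:]
    return {
        m: sorted(monthly_returns[m], key=monthly_returns[m].get, reverse=True)
        for m in recent
    }

# "Ours" column, 2026-04, in percent.
example_month = {
    "2026-04": {
        "momentum": 18.7, "value": 6.2, "quality": 9.2, "size": 4.0,
        "lowvol": 1.5, "divyield": 3.5, "bench": 10.5,
    }
}
```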

8. EPS revisions (V2, collection running now)

collect_revisions.py snapshots consensus EPS/revenue (avg/high/low, analyst counts) for all constituents daily. **This series cannot be backfilled — non-Enterprise vendors don't sell daily revision history — which is exactly why it compounds into a moat.** Snapshots are committed to git because they are irreplaceable. Planned outputs once ≥1 month accumulates: net up/down revision breadth across the index; revision leaders-vs-laggards spread (the Counterpoint-style edge).

9. Factor seasonality

Our own series is too short for seasonality — 2 years gives n=2 per calendar month, which is astrology. So seasonality.py uses the Ken French library monthly factor returns (momentum to 1927, HML/SMB/RMW to 1963), free: mean/median return and hit rate per calendar month, full history and trailing 30y. The total US market (Mkt-RF + RF; CRSP all-US, labeled "US market", not strictly the S&P 500) is included as a reference row. The dashboard renders the full factor × month grid as a mosaic; the digest uses the current month's baselines. The snapshot carries the current month's baseline next to the live monitor ("June is historically momentum's strongest month, 70% hit rate — and it's currently -2.4σ"). Definition mismatch (French universe ≠ S&P 500 quintiles) is acceptable for a seasonal-baseline view and disclosed in the payload. Mapping: momentum→UMD, value→HML, size→SMB, quality→RMW(profitability); low vol / dividend yield / high beta have no French analogue and are omitted.
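The per-calendar-month baseline can be sketched as a simple group-by. The input shape and function name are assumptions, and "hit rate" is taken here to mean the share of years with a positive return in that month, which the text does not define explicitly.

```python
import statistics
from collections import defaultdict

def seasonality(monthly):
    """monthly: iterable of ((year, month), return) -> stats per calendar month."""
    by_month = defaultdict(list)
    for (_year, month), r in monthly:
        by_month[month].append(r)
    return {
        month: {
            "mean": statistics.fmean(rs),
            "median": statistics.median(rs),
            "hit_rate": sum(r > 0 for r in rs) / len(rs),  # assumption: share positive
            "n": len(rs),
        }
        for month, rs in by_month.items()
    }
```

The `n` field is what makes the short-history problem visible: two years of live data yields n=2 per month, while a long factor library yields many decades.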

10. Validation

validate.py checks the computed series against two independent references (report: data/derived/validation.md):

1. Published S&P 500 factor index monthly returns (Invesco dashboard quilt, Bloomberg-sourced, May 2025–Apr 2026): per-factor correlation, sign agreement, mean abs difference; per-month quilt rank correlation.
2. Factor ETFs (SPMO, RPV, SPHQ, RSP, SPLV, SPHD, SPHB vs SPY): daily return correlation, absolute and benchmark-relative.

Exact agreement is not expected (S&P indices are ~100-name score-weighted baskets; RPV is style-weighted "pure value"). What must hold: high correlation, consistent sign, same ordering most months. If a change to factors.py moves these checks materially, that's a regression until explained.

Current status (2026-06): monthly corr vs published 0.90-1.00 across all factors, MAD 0.6-2.0pp; daily corr vs ETFs 0.87-1.00. Weakest link is quality's benchmark-relative correlation vs SPHQ (~0.5).

Tested and rejected (2026-06): S&P-style accruals in quality. Swapping earnings variability for (NI − OCF)/assets accruals regressed every quality check (monthly corr 0.91→0.76, relative corr vs SPHQ 0.51→0.02); a 4-component mix was also worse (0.79). Probable cause: NI − OCF accruals are meaningless for financials (~15% of the universe) and FMP quarterly OCF is noisy. The accruals metric is still computed in factors.py for future experiments (e.g. excluding financials) but is not in the composite. Methodology changes must beat the validation harness to ship.

11. Breadth

factors.py also writes data/derived/breadth_daily.csv: each day, the share of point-in-time index members trading above their own 50-day and 200-day moving averages (equal count, secondary share classes excluded, names without enough history for the MA dropped from that day's denominator). Used as a participation check on factor moves — a factor rally with collapsing breadth is a different animal from a broad one.
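One day's breadth reading can be sketched as below. The input shape (closes per ticker up to and including the day, already restricted to point-in-time members) and the function name are assumptions; names with too little history are dropped from the denominator as described.

```python
def breadth(closes_by_ticker, window):
    """Share of names whose latest close is above their `window`-day moving average."""
    eligible = 0
    above = 0
    for closes in closes_by_ticker.values():
        if len(closes) < window:
            continue  # not enough history: excluded from the denominator
        moving_avg = sum(closes[-window:]) / window
        eligible += 1
        above += closes[-1] > moving_avg
    return above / eligible if eligible else None
```

Calling it once with `window=50` and once with `window=200` gives the two columns described above.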

Validation guardrail: validate.py exits non-zero (failing the daily run) if any factor's monthly correlation vs the published indices drops below 0.75 or its daily ETF correlation below 0.80 — comfortably under the current 0.90–1.00 / 0.87–1.00 levels, so a trip means a real regression.
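The guardrail logic amounts to two threshold checks and an exit code. A sketch, with hypothetical function names; only the 0.75 and 0.80 floors come from the text above, and the sample correlations are the ones in the validation report.

```python
import sys

MONTHLY_FLOOR = 0.75  # monthly correlation vs published indices
DAILY_FLOOR = 0.80    # daily correlation vs factor ETFs

def find_breaches(monthly_corr, daily_corr):
    """Return a human-readable line for every correlation under its floor."""
    breaches = []
    for name, corr in monthly_corr.items():
        if corr < MONTHLY_FLOOR:
            breaches.append(f"{name}: monthly corr {corr:.2f} < {MONTHLY_FLOOR}")
    for name, corr in daily_corr.items():
        if corr < DAILY_FLOOR:
            breaches.append(f"{name}: daily corr {corr:.3f} < {DAILY_FLOOR}")
    return breaches

def guardrail_exit_code(monthly_corr, daily_corr):
    """0 if everything passes, 1 otherwise; breaches go to stderr."""
    breaches = find_breaches(monthly_corr, daily_corr)
    for line in breaches:
        print(line, file=sys.stderr)
    return 1 if breaches else 0

report_monthly = {"momentum": 0.90, "value": 0.91, "quality": 0.91, "size": 0.93,
                  "lowvol": 0.98, "divyield": 0.95, "bench": 1.00}
report_daily = {"momentum": 0.944, "value": 0.882, "quality": 0.875, "size": 0.894,
                "lowvol": 0.937, "divyield": 0.921, "highbeta": 0.943, "bench": 0.998}
```

A script would finish with `sys.exit(guardrail_exit_code(report_monthly, report_daily))`. Note that a NaN correlation compares as not-below the floor in this sketch, so a real check would probably want to treat NaN as a failure.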

12. Thematic baskets

Definitions live in baskets/*.json, one file per basket: thesis, members (each with an added date and a written rationale, plus a removed date when dropped), and a changelog. Membership is curated by the maintaining agent; **every add/drop is dated and justified in the changelog — the audit trail is part of the product.**

Construction (baskets.py): equal-weighted across members active at each month-end rebalance, buy-and-hold between rebalances — the same discipline as the factor portfolios (no daily re-ranking, frozen prices on delisting). Members without price data are excluded from the series and surfaced in the dashboard payload for review. Series before a basket's creation date are a **backtest of the membership as of creation**; live tracking starts at creation. Benchmark-relative figures compound the basket and the computed cap-weighted benchmark over the same window.
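Selecting the active members at a rebalance and equal-weighting them can be sketched as follows. The key names (`ticker`, `added`, `removed`), the treatment of a member as inactive from its removed date onward, and the function names are assumptions, not the actual baskets/*.json schema.

```python
from datetime import date

def active_members(members, rebalance, priced):
    """Tickers that are in the basket at `rebalance` and have price data."""
    active = []
    for m in members:
        removed = m.get("removed")
        in_basket = m["added"] <= rebalance and (removed is None or rebalance < removed)
        if in_basket and m["ticker"] in priced:
            active.append(m["ticker"])
    return active

def equal_weights(tickers):
    """Same weight for every active member."""
    return {tk: 1.0 / len(tickers) for tk in tickers} if tickers else {}

# Placeholder basket: DDD has no price data and should be surfaced for review.
example_members = [
    {"ticker": "AAA", "added": date(2025, 1, 10)},
    {"ticker": "BBB", "added": date(2025, 1, 10), "removed": date(2025, 6, 15)},
    {"ticker": "CCC", "added": date(2025, 8, 1)},
    {"ticker": "DDD", "added": date(2025, 1, 10)},
]
example_priced = {"AAA", "BBB", "CCC"}
```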

Differences vs the big-shop dashboards (deliberate)


Factor Watch validation report

Computed series: 2024-08-01 to 2026-06-11. References: S&P 500 factor index monthly returns (Invesco dashboard quilt, as of 2026-04-30) and factor ETF total returns (FMP, dividend-adjusted).

momentum (ETF ref: SPMO)

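| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +5.0% | +11.4% | +11.4% |
| 2025-06 | +2.2% | +7.0% | +6.9% |
| 2025-07 | +0.0% | +2.9% | +2.9% |
| 2025-08 | +0.4% | +0.7% | +0.6% |
| 2025-09 | +6.2% | +4.1% | +4.2% |
| 2025-10 | +1.0% | +0.5% | +0.6% |
| 2025-11 | -1.5% | -1.3% | -1.3% |
| 2025-12 | +0.2% | -0.4% | -0.4% |
| 2026-01 | +4.8% | +0.5% | +0.4% |
| 2026-02 | +1.6% | -0.3% | -0.3% |
| 2026-03 | -6.0% | -5.9% | -5.8% |
| 2026-04 | +18.7% | +19.3% | +19.3% |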

Monthly corr vs published: 0.90 | sign agreement: 83% | mean abs diff: 2.0pp

value (ETF ref: RPV)

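| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +4.2% | +2.4% | +2.6% |
| 2025-06 | +5.3% | +4.1% | +4.0% |
| 2025-07 | -1.3% | -1.8% | -1.8% |
| 2025-08 | +7.6% | +6.6% | +6.4% |
| 2025-09 | +1.9% | +2.1% | +2.1% |
| 2025-10 | -1.5% | -0.3% | -0.3% |
| 2025-11 | +3.7% | +3.5% | +3.6% |
| 2025-12 | +1.8% | +1.3% | +1.3% |
| 2026-01 | +1.8% | +3.8% | +3.9% |
| 2026-02 | +3.6% | +4.7% | +4.7% |
| 2026-03 | -3.0% | -3.8% | -3.9% |
| 2026-04 | +6.2% | +3.6% | +3.7% |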

Monthly corr vs published: 0.91 | sign agreement: 100% | mean abs diff: 1.1pp

quality (ETF ref: SPHQ)

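| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +5.2% | +6.3% | +6.2% |
| 2025-06 | +3.7% | +1.6% | +1.7% |
| 2025-07 | -0.1% | +0.2% | +0.1% |
| 2025-08 | +2.5% | +1.4% | +1.3% |
| 2025-09 | +2.1% | +1.5% | +1.6% |
| 2025-10 | +1.0% | +1.0% | +1.1% |
| 2025-11 | +1.9% | +0.9% | +0.9% |
| 2025-12 | -0.1% | +0.7% | +0.7% |
| 2026-01 | +1.3% | +3.1% | +3.1% |
| 2026-02 | +1.1% | +4.6% | +4.7% |
| 2026-03 | -6.5% | -6.7% | -6.8% |
| 2026-04 | +9.2% | +7.8% | +7.8% |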

Monthly corr vs published: 0.91 | sign agreement: 83% | mean abs diff: 1.2pp

size (ETF ref: RSP)

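| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +4.1% | +4.3% | +4.3% |
| 2025-06 | +1.7% | +3.4% | +3.4% |
| 2025-07 | +0.9% | +1.0% | +1.0% |
| 2025-08 | +4.2% | +2.7% | +2.7% |
| 2025-09 | -1.1% | +1.0% | +1.1% |
| 2025-10 | -2.5% | -0.9% | -0.9% |
| 2025-11 | +3.5% | +1.9% | +1.9% |
| 2025-12 | +0.3% | +0.4% | +0.4% |
| 2026-01 | +3.6% | +3.4% | +3.4% |
| 2026-02 | +3.1% | +3.5% | +3.5% |
| 2026-03 | -6.9% | -6.0% | -6.0% |
| 2026-04 | +4.0% | +6.0% | +6.0% |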

Monthly corr vs published: 0.93 | sign agreement: 92% | mean abs diff: 1.0pp

lowvol (ETF ref: SPLV)

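| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +1.3% | +1.0% | +1.1% |
| 2025-06 | -1.2% | -0.7% | -0.8% |
| 2025-07 | -0.5% | -0.3% | -0.3% |
| 2025-08 | +3.1% | +1.6% | +1.6% |
| 2025-09 | -0.6% | +0.2% | +0.2% |
| 2025-10 | -3.4% | -3.7% | -3.7% |
| 2025-11 | +3.2% | +3.8% | +3.9% |
| 2025-12 | -1.4% | -2.2% | -2.2% |
| 2026-01 | +3.7% | +3.3% | +3.3% |
| 2026-02 | +5.9% | +5.3% | +5.4% |
| 2026-03 | -5.8% | -5.3% | -5.3% |
| 2026-04 | +1.5% | +2.0% | +2.0% |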

Monthly corr vs published: 0.98 | sign agreement: 92% | mean abs diff: 0.6pp

divyield (ETF ref: SPHD)

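| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +1.2% | +0.4% | +0.4% |
| 2025-06 | +1.8% | +0.4% | +0.4% |
| 2025-07 | +0.6% | +0.6% | +0.6% |
| 2025-08 | +6.0% | +4.1% | +4.2% |
| 2025-09 | -0.9% | +0.3% | +0.4% |
| 2025-10 | -1.9% | -3.9% | -4.0% |
| 2025-11 | +3.8% | +3.2% | +3.2% |
| 2025-12 | +0.5% | -0.9% | -0.9% |
| 2026-01 | +6.9% | +5.2% | +5.2% |
| 2026-02 | +5.2% | +4.7% | +4.7% |
| 2026-03 | -3.1% | -5.0% | -5.0% |
| 2026-04 | +3.5% | +2.0% | +2.1% |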

Monthly corr vs published: 0.95 | sign agreement: 83% | mean abs diff: 1.2pp

bench (ETF ref: SPY)

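| Month | Ours | ETF | Published index |
|---|---|---|---|
| 2025-05 | +6.1% | +6.3% | +6.3% |
| 2025-06 | +5.0% | +5.1% | +5.1% |
| 2025-07 | +2.3% | +2.3% | +2.2% |
| 2025-08 | +2.0% | +2.1% | +2.0% |
| 2025-09 | +3.7% | +3.6% | +3.6% |
| 2025-10 | +2.2% | +2.4% | +2.3% |
| 2025-11 | +0.3% | +0.2% | +0.2% |
| 2025-12 | +0.0% | +0.1% | +0.1% |
| 2026-01 | +1.4% | +1.5% | +1.5% |
| 2026-02 | -0.9% | -0.9% | -0.8% |
| 2026-03 | -5.0% | -4.9% | -5.0% |
| 2026-04 | +10.5% | +10.5% | +10.5% |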

Monthly corr vs published: 1.00 | sign agreement: 100% | mean abs diff: 0.1pp
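The three per-factor summary statistics can be reproduced approximately from the rounded table values. The sketch below uses the momentum rows ("Ours" vs "Published index"); the function names are hypothetical, and the sign rule (a value ≥ 0 counts as non-negative) is an assumption that matters for rows rounded to +0.0%. validate.py works on unrounded returns, so small differences are expected.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def sign_agreement(xs, ys):
    """Share of months where both returns fall on the same side of zero."""
    return sum((x >= 0) == (y >= 0) for x, y in zip(xs, ys)) / len(xs)

def mean_abs_diff(xs, ys):
    """Mean absolute difference, in the same units as the inputs."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Momentum, 2025-05 through 2026-04, in percent (rounded table values).
ours = [5.0, 2.2, 0.0, 0.4, 6.2, 1.0, -1.5, 0.2, 4.8, 1.6, -6.0, 18.7]
published = [11.4, 6.9, 2.9, 0.6, 4.2, 0.6, -1.3, -0.4, 0.4, -0.3, -5.8, 19.3]

corr = pearson(ours, published)
agree = sign_agreement(ours, published)
mad = mean_abs_diff(ours, published)
```

On these rounded inputs the three values should land close to the reported 0.90, 83%, and 2.0pp for momentum.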

Daily return correlation vs ETF analogue (last 252 trading days)

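| Factor | ETF | Corr (daily) | Corr (relative-to-SPY daily) |
|---|---|---|---|
| momentum | SPMO | 0.944 | 0.824 |
| value | RPV | 0.882 | 0.880 |
| quality | SPHQ | 0.875 | 0.518 |
| size | RSP | 0.894 | 0.909 |
| lowvol | SPLV | 0.937 | 0.970 |
| divyield | SPHD | 0.921 | 0.953 |
| highbeta | SPHB | 0.943 | 0.836 |
| bench | SPY | 0.998 | nan |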

Quilt rank agreement (Spearman, ours vs published, per month)

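| Month | Rank corr |
|---|---|
| 2025-05 | 0.86 |
| 2025-06 | 0.61 |
| 2025-07 | 0.79 |
| 2025-08 | 0.89 |
| 2025-09 | 0.82 |
| 2025-10 | 0.86 |
| 2025-11 | 0.75 |
| 2025-12 | 0.39 |
| 2026-01 | 0.29 |
| 2026-02 | 0.78 |
| 2026-03 | 0.95 |
| 2026-04 | 0.96 |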
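A per-month rank correlation of this kind can be sketched as Pearson correlation on ranks. The helper names and the average-rank tie handling are assumptions; the inputs are the 2026-04 rows from the per-factor tables above, ordered momentum, value, quality, size, lowvol, divyield, bench.

```python
import math

def average_ranks(values):
    """1-based ranks, ascending; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# 2026-04, in percent: momentum, value, quality, size, lowvol, divyield, bench.
ours_2026_04 = [18.7, 6.2, 9.2, 4.0, 1.5, 3.5, 10.5]
published_2026_04 = [19.3, 3.7, 7.8, 6.0, 2.0, 2.1, 10.5]
rho = spearman(ours_2026_04, published_2026_04)
```

Only value and size swap places between the two rankings in that month, so `rho` comes out near the 0.96 shown in the last row of the table.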