1. What the cascade measures (and what it does not)
The projection cascade produces a per-player season-long fantasy-point projection plus its supporting decomposition. For every active MLB player the cascade outputs:
- Headline: projPoints - the median projection in fantasy points.
- floorPoints and upsidePoints - the P10 and P90 of the simulated distribution.
- pctBust - the fraction of simulations scoring below 60% of the median.
- pctElite - fixed at 8% by construction (the top 8% of simulated outcomes).
- projectionAudit - a 30-50 field block containing all intermediate values: rate projections, workload tier, talent tier, master confidence, role retention, workload band, Statcast multipliers, lineup-slot context, etc.
- Per-category rates - projK_rate, projER_rate, projH_rate, etc. (pitchers); projTB_rate, projR_rate, projSB_rate, etc. (batters).
Note on population scope
The cascade operates on the active MLB roster - which includes rookies (first-year MLB players who still retain rookie eligibility), MLB-active young players (recently graduated former prospects), and established veterans. It does NOT operate on minor-league prospects who have not yet appeared in MLB; those records are handled by the prospect-shadow model instead.
Per MLB's operational definition, "prospect" status is determined by retained rookie eligibility - a player loses prospect status once they exceed any one of these thresholds:
- Position players: more than 130 MLB at-bats
- Pitchers: more than 50 MLB innings pitched
- More than 45 total active-roster days, excluding IL time and September roster expansion
Source: MLB rookie eligibility rules.
A player on the eligibility cusp may appear in BOTH the cascade and the shadow pipelines during their transition season - this is operationally correct (different model layers serving different scopes), not a duplication bug.
Honest data-coverage disclosure: the cascade does
not currently track per-player MLB at-bats, MLB innings pitched, or
active-roster service days as first-class fields. Cascade-vs-shadow
routing is therefore based on whether the record has cascade-shape
data (projectionAudit populated) rather than on a strict
rookie-eligibility check. Five distinct concepts -
prospect status, rookie eligibility, developmental stage,
organizational value, fantasy relevance - are not enforced as
separate fields in the cascade. See
shadow methodology §1
for the detailed five-concept distinction.
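For illustration only, the rookie-eligibility thresholds quoted above can be written as a predicate. This is a sketch of what a strict check would look like, not something the cascade runs today: the field names `mlbAtBats`, `mlbInningsPitched`, and `activeRosterDays` are hypothetical, since the cascade does not track these values as first-class fields.

```javascript
// Thresholds as quoted in the text above.
const ROOKIE_LIMITS = { atBats: 130, inningsPitched: 50, activeRosterDays: 45 };

// Returns true while the player is at or under every threshold.
// All field names are hypothetical; activeRosterDays is assumed to already
// exclude IL time and September roster expansion.
function retainsRookieEligibility(player) {
  const { mlbAtBats = 0, mlbInningsPitched = 0, activeRosterDays = 0 } = player;
  return (
    mlbAtBats <= ROOKIE_LIMITS.atBats &&
    mlbInningsPitched <= ROOKIE_LIMITS.inningsPitched &&
    activeRosterDays <= ROOKIE_LIMITS.activeRosterDays
  );
}

console.log(retainsRookieEligibility({ mlbAtBats: 130 })); // true
console.log(retainsRookieEligibility({ mlbAtBats: 131 })); // false
```

Exceeding any single threshold is enough to lose eligibility, which is why the three conditions are joined with a logical AND on the "still eligible" side.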
What the cascade does NOT measure
Plainly listed here so any reader citing cascade output can interpret it correctly. These are real model gaps, not roadmap promises:
- It does not model handedness or platoon splits. Lefty-vs-righty rate differentials are absent; the only platoon awareness is an explicit "platoon" workload-tier label that caps PA, not a rate split.
- It does not use pitch-mix data in rate regression. Pitch-mix data is loaded into state but is not consumed by the cascade (see TODO at app.js line 491).
- It does not use weather/park-weather data. parkWeather is loaded into state but never read by the cascade.
- It does not project defensive value, base-running runs, or leverage-adjusted relief value. Bullpen leverage is computed for closer/relief role assignment but is NOT wired into save-rate projection.
- It does not project plate-discipline correlation. In Monte Carlo, walks and strikeouts are drawn independently rather than correlated through chase rate or zone contact.
- It does not apply position-specific aging curves. Aging is archetype-based via applyArchetypeAgingCurves (lines 14444-14455) but does not adjust per position.
- It does not model minor-league prospects directly. Prospect records are handled by the separate prospect-shadow model.
- It does not produce a single composite ranking. The cascade outputs projPoints for ranking; ranking-vs-value composition happens downstream in the fantasy-value layer.
2. Pipeline overview - two parallel branches
The cascade is invoked from render() at line 16353. It
splits into two parallel branches that share the same general stage
pattern but differ in the specific functions and rate categories:
render() ← line 16353
│
├─ Pitcher branch
│ groupByPlayer(pitchRows)
│ computePitcherProjection(rows, year, popModels) ← line 13253
│ │
│ ├─ Anchor + projAge + careerIP + pitchQualified (13254-13302)
│ ├─ Rate projection + shrinkage + age adjust (13336-13448)
│ ├─ Workload (IP tier + durability + ramp) (13450-13515)
│ ├─ Role retention probability (13548-13562)
│ ├─ Fantasy-points synthesis (K, ERA, H, BB,…) (13564-13776)
│ └─ projectionAudit assembly (~53 fields) (13851-13923)
│
├─ Batter branch
│ groupByPlayer(batterRows)
│ computeBatterProjection(rows, year, popModels) ← line 14183
│ │
│ ├─ Anchor + projAge + position resolution (14184-14220)
│ ├─ Rate projection + shrinkage + bounds (14228-14353)
│ ├─ Workload (PA tier + rookie ramp + slot) (14381-14441)
│ ├─ Context multipliers (lineup, park, prot.) (14379-14413)
│ ├─ Statcast compression (14457-14599)
│ ├─ Fantasy-points synthesis (R, TB, RBI, BB,…) (14586-14610)
│ └─ projectionAudit assembly (~30+ fields) (14626-14683)
│
│ Both branches converge:
├─ applyAvailabilityAndRiskToProjection(proj, currentYear) ← line 12604
│ (injury / role / callup penalties on top of cascade output)
│
├─ attachFantasyValueMetrics(proj) ← line 5773
│ (master confidence, talent tier, fantasy value)
│
├─ runMonteCarlo(proj, …) ← line 4594
│ (~10,000 simulations per player; produces percentiles,
│ floorPoints, upsidePoints, pctBust, pctElite)
│
└─ state.lastProjectionAuditRows = filtered.slice() ← line 16534
(downstream surfaces, including the Player Explainer,
read from this array.)
The cascade is not phase-numbered like the shadow model (P0-P5). It is organized by archetype (pitcher vs batter) and within each archetype by stage (anchor → rates → workload → context → fantasy points → audit). The shadow model's phase numbering reflects its incremental build history; the cascade's stage organization reflects the data dependencies between the steps.
3. Anchor & role determination
The first stage of each branch establishes a baseline "anchor" - the player's best recent career season - and resolves their projected age plus role.
3.1 Pitcher anchor (lines 13254-13302)
Identifies the best career year by composite score (innings pitched,
rate quality, recency). Resolves projAge from birthdate
plus current season. Determines whether the player qualifies for full
rate projection (pitchQualified) versus reduced shrinkage
(low-sample players). Pulls role state from
role_overrides: rotation lock, closer lock, swingman flag,
callup probability, injury status.
3.2 Batter anchor & position (lines 14184-14220)
Same pattern. Anchor selected from best recent career year. Position
resolved with explicit override-precedence: role-override file >
platform position > Statcast-derived position. The resolved
positionsLabel drives downstream workload tier assignment
(different PA defaults for catchers vs middle infielders vs corner
outfielders).
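The override precedence described above amounts to a first-non-empty lookup. A minimal sketch, assuming three hypothetical input fields (`overridePosition`, `platformPosition`, `statcastPosition`); the real field names in app.js are not shown in this document:

```javascript
// Precedence: role-override file > platform position > Statcast-derived position.
// Field names are hypothetical. Empty strings are treated as "not provided".
function resolvePosition({ overridePosition, platformPosition, statcastPosition }) {
  return overridePosition || platformPosition || statcastPosition || null;
}

console.log(resolvePosition({ platformPosition: "SS", statcastPosition: "2B" })); // SS
console.log(resolvePosition({ overridePosition: "C", platformPosition: "1B" }));  // C
```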
4. Rate projection - shrinkage toward population models
The cascade projects each stat category as a per-PA or per-IP rate, then multiplies by projected workload to get totals. Rate projection is the core empirical-Bayesian layer of the cascade.
4.1 Pitcher rates (lines 13336-13448)
For each rate (K%, BB%, hit-rate, ER-rate, HR-proxy):
- Compute observed rate from recent career sample.
- Look up population mean for the player's bucket from popModels.pitchers (segmented by age band and SP/RP role).
- Compute stabilization weight from sample size against the rate-specific stabilization point (K% stabilizes faster than HR%, etc.).
- Blend: projected rate = observed × weight + population × (1 − weight).
- Apply age curve: rate adjusts per projAge against the archetype-specific aging path.
- Apply stuff-score nudge: a higher stuffScore (derived from velocity, IVB, extension, spin) lifts the K% projection within bounds.
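The blend step can be sketched as follows. The blend formula itself is the one stated above; the weight form `n / (n + stabilizationPoint)` is a common empirical-Bayes choice and is an assumption here, not confirmed from app.js. The function names and the sample numbers are hypothetical.

```javascript
// Assumed weight form: approaches 1 as the sample grows past the stabilization point.
function stabilizationWeight(sampleSize, stabilizationPoint) {
  if (sampleSize <= 0) return 0;
  return sampleSize / (sampleSize + stabilizationPoint);
}

// projected rate = observed × weight + population × (1 − weight)
function shrinkRate(observedRate, populationRate, sampleSize, stabilizationPoint) {
  const weight = stabilizationWeight(sampleSize, stabilizationPoint);
  return observedRate * weight + populationRate * (1 - weight);
}

// Hypothetical inputs: 0.30 observed K% over 400 batters faced, 0.22 bucket
// mean, stabilization point of 100 batters faced. Weight = 400 / 500 = 0.8.
const projectedKRate = shrinkRate(0.30, 0.22, 400, 100);
console.log(projectedKRate.toFixed(3)); // 0.284
```

With no sample at all the weight is 0 and the projection falls back entirely to the population mean, which is the behavior low-sample players need.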
Outputs: projK_rate, projER_rate,
projH_rate, projBB_rate,
projHRproxy_rate, kpctProj,
stuffScore.
4.2 Batter rates (lines 14228-14353)
Same pattern, different categories: R, TB, RBI, BB, K, SB. Each rate is shrunk toward a position-bucketed population mean (catchers have different population means than middle infielders, etc.). Additional defensive steps:
- Hard bounds applied to prevent runaway projections (e.g., TB-rate is capped at a position-tier-specific maximum).
- P99.5 shrinkage - any rate above the 99.5th percentile of historical rates for that bucket is shrunk back toward the 99th percentile. This is the cascade's analog of the shadow model's K9-floor defense.
- Statcast adjustments are NOT applied at the rate-projection stage directly; they enter as multipliers in §7.
Outputs: projR_rate, projTB_rate,
projRBI_rate, projBB_rate,
projK_rate, projSB_rate.
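The two defensive steps (P99.5 shrinkage, then hard bounds) can be sketched as a single guard. How strongly the cascade pulls an outlier back toward the 99th percentile is not stated in this document, so the `pull` parameter and its 0.5 default are assumptions, as are the function and parameter names.

```javascript
// Sketch only: `pull` (0 = no shrinkage, 1 = snap to P99) is a hypothetical knob.
function boundRate(rate, { p99, p995, hardMax, pull = 0.5 }) {
  let bounded = rate;
  if (bounded > p995) {
    // Shrink the excess over P99 back toward P99.
    bounded = p99 + (bounded - p99) * (1 - pull);
  }
  // Hard bound applied last, e.g. a position-tier-specific TB-rate maximum.
  return Math.min(bounded, hardMax);
}

// Hypothetical bucket: P99 = 0.60, P99.5 = 0.62, hard maximum = 0.66.
const bucket = { p99: 0.60, p995: 0.62, hardMax: 0.66 };
console.log(boundRate(0.70, bucket).toFixed(2)); // 0.65
console.log(boundRate(0.50, bucket).toFixed(2)); // 0.50
```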
5. Workload modeling - IP (pitcher) / PA (batter)
5.1 Pitcher workload (lines 13450-13515)
Assigns the pitcher to a workload tier based on role,
injury history, archetype, and recent IP. The tier label is exposed as
workloadTierLabel in the audit block; common values
include workhorse, mid-rotation, back-end,
swingman, closer, setup, middle relief.
The tier determines:
- workloadOuts - base projected outs for the season.
- workloadRampCap - a cap for pitchers returning from injury (the IL ramp).
- durabilityMult - durability-adjusted multiplier (1.0 for healthy pitchers; reduced for injury-prone profiles).
- archetypeDispersionMult - a multiplier on simulation variance: closer roles have higher dispersion than workhorse starters.
5.2 Pitcher role retention (lines 13548-13562)
A separate roleRetentionProbability multiplier
representing the probability the pitcher stays in their assigned role
through the season. Reads rotation locks, closer locks, depth-chart
rank, injury risk, prior-season starts. A 0.8 retention probability
scales final IP by 0.8 (the model expects 20% of innings to be lost
to role change).
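As a worked example of the scaling described above, a minimal sketch; the function name is hypothetical and the clamp to [0, 1] is an added safety assumption:

```javascript
// Scales projected innings by the probability of keeping the assigned role.
function applyRoleRetention(projectedIP, roleRetentionProbability) {
  const p = Math.min(1, Math.max(0, roleRetentionProbability));
  return projectedIP * p;
}

// A 180 IP projection with 0.8 retention keeps 80% of its innings.
const finalIP = applyRoleRetention(180, 0.8);
console.log(finalIP.toFixed(1)); // 144.0
```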
5.3 Batter workload (lines 14381-14441)
Assigns the batter to a PA tier from a defined set:
full_time, regular, part_time, platoon,
backup, injury_replacement. The tier sets
workloadPA as the base projection.
Rookie ramp adjusts PA downward for first-year players based on
pedigree (rookieRampMult). Lineup slot is either taken
from explicit override or guessed from position and tier
(slotGuess). The guessed slot influences R/RBI
multipliers (heart-of-order produces more runs and RBI per PA).
6. Context multipliers - park, lineup slot, lineup protection
Once base rates and workload are projected, context multipliers adjust the totals. On the batter side this happens at lines 14379-14413; the pitcher side has analogous park adjustments in the fantasy-point synthesis layer.
- Park factors. Loaded from parkFactorsByTeamId (app.js line 2348). Applied as late multipliers on rates (e.g., parkER, parkTB) - they adjust totals after the rate × workload product. They are not used as inputs to the rate-projection stage.
- Lineup-slot multiplier. Heart-of-order slots (3-4-5) boost R and RBI relative to leadoff or bottom-of-order slots. Applied via lm.r, lm.rbi, lm.pa objects.
- Lineup protection. lineupProtMult applies a small multiplier when a strong run-producer hits behind the batter, modeling the "pitched-around" effect.
- Fringe caps. For low-tier players, total projections are clamped to prevent runaway projections from unstable rates × moderate PA.
These multipliers are applied multiplicatively. They are NOT position-adjusted in any sophisticated way (e.g., the lineup-slot multiplier is the same for catchers and outfielders) - see §16.
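A minimal sketch of multiplicative application with an optional fringe cap. The function name, the numeric values, and the choice of which multipliers touch which category are illustrative assumptions; only the names `lm.r`, `lm.rbi`, and `lm.pa` come from the text.

```javascript
// Multiplies a base total by each context multiplier in turn, then clamps
// to an optional fringe cap (Infinity means "no cap").
function applyContext(baseTotal, multipliers, fringeCap = Infinity) {
  const adjusted = multipliers.reduce((total, m) => total * m, baseTotal);
  return Math.min(adjusted, fringeCap);
}

// Hypothetical values: a heart-of-order slot and a mildly favorable park.
const lm = { r: 1.08, rbi: 1.10, pa: 1.02 };
const parkRunFactor = 1.03;

const projectedRuns = applyContext(80, [lm.r, parkRunFactor]);
const fringeRuns = applyContext(80, [lm.r, parkRunFactor], 85);
console.log(projectedRuns.toFixed(2)); // 88.99
console.log(fringeRuns);               // 85
```

Because the multipliers compose by product, their order does not change the result; only the cap is order-sensitive, which is why it is applied last.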
7. Statcast compression layer (batters)
Lines 14457-14599. For batters, Statcast-derived features compress into multipliers applied to power and contact projections:
- Power multiplier - composed of barrel rate, exit velocity (top 50%, average best speed), and hard-hit percentage. Combined via geometric mean rather than sum, to avoid double-counting (a player with high barrel and high hard-hit shares correlated signal).
- Contact multiplier - composed of whiff percentage, out-of-zone swing percentage, and zone contact rates.
- Speed multiplier - sprint speed, applied primarily to SB rate.
- xwOBA regression - when actual wOBA exceeds xwOBA by a sample-size-dependent threshold, the projection regresses partway back to xwOBA. The threshold widens with smaller sample sizes (more tolerance for noise on small samples).
The Statcast layer is the batter cascade's most empirically-rich
component. For pitchers, the analogous layer is the
stuffScore nudge inside rate projection (§4.1) plus xERA
blending inside fantasy-point synthesis (§8); the pitcher cascade does
not have a dedicated Statcast compression block.
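To illustrate why a geometric mean avoids double-counting correlated signals, here is a sketch of the power-multiplier combination. The three component values are hypothetical, and how the cascade converts raw Statcast metrics into per-component multipliers is not shown in this document.

```javascript
// Geometric mean of positive multipliers, computed in log space.
function geometricMean(values) {
  if (values.length === 0 || values.some((v) => v <= 0)) {
    throw new Error("geometricMean requires at least one positive value");
  }
  const logSum = values.reduce((sum, v) => sum + Math.log(v), 0);
  return Math.exp(logSum / values.length);
}

// Hypothetical component multipliers: barrel rate, exit velocity, hard-hit %.
const components = [1.10, 1.06, 1.08];
const powerMult = geometricMean(components);
const naiveProduct = components.reduce((p, v) => p * v, 1);

console.log(powerMult.toFixed(3));    // stays inside the range of its inputs
console.log(naiveProduct.toFixed(3)); // stacking the same signal three times
```

The geometric mean always lands between the smallest and largest component, whereas the raw product compounds three correlated readings of the same underlying power skill.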
8. Fantasy-point synthesis
The synthesis stage converts projected rates × workload into
per-category totals, then weights them by the league's scoring system
into the headline projPoints.
8.1 Pitcher synthesis (lines 13564-13776)
For each scoring category:
- K = projK_rate × workloadOuts, then rounded.
- ERA = a blend of empirical ERA and xERA, the blend weight depending on sample stability.
- H, BB = rates × outs, with hit-rate adjusted by park's hit factor.
- W = a function of team win expectancy and the pitcher's quality (better pitchers on better teams accrue more wins per start).
- L = analogous to W but inversely.
- SV, BS = role-conditional (closer multiplier × team save opportunities).
The cascade applies a final sanity cap: if
rawProjPts exceeds an archetype-specific ceiling, the
result is hard-capped and the difference is recorded as
spikeRisk (so the audit shows the model was constrained).
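The cap-and-record behavior can be sketched as below. The function name and the ceiling value are hypothetical; the field names `rawProjPts`, `projPoints`, and `spikeRisk` follow the text.

```javascript
// Hard-caps the raw projection at an archetype ceiling and records the
// amount removed as spikeRisk (0 when the cap did not bind).
function applySanityCap(rawProjPts, archetypeCeiling) {
  if (rawProjPts <= archetypeCeiling) {
    return { projPoints: rawProjPts, spikeRisk: 0 };
  }
  return {
    projPoints: archetypeCeiling,
    spikeRisk: rawProjPts - archetypeCeiling,
  };
}

console.log(applySanityCap(640, 600)); // { projPoints: 600, spikeRisk: 40 }
console.log(applySanityCap(520, 600)); // { projPoints: 520, spikeRisk: 0 }
```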
8.2 Batter synthesis (lines 14586-14610)
For each category (R, TB, RBI, BB, K, SB), totals are computed
as rate × PA × context_multiplier. Aging effects and
talent-ceiling clamps are applied as final modulators.
talentCeiling is a per-tier maximum that prevents low-tier
batters from projecting elite output even with favorable contextual
factors.
The synthesis stage emits projPoints,
rawProjPts, projPointsFloor, and
projPointsCeiling. The Floor and Ceiling are first-pass
estimates that are refined by the Monte Carlo layer
(§9).
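A compact sketch of batter synthesis under stated assumptions: totals are rate × PA × context multiplier, each total is weighted by a league scoring value, and the sum is clamped by `talentCeiling`. The scoring weights, the sample rates, and the choice to clamp at the points level (rather than per category) are hypothetical.

```javascript
// rates, contextMult, and scoring are keyed by category (R, TB, K, ...).
// Missing context multipliers default to 1; missing scoring weights to 0.
function synthesizeBatterPoints(rates, workloadPA, contextMult, scoring, talentCeiling) {
  let rawProjPts = 0;
  for (const [category, rate] of Object.entries(rates)) {
    const total = rate * workloadPA * (contextMult[category] ?? 1);
    rawProjPts += total * (scoring[category] ?? 0);
  }
  return { rawProjPts, projPoints: Math.min(rawProjPts, talentCeiling) };
}

// Hypothetical inputs for a 600 PA regular.
const result = synthesizeBatterPoints(
  { R: 0.13, TB: 0.40, K: 0.22 },
  600,
  { R: 1.05 },
  { R: 1, TB: 1, K: -1 },
  180
);
console.log(result.rawProjPts.toFixed(1)); // 189.9
console.log(result.projPoints);            // 180
```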
9. Monte Carlo simulation
runMonteCarlo() at line 4594. After the deterministic
cascade produces a point estimate, ~10,000 simulations (configurable
via MC_SIMS_FINAL or MC_SIMS_QUICK) are run
per player to produce the full outcome distribution.
9.1 State-based simulation
Each simulation first draws a role state before drawing per-category outcomes:
- healthy - full PA / IP draw against projected workload
- injured - capped PA / IP based on injury severity priors
- breakout - sampled from upper tail; young-player skewed
- collapse - sampled from lower tail; old-player skewed
- demoted - reduced PA / IP path
- role_gain - promoted (e.g., setup → closer mid-season)
Breakout and collapse probabilities are age-conditional: young players carry ~12% breakout / 6% collapse priors; older players carry ~6% breakout / 24% collapse priors.
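The state draw can be sketched as a categorical sample over cumulative probabilities. Only the breakout and collapse priors come from the text; the injured, demoted, and role_gain probabilities below are placeholder assumptions, as is the boolean `isYoung` flag (the age cutoff is not stated here).

```javascript
// Breakout/collapse priors are from the text; other values are hypothetical.
function stateProbabilities(isYoung) {
  const breakout = isYoung ? 0.12 : 0.06;
  const collapse = isYoung ? 0.06 : 0.24;
  const injured = 0.10;
  const demoted = 0.05;
  const roleGain = 0.03;
  const healthy = 1 - (breakout + collapse + injured + demoted + roleGain);
  return { healthy, injured, breakout, collapse, demoted, role_gain: roleGain };
}

// Picks the first state whose cumulative probability exceeds u, where u is a
// uniform draw in [0, 1). Passing u explicitly makes the draw reproducible.
function drawState(probabilities, u = Math.random()) {
  let cumulative = 0;
  for (const [state, p] of Object.entries(probabilities)) {
    cumulative += p;
    if (u < cumulative) return state;
  }
  return "healthy"; // guard against floating-point shortfall in the sum
}

console.log(drawState(stateProbabilities(true), 0.80));  // breakout
console.log(drawState(stateProbabilities(false), 0.80)); // collapse
```

The same uniform draw lands in different states for young and older players because the collapse band is four times wider for the older profile.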
9.2 Per-category draws
Within each simulation, per-category rates are drawn from distributions centered at the projected rates with variance set by stabilization confidence. Categories are drawn independently (e.g., BB and K rates are not correlated through chase rate) - see §16.
9.3 Outputs
- Percentiles - P10, P25, P50, P75, P90 of fantasy points across all simulations.
- floorPoints - the P10.
- upsidePoints - the P90.
- medianPoints - the P50 (typically used as the displayed projection).
- pctBust and pctElite - see §12.
The MC layer is the cascade's most computationally expensive stage.
Quick mode (MC_SIMS_QUICK) is used for live UI; final mode
(MC_SIMS_FINAL) is used when the user explicitly requests
full recomputation.
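The percentile outputs can be derived from the simulated point totals roughly as follows. The linear-interpolation percentile method is an assumption (the document does not say which method runMonteCarlo uses), and the function names are hypothetical.

```javascript
// Percentile of an ascending-sorted array using linear interpolation.
function percentile(sortedValues, p) {
  const idx = (sortedValues.length - 1) * p;
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sortedValues[lo] + (sortedValues[hi] - sortedValues[lo]) * (idx - lo);
}

// Summarizes one player's simulated fantasy-point totals.
function summarizeSimulations(simPoints) {
  const sorted = [...simPoints].sort((a, b) => a - b);
  return {
    floorPoints: percentile(sorted, 0.10),
    medianPoints: percentile(sorted, 0.50),
    upsidePoints: percentile(sorted, 0.90),
  };
}

// Toy distribution: the integers 0..100 in place of real simulation output.
const toySims = Array.from({ length: 101 }, (_, i) => i);
console.log(summarizeSimulations(toySims));
```

Sorting with an explicit numeric comparator matters: the default `Array.prototype.sort` compares as strings and would misorder the totals.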
10. Talent-tier classification
computeTrueTalentTier() at line 15173. Assigns each
player to one of five tiers:
| Tier | Label | Pitcher gate (example) | Batter gate (example) |
|---|---|---|---|
| 1 | superstar | K% > 0.30 + IP > 180, or stuff score > 0.55 | peak TB rate top 0.5% + sustained PA |
| 2 | elite | K% > 0.28 + IP > 150, or stuff > 0.40 | peak TB top 2% + multiple seasons |
| 3 | quality starter / regular | K% > 0.22 + IP > 120 | solid current-form + adequate sample |
| 4 | role / platoon | lower IP or marginal rates | limited PA tier or weak hitter grade |
| 5 | fringe | minimal MLB sample, no rate stability | fringe-roster signal |
Tier assignment is multi-gate and conservative. A player needs to meet either current-form criteria OR peak-evidence criteria to qualify for tiers 1-2; meeting both produces high-confidence assignment.
11. Master confidence - seven-component composite
computeMasterConfidence() at line 5729. Multiplicative
composite:
masterConfidence = rosterConf
× roleConf
× sampleConf
× marketConf
× √survivorship
× √realPlayerProb
× spikeRiskMult
Component definitions:
- rosterConf - probability of 40-man roster retention (computeRosterConfidence).
- roleConf - probability the player gets the role the projection assumes (e.g., closer, full-time bat). Computed from rotation/closer locks and depth-chart rank.
- sampleConf - stabilization-weighted confidence in the rate projections. Closer to 1.0 for players with multiple full seasons; closer to 0.5 for low-sample players.
- marketConf - ADP/expert-consensus strength. Players with tight expert agreement and clear ADP receive higher confidence; players with wide expert disagreement receive lower confidence.
- survivorship - adjustment for inherent survivor-bias in the dataset (we mostly observe players who succeeded enough to keep getting drafted/projected).
- realPlayerProb - probability the record refers to a real, active player (filters out stale records).
- spikeRiskMult - penalizes confidence when the projection was hard-capped (the spike-risk flag from §8.1).
The composite is multiplicative, so a single very-low component significantly reduces confidence. This is intentional: a player with great rate confidence but no role lock should not be projected with high overall confidence.
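The composite formula above translates directly into code. This sketch mirrors the stated formula only; the function name is suffixed to avoid implying it is the real computeMasterConfidence(), and the input values are hypothetical.

```javascript
// Direct transcription of the formula: four plain factors, two
// square-rooted factors, and the spike-risk multiplier.
function masterConfidenceSketch(c) {
  return (
    c.rosterConf *
    c.roleConf *
    c.sampleConf *
    c.marketConf *
    Math.sqrt(c.survivorship) *
    Math.sqrt(c.realPlayerProb) *
    c.spikeRiskMult
  );
}

const strongEverywhere = {
  rosterConf: 0.95, roleConf: 0.9, sampleConf: 0.9, marketConf: 0.85,
  survivorship: 0.81, realPlayerProb: 1, spikeRiskMult: 1,
};
// Same player, but with no role lock.
const noRoleLock = { ...strongEverywhere, roleConf: 0.4 };

console.log(masterConfidenceSketch(strongEverywhere).toFixed(3)); // 0.589
console.log(masterConfidenceSketch(noRoleLock).toFixed(3));       // 0.262
```

Dropping roleConf from 0.9 to 0.4 cuts the composite by the same ratio, which is the intended effect of a multiplicative design. The square roots soften the influence of survivorship and realPlayerProb relative to the other factors.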
12. Bust risk + elite-season probability
Both come from the Monte Carlo distribution (§9), not from separate computations.
- pctBust (line 4801) - fraction of simulations scoring less than 60% of the P50 median. A high pctBust means the simulation distribution has heavy left-tail mass; the player has a meaningful chance of disastrously underperforming.
- pctElite (line 4806) - fixed at 8% by construction. This is the share of simulations above the elite threshold, which is itself defined as the top 8% of the simulated outcomes. It is NOT a population-relative percentile; it is a per-player distribution percentile. A "high pctElite" therefore does not mean the player is exceptional - it means the upper 8% of their simulated outcomes was high.
The 8% elite threshold is a construction choice. The cascade uses it as a consistent yardstick for the upper tail; users should not read it as "8% chance this player is elite" without qualification.
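A sketch showing why pctElite comes out at 8% for every player while pctBust varies with the shape of the distribution. The function name, the exact threshold selection, and the toy data are assumptions; the 60%-of-median bust cutoff and the top-8% elite definition are from the text.

```javascript
// Computes both tail metrics from one player's simulated point totals.
function tailMetrics(simPoints) {
  const sorted = [...simPoints].sort((a, b) => a - b);
  const n = sorted.length;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;

  // Bust: share of simulations below 60% of the median.
  const pctBust = sorted.filter((v) => v < 0.6 * median).length / n;

  // Elite: threshold is the lowest value inside this player's own top 8%.
  const eliteCount = Math.max(1, Math.round(n * 0.08));
  const eliteThreshold = sorted[n - eliteCount];
  const pctElite = sorted.filter((v) => v >= eliteThreshold).length / n;

  return { pctBust, pctElite, eliteThreshold };
}

// Two toy players: same shape, but the second scores ten times as much.
const modestPlayer = Array.from({ length: 100 }, (_, i) => i);
const starPlayer = modestPlayer.map((v) => v * 10);

console.log(tailMetrics(modestPlayer)); // pctElite 0.08, eliteThreshold 92
console.log(tailMetrics(starPlayer));   // pctElite 0.08, eliteThreshold 920
```

Both players report the same pctElite even though one outscores the other tenfold; only the eliteThreshold carries information about how good the upper tail is.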
13. Audit assembly - what projectionAudit contains
The projectionAudit block (assembled at lines 13851-13923
for pitchers, 14626-14683 for batters) is the cascade's transparency
surface. It is attached to each player record as
rec.projectionAudit and is the primary data source for the
Player Explainer's cascade-projection sections.
Field categories:
| Category | Fields |
|---|---|
| Workload | workloadTierLabel, workloadIP, workloadPA, ipLo/ipHi, paFloorApplied |
| Talent | talentTier, talentCeiling, projAge, breakoutMult |
| Confidence | masterConfidence, rosterConfidence, roleConfidence, sampleConfidence2, marketConfidence, orgTrust |
| Rates (batter) | projTBrate, projRrate, projSBrate, projBBrate, projKrate, peakTBrate |
| Rates (pitcher) | kpctProj, stuffScore, workloadOuts |
| Risk | callupRisk, platoonRisk, roleRetentionProbability, durabilityMult, archetypeDispersionMult, spikeRisk |
| Role | pCloser (probability of closer role), workloadRampCap (IL ramp) |
The audit block is read directly by the Explainer's
_renderHeadline, _renderWhy,
_renderSensitivity, and other render functions. Every
number a user sees in those sections corresponds to a field in this
block.
14. Required inputs
The cascade reads broadly from both per-player record fields and global state.
14.1 Batter rec fields
player_id, player_name, pa,
year, r_run, b_total_bases,
b_rbi, walk, strikeout,
r_total_stolen_base, b_game, plus
Statcast: xwoba, woba,
barrel_batted_rate, hard_hit_percent,
avg_best_speed, sprint_speed,
whiff_percent, out_zone_swing_percent,
avg_swing_speed.
14.2 Pitcher rec fields
player_id, player_name, p_out,
year, p_strikeout, p_earned_run,
p_total_hits, p_walk, p_win,
p_loss, p_save, p_blown_save,
p_starting_p, p_game_in_relief, plus
Statcast: xera, p_era, k_percent,
bb_percent, whiff_percent,
barrel_batted_rate, hard_hit_percent.
14.3 Global state reads
- role_overrides - explicit role locks, pedigree, callup/platoon risk
- injury_map, injury_records - IL duration estimate, injury history
- playerTeamEntry - team affiliation, team win%
- parkFactorsByTeamId, parkWeather - park multipliers (weather data loaded but unused)
- popModels.pitchers, popModels.batters - age-regressed rates by position/bucket
- calibratedFvWeights, expertConsensus, adpData - confidence and value calibration
15. Version status
As of the latest methodology revision, the projection cascade carries
a methodology version stamp on every output: projectionAudit.methodologyVersion = 'cascade-v1.0'.
This closes the gap previously documented in this section.
Both pitcher and batter cascade audit objects (constructed in
app.js at the two projectionAudit assignment
sites) include the field. Cascade-derived numbers in the Player Explainer,
the research-snapshot export, and the Operations workspace now all carry
the version reference.
15.1 Version history
- cascade-v1.0 - initial version stamp introduced. Corresponds to the methodology as published at the time of stamp introduction. Pre-v1.0 outputs lack a version field; for those, reproducing exact computations requires reconstructing the code state from that date by other means.
15.2 Version-bump policy
Future methodology changes that alter cascade computations should be
paired with a version bump here AND in the projectionAudit.methodologyVersion
string. Documentation drift between this page and the audit-block stamp
should be treated as a real bug, not a cosmetic inconsistency.
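A drift check along these lines could compare the audit stamp against the version this page documents. This is a sketch: the constant and function names are hypothetical, and only `projectionAudit.methodologyVersion` and the value `cascade-v1.0` come from the text.

```javascript
// The version this methodology page currently documents.
const DOCUMENTED_VERSION = "cascade-v1.0";

// Classifies a record's audit stamp:
//   "unversioned" - no stamp (pre-v1.0 output)
//   "current"     - stamp matches the documented version
//   "drift"       - stamp and documentation disagree
function versionStatus(rec) {
  const stamp = rec?.projectionAudit?.methodologyVersion;
  if (stamp === undefined || stamp === null) return "unversioned";
  return stamp === DOCUMENTED_VERSION ? "current" : "drift";
}

console.log(versionStatus({ projectionAudit: { methodologyVersion: "cascade-v1.0" } })); // current
console.log(versionStatus({ projectionAudit: {} }));                                     // unversioned
```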
16. Limitations
Known limits of the cascade as it operates today. Each is a real model property, not an apology:
- Methodology version stamp covers v1.0 onward only. Outputs computed before cascade-v1.0 carry no version field. See §15.
- No handedness / platoon split projection. The only platoon-aware element is a "platoon" workload tier that caps PA; rate projections are platoon-blind.
- Pitch-mix data loaded but unused. A pitch_mix field exists in state but the cascade's rate-projection stage does not read it (TODO at app.js line 491).
- Park effects applied as late multipliers, not as rate-projection inputs. Park-adjusted rates would require a different shrinkage model; the current approach treats park as a context post-hoc modifier.
- Weather data loaded but unused. parkWeather is fetched and stored but the cascade does not consume it.
- Categories drawn independently in Monte Carlo. BB and K are not correlated through chase or zone-contact rate; TB and BB are not correlated through plate-discipline shape. Real players show correlations the simulation does not model.
- Defensive value not projected. Batters are modeled as pure offensive contributors.
- Bullpen leverage computed but not wired to SV projection. Leverage is used for closer-role assignment but does not affect projected save rate.
- Aging archetype-based, not position-based. Catchers and outfielders use the same aging curve when their archetype matches, despite catchers' faster real-world decline.
- Lineup-slot multiplier is uniform across positions. Hitting third in the order produces the same R/RBI lift for a catcher as for an outfielder.
- Monte Carlo "elite" threshold is fixed at 8%. pctElite is the per-player top-8% of simulations; it is not a population-relative percentile. Users should not interpret a high pctElite as "this player is elite."
- The cascade is invoked synchronously during render(). Re-projection happens on each full render; there is no incremental update.
- Survivorship-bias adjustment is constant. The survivorship factor in master confidence is a fixed scalar, not data-driven per player.
17. Related methodology
- Prospect-shadow model (v0.5) - the parallel model layer that runs on prospect records (vs the cascade which runs on MLB-active players). The two models do not share computation; the shadow model produces its output independently and attaches it to rec.shadow.
- FYPD methodology (v1.0) - the first-year-player draft market observation layer (also independent of the cascade).
- Product & design principles - platform-level discipline including §2 "Trust Before Mathematics" which motivates this methodology page's existence.
The three model layers (cascade, prospect-shadow, FYPD) are intentionally separated. Each has its own input scope, its own methodology, and its own output surface. Composing them into a single ranking is a downstream concern, not a model-layer concern.
17.1 Cross-surface vocabulary notes
Two words appear with structurally different meanings across the
platform's model layers: "confidence" (used in
masterConfidence here in the cascade, but also in shadow's
confidenceBucket, identity-graph join confidence, injury
reports, position attribution, and the disagreement signal) and
"archetype" (used in the cascade's workload,
aging, and volatility classifications, and separately in the shadow
model's prospect-archetype buckets and the Operations portfolio
archetypes). Each usage is correct within its own context; the
overloading is at the platform level.
The cross-surface translation tables are documented in shadow methodology §16.4. Refer there before comparing "confidence" or "archetype" values across different surfaces.
18. How to cite
When citing cascade output in analytical writing:
- Source: managr Projection Cascade
- Methodology version: projectionAudit.methodologyVersion (e.g., cascade-v1.0); see §15. Pre-v1.0 outputs lack this stamp.
- Date: the date the projection was computed.
- Link to this methodology page.
Example: Per managr Projection Cascade v1.0 (computed 2026-05-17), Player X projects to 425 fantasy points (P10 280, P90 560) with master confidence 0.72 and talent tier 2.
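A citation line in the format above could be assembled from a record roughly like this. The function name and the `computedOn` and `playerName` parameters are hypothetical; the other fields follow names used on this page.

```javascript
// Builds a citation sentence in the format of the example above.
// Strips the "cascade-" prefix so "cascade-v1.0" is cited as "v1.0".
function formatCitation({
  methodologyVersion, computedOn, playerName,
  projPoints, floorPoints, upsidePoints, masterConfidence, talentTier,
}) {
  const version = methodologyVersion
    ? methodologyVersion.replace(/^cascade-/, "")
    : "(unversioned, pre-v1.0)";
  return (
    `Per managr Projection Cascade ${version} (computed ${computedOn}), ` +
    `${playerName} projects to ${projPoints} fantasy points ` +
    `(P10 ${floorPoints}, P90 ${upsidePoints}) with master confidence ` +
    `${masterConfidence} and talent tier ${talentTier}.`
  );
}

const citation = formatCitation({
  methodologyVersion: "cascade-v1.0", computedOn: "2026-05-17",
  playerName: "Player X", projPoints: 425, floorPoints: 280,
  upsidePoints: 560, masterConfidence: 0.72, talentTier: 2,
});
console.log(citation);
```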