MLB — Wagner Playground

View: Full Game

Mode: Dry Run

⚠ Pricing guidance: evaluate entry price before acting

Post-4/21 analysis shows the model’s WR is solid across all buckets (60-70%), but full-game buckets are bleeding because entry prices exceed what the edge supports. The counterfactual backtest of the 4/24 pitcher-threshold fix showed no P&L improvement on 4/21-4/23, confirming the losses are price-driven, not model-driven.

F5 — all prices

Working as designed. F5 OVER is the cleanest cell (75%+ WR).

Full Game / UNDER ≤ 70¢

Need ~60% WR to break even — within model’s actual 69% WR. OK.

Full Game / UNDER > 70¢

Needs 70%+ WR to break even. Historical WR is 69%. Skip.

Full Game / OVER > 60¢

50% historical WR. Coin flip being priced as favorite. Skip.

No hard gate in code

Signals still fire at all prices; this is informational guidance while we collect more data.

v2.2 agreement — two EV-positive cells (n=343 backfill, 4/13-4/24)

Backfill of all historical signals against the v2.2 (offense + home/road splits) model revealed an asymmetric pattern. Both EV-positive cells now get a mauve badge on the signal pill:

Cell	WR	ROI	n	Verdict
✓ OVER + agrees	74%	+18%	73	✅ act on it
✗ UNDER + disagrees	76%	+16%	45	✅ act on it
OVER + disagrees	67%	+5%	9	tiny n
UNDER + agrees	64%	−8%	126	🚨 weakest

Counter-intuitive but real: on UNDER signals, v2.2 disagreement (i.e., v2.2 expects more runs than v1) is when v1's UNDER call is at its best. The mauve ✗ v2.2 badge marks these. UNDER signals where v2.2 agrees are the worst cell in the entire model lineup — 64% WR isn't enough at the avg entry price.

Why? UNDER alpha lives in volatile pitcher matchups where v1's pitcher-only model nails it but v2.2's offense factor pushes the projection up. When both models agree on UNDER, it's a low-variance game and the price already reflects that — no edge.

Near-Certain Plays

F5 Over 0.5 / 1.5 under 85¢ — ~95% base rate

Game	Time	Play	YES Ask	Avail	Est. Hit Rate	Est. ROI	Link

Signal Tracking Performance

model-only, $100/signal hypothetical

Game	Time	Probable Pitchers	Env ⓘ	Expected ⓘ	Moneyline	Total Ladder ⓘ	Signal	Link

How It Works

1. Expected Total

Base (8.6 runs full game / 4.2 runs F5) × park factor × temperature × wind × pitcher quality (a composite of ERA, WHIP, K/9, and BAA). F5 uses 1.5× pitcher amplification because starters carry 100% of F5 with no bullpen dilution. Domes mute weather; retractable roofs apply it at half strength.
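As a rough illustration of the multiplicative structure described above, here is a minimal sketch. The base totals and the 1.5× amplification come from the text; everything else is an assumption made for the example: the factors are passed in as ready-made multipliers around 1.0, amplification and roof damping are applied to each multiplier's deviation from 1.0, and the names (`damp`, `expected_total`, `ROOF_WEATHER_STRENGTH`) are hypothetical. How the scanner actually derives or combines these factors is not stated here.

```python
BASE_FULL_GAME = 8.6            # from the text
BASE_F5 = 4.2                   # from the text
F5_PITCHER_AMPLIFICATION = 1.5  # from the text

# Assumed mapping: share of the weather effect that survives each roof type.
ROOF_WEATHER_STRENGTH = {"open": 1.0, "retractable": 0.5, "dome": 0.0}


def damp(factor: float, strength: float) -> float:
    """Scale a multiplier's deviation from 1.0 (0 = neutral, 1 = unchanged)."""
    return 1.0 + (factor - 1.0) * strength


def expected_total(park: float, temperature: float, wind: float,
                   pitcher: float, roof: str = "open", f5: bool = False) -> float:
    """Expected runs as base x park x temperature x wind x pitcher quality."""
    strength = ROOF_WEATHER_STRENGTH[roof]
    temperature = damp(temperature, strength)
    wind = damp(wind, strength)
    if f5:
        # Assumption: "1.5x amplification" widens the pitcher multiplier's
        # distance from neutral rather than multiplying the total by 1.5.
        pitcher = damp(pitcher, F5_PITCHER_AMPLIFICATION)
        base = BASE_F5
    else:
        base = BASE_FULL_GAME
    return base * park * temperature * wind * pitcher


print(round(expected_total(1.05, 1.02, 1.03, 0.95), 2))
print(round(expected_total(1.00, 1.10, 1.10, 0.90, roof="dome", f5=True), 2))
```

Under these assumptions a dome ignores the temperature and wind inputs entirely, and a strong starter (multiplier below 1.0) pulls the F5 total down by 1.5 times as much as the full-game total in relative terms.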

2. Probability

Normal distribution around expected mean (σ=3.0 full game, 1.9 F5). P(Over X.5) = 1 − Φ((X.5 − mean) / σ).
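The formula above can be evaluated with the standard library alone. This is a minimal sketch, not the scanner's own code: the function names are hypothetical, and the 9.1-run expected total in the example is an illustrative input rather than a real projection.

```python
import math

SIGMA_FULL_GAME = 3.0  # from the text
SIGMA_F5 = 1.9         # from the text


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_over(line: float, mean: float, sigma: float) -> float:
    """P(total > line) when the total is modeled as Normal(mean, sigma)."""
    return 1.0 - normal_cdf((line - mean) / sigma)


# Illustrative input: an expected total of 9.1 runs against an 8.5 line.
p = prob_over(8.5, mean=9.1, sigma=SIGMA_FULL_GAME)
print(f"P(Over 8.5) = {p:.3f}")  # prints P(Over 8.5) = 0.579
```

Because lines sit on half-runs (X.5), no continuity correction is needed to separate "over" from "under" here; the normal approximation itself is the main modeling assumption.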

3. Edge

Per threshold: ourProb − yesAsk (over) or (1 − ourProb) − noAsk (under). Signal fires when edge ≥ 5pp, price in 15-85¢.
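The edge rule above translates into a few lines. This is a sketch under stated assumptions: prices and probabilities are fractions (0.52 for 52¢), the function names are hypothetical, and returning the larger edge when both sides qualify is a choice made for the example, not something the text specifies.

```python
from typing import Optional, Tuple

MIN_EDGE = 0.05                    # 5pp, from the text
MIN_PRICE, MAX_PRICE = 0.15, 0.85  # 15-85 cents, from the text


def edge_over(our_prob: float, yes_ask: float) -> float:
    """Edge on the OVER side: model probability minus the YES ask."""
    return our_prob - yes_ask


def edge_under(our_prob: float, no_ask: float) -> float:
    """Edge on the UNDER side: (1 - model probability) minus the NO ask."""
    return (1.0 - our_prob) - no_ask


def pick_signal(our_prob: float, yes_ask: float,
                no_ask: float) -> Optional[Tuple[str, float]]:
    """Return (side, edge) if a side clears both gates, else None."""
    candidates = [
        ("OVER", edge_over(our_prob, yes_ask), yes_ask),
        ("UNDER", edge_under(our_prob, no_ask), no_ask),
    ]
    qualifying = [
        (side, edge) for side, edge, price in candidates
        if edge >= MIN_EDGE and MIN_PRICE <= price <= MAX_PRICE
    ]
    if not qualifying:
        return None
    # Assumption: prefer the side with the larger edge.
    return max(qualifying, key=lambda c: c[1])


print(pick_signal(our_prob=0.58, yes_ask=0.50, no_ask=0.52))
print(pick_signal(our_prob=0.58, yes_ask=0.56, no_ask=0.46))
```

The first call clears the 5pp gate on the OVER side; the second has only a 2pp OVER edge and a negative UNDER edge, so no signal fires.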

4. Data

Schedule, weather, linescore from statsapi.mlb.com. Pitcher hand + season ERA, WHIP, K/9, BAA, IP from batched people endpoint. Kalshi markets from 7 MLB series.

Entry Price & P&L

On binary contracts, every loss costs the same ($100 on a $100 stake) regardless of entry price. But wins pay out inversely to what you paid. This asymmetry means entry price matters as much as edge percentage when choosing between signals.

Entry	Win Profit	Loss	Payout Ratio	Break-even WR
25¢	+$300	−$100	3 : 1	25%
35¢	+$186	−$100	1.9 : 1	35%
50¢	+$100	−$100	1 : 1	50%
65¢	+$54	−$100	0.5 : 1	65%
80¢	+$25	−$100	0.25 : 1	80%
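The table follows from one identity: a $100 stake at price p buys 100/p contracts that each pay $1, so a win returns 100 × (1 − p) / p in profit, and the break-even win rate equals the entry price. The sketch below recomputes the rows, ignoring fees; the function names are hypothetical.

```python
def win_profit(price: float, stake: float = 100.0) -> float:
    """Profit on a win for a flat stake at the given entry price (0-1)."""
    return stake * (1.0 - price) / price


def break_even_wr(price: float) -> float:
    """Win rate where wr * win_profit equals (1 - wr) * stake."""
    return price


for cents in (25, 35, 50, 65, 80):
    price = cents / 100
    profit = win_profit(price)
    print(f"{cents}c  win +${profit:.0f}  loss -$100  "
          f"payout {profit / 100:.2f} : 1  break-even {break_even_wr(price):.0%}")
```

Setting wr × profit = (1 − wr) × stake and substituting the profit formula gives wr = p, which is why the last column simply mirrors the entry price.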

When picking between signals: a 10pp edge at 40¢ is worth materially more than a 10pp edge at 70¢. The 40¢ signal pays about 3.5× as much on a win (+$150 vs. +$43) with the same $100 downside. Weight entry price alongside your baseball read.

Why concentration happens: a single win at 26¢ (+$285) produces more P&L than the next three wins at 55¢ combined (+$82 each, about +$245 total). The top games in the P&L table almost always have the cheapest entries, not the highest edge.

The trap: cheap entries (<35¢) win less often — Kalshi is pricing them cheap for a reason. The edge has to be large enough to overcome the lower base rate. An 8pp edge at 30¢ is better math than a 12pp edge at 70¢, but only if the model's probability estimate is actually correct at that tail.
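To make the comparison concrete, expected profit on a flat stake can be computed from entry price and edge. This is a sketch, assuming the model's win probability is exactly price + edge and ignoring fees; `ev_per_stake` is a hypothetical helper. Algebraically the result reduces to stake × edge / price.

```python
def ev_per_stake(price: float, edge: float, stake: float = 100.0) -> float:
    """Expected profit on a flat stake; price and edge are fractions (0-1)."""
    win_prob = price + edge                        # model probability
    profit_if_win = stake * (1.0 - price) / price  # payout net of stake
    return win_prob * profit_if_win - (1.0 - win_prob) * stake


for price, edge in ((0.40, 0.10), (0.70, 0.10), (0.30, 0.08), (0.70, 0.12)):
    print(f"{edge * 100:.0f}pp edge at {price * 100:.0f}c: "
          f"EV {ev_per_stake(price, edge):+.2f} per $100")
```

Under these assumptions the 8pp edge at 30¢ is worth about $26.67 per $100 staked versus about $17.14 for the 12pp edge at 70¢. If the model overstates both probabilities by 5pp, both fall to about $10, so the cheap entry loses proportionally more of its apparent value.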