
MLB Market Scanner

Weather + park + pitcher model vs Kalshi Over/Under ladder. Phase 2 — Dry run · signals observed, no trades

Pricing guidance: evaluate entry price before acting

Post-4/21 analysis shows the model’s win rate (WR) is solid across all buckets (60-70%), but full-game buckets are losing money because entry prices exceed what the edge supports. The counterfactual backtest of the 4/24 pitcher-threshold fix showed no P&L improvement on 4/21-4/23, which indicates the losses are price-driven, not model-driven.

F5 — all prices

Working as designed. F5 OVER is the cleanest cell (75%+ WR).

Full Game / UNDER ≤ 70¢

Needs ~60% WR to break even, which is below the model’s actual 69% WR. OK.

Full Game / UNDER > 70¢

Needs 70%+ WR to break even. Historical WR is 69%. Skip.

Full Game / OVER > 60¢

50% historical WR. Coin flip being priced as favorite. Skip.

No hard gate in code

Signals still fire at all prices; this is informational guidance while we collect more data.
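
As a rough illustration, the guidance above can be written as a small lookup. This is a sketch only: the function name, argument names, and return strings are hypothetical, and per the note above nothing like this gates signals in the actual code. Full-game OVER at 60¢ or below is not covered by the guidance, so the sketch returns "no guidance" for it rather than guessing.

```python
def pricing_guidance(view: str, side: str, price_cents: int) -> str:
    """Return the informational guidance for a signal (hypothetical helper).

    view: "f5" or "full_game"; side: "OVER" or "UNDER".
    """
    if view == "f5":
        return "ok"  # F5: all prices, working as designed
    if side == "UNDER":
        # Full-game UNDER: fine at <= 70 cents, skip above
        return "ok" if price_cents <= 70 else "skip"
    if side == "OVER" and price_cents > 60:
        return "skip"  # 50% historical WR priced as a favorite
    return "no guidance"  # full-game OVER <= 60 cents is not covered above


print(pricing_guidance("full_game", "UNDER", 72))
```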

v2.2 agreement — two EV-positive cells (n=343 backfill, 4/13-4/24)

Backfill of all historical signals against the v2.2 (offense + home/road splits) model revealed an asymmetric pattern. Both EV-positive cells now get a mauve badge on the signal pill:

Cell                  WR    ROI    n     Verdict
✓ OVER + agrees       74%   +18%   73    ✅ act on it
✗ UNDER + disagrees   76%   +16%   45    ✅ act on it
OVER + disagrees      67%   +5%    9     (tiny n)
UNDER + agrees        64%   −8%    126   🚨 weakest

Counter-intuitive but real: on UNDER signals, v2.2 disagreement (i.e., v2.2 expects more runs than v1) is when v1's UNDER call is at its best. The mauve ✗ v2.2 badge marks these. UNDER signals where v2.2 agrees are the worst cell in the entire model lineup — 64% WR isn't enough at the avg entry price.

Why? UNDER alpha lives in volatile pitcher matchups where v1's pitcher-only model nails it but v2.2's offense factor pushes the projection up. When both models agree on UNDER, it's a low-variance game and the price already reflects that — no edge.

Signal Tracking Performance

model-only, $100/signal hypothetical

How It Works

1. Expected Total

Base (8.6 FG / 4.2 F5) × park factor × temperature × wind × pitcher quality (composite ERA + WHIP + K/9 + BAA). F5 uses 1.5× pitcher amplification because starters carry 100% of F5 with no bullpen dilution. Domes mute weather; retractable roofs apply it at half strength.
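
The multiplicative structure can be sketched as below. Only the base totals, the 1.5× F5 amplification, and the dome / retractable-roof rule come from the text. Everything else is an assumption made for illustration: each factor is taken to be a multiplier centered on 1.0, and "amplification" and "half strength" are interpreted as scaling a factor's deviation from 1.0. The scanner's real factor definitions may differ.

```python
BASE_RUNS = {"full_game": 8.6, "f5": 4.2}  # from the text
F5_PITCHER_AMPLIFICATION = 1.5             # from the text
# Assumed reading: roof type scales how strongly weather applies.
ROOF_WEATHER_STRENGTH = {"open": 1.0, "retractable": 0.5, "dome": 0.0}


def _scale(factor: float, strength: float) -> float:
    """Scale a multiplier's deviation from 1.0 (assumed interpretation)."""
    return 1.0 + strength * (factor - 1.0)


def expected_total(view: str, park: float, temperature: float, wind: float,
                   pitcher: float, roof: str = "open") -> float:
    """Expected runs for "full_game" or "f5"; all factors are multipliers near 1.0."""
    weather_strength = ROOF_WEATHER_STRENGTH[roof]
    pitcher_strength = F5_PITCHER_AMPLIFICATION if view == "f5" else 1.0
    return (BASE_RUNS[view]
            * park
            * _scale(temperature, weather_strength)
            * _scale(wind, weather_strength)
            * _scale(pitcher, pitcher_strength))


# Hypothetical inputs: hitter-friendly park, warm day, strong starters.
print(expected_total("full_game", park=1.05, temperature=1.03, wind=1.0, pitcher=0.92))
```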

2. Probability

Normal distribution around expected mean (σ=3.0 full game, 1.9 F5). P(Over X.5) = 1 − Φ((X.5 − mean) / σ).
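
A minimal sketch of this step using Python's standard-library `statistics.NormalDist`. The σ values are the ones stated above; the function and argument names are illustrative, not taken from the scanner's code.

```python
from statistics import NormalDist

SIGMA = {"full_game": 3.0, "f5": 1.9}  # from the text


def prob_over(line: float, expected_total: float, view: str = "full_game") -> float:
    """P(total > line) = 1 - Phi((line - mean) / sigma)."""
    return 1.0 - NormalDist(mu=expected_total, sigma=SIGMA[view]).cdf(line)


# Hypothetical example: expected total 9.1 runs against an 8.5 line.
print(round(prob_over(8.5, 9.1), 3))
```

With the expected total exactly on the line the probability is 0.5, and each additional run of expected total moves a full-game threshold by a third of a standard deviation.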

3. Edge

Per threshold: ourProb − yesAsk (over) or (1 − ourProb) − noAsk (under). Signal fires when edge ≥ 5pp, price in 15-85¢.
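
The edge rule can be sketched as follows, with prices expressed as fractions of a dollar. The 5pp minimum edge and the 15-85¢ price band come from the text. The function name and the choice to return the single best qualifying side are assumptions for illustration.

```python
from typing import Optional, Tuple

MIN_EDGE = 0.05                     # 5pp, from the text
MIN_PRICE, MAX_PRICE = 0.15, 0.85   # 15-85 cents, from the text


def find_signal(our_prob_over: float, yes_ask: float,
                no_ask: float) -> Optional[Tuple[str, float]]:
    """Return (side, edge) for the best qualifying side at one threshold, else None."""
    candidates = [
        ("OVER", our_prob_over - yes_ask, yes_ask),
        ("UNDER", (1.0 - our_prob_over) - no_ask, no_ask),
    ]
    qualifying = [
        (side, edge) for side, edge, price in candidates
        if edge >= MIN_EDGE and MIN_PRICE <= price <= MAX_PRICE
    ]
    if not qualifying:
        return None
    return max(qualifying, key=lambda item: item[1])


# Hypothetical quote: model says 62% over, YES ask 50 cents, NO ask 52 cents.
print(find_signal(0.62, 0.50, 0.52))
```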

4. Data

Schedule, weather, linescore from statsapi.mlb.com. Pitcher hand + season ERA, WHIP, K/9, BAA, IP from batched people endpoint. Kalshi markets from 7 MLB series.

Entry Price & P&L

On binary contracts, every loss costs the same ($100 on a $100 stake) regardless of entry price. But wins pay out inversely to what you paid. This asymmetry means entry price matters as much as edge percentage when choosing between signals.

Entry   Win Profit   Loss    Payout Ratio   Break-even WR
25¢     +$300        −$100   3 : 1          25%
35¢     +$186        −$100   1.9 : 1        35%
50¢     +$100        −$100   1 : 1          50%
65¢     +$54         −$100   0.5 : 1        65%
80¢     +$25         −$100   0.25 : 1       80%
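
The table values follow from the contract's $1 settlement: a $100 stake at entry price c cents buys 100/c dollars of contracts, so a win returns $100 × (100 − c)/c in profit, and the break-even win rate equals the entry price. A short sketch that reproduces the rows (function names are illustrative):

```python
def win_profit(entry_cents: float, stake: float = 100.0) -> float:
    """Profit on a winning binary contract bought at entry_cents with a fixed stake."""
    return stake * (100.0 - entry_cents) / entry_cents


def break_even_wr(entry_cents: float) -> float:
    """Win rate at which p * win_profit equals (1 - p) * stake."""
    return entry_cents / 100.0


for cents in (25, 35, 50, 65, 80):
    profit = win_profit(cents)
    print(f"{cents}¢  +${profit:.0f}  -$100  {profit / 100:.2f} : 1  {break_even_wr(cents):.0%}")
```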

When picking between signals: a 10pp edge at 40¢ is worth materially more than a 10pp edge at 70¢. The 40¢ signal pays about 3.5× as much on a win (+$150 vs. +$43) with the same $100 downside. Weight entry price alongside your baseball read.

Why concentration happens: a single win at 26¢ (+$285) produces more P&L than the next three wins at 55¢ combined (+$82 each, about $245 total). The top games in the P&L table almost always have the cheapest entries, not the highest edge.

The trap: cheap entries (<35¢) win less often — Kalshi is pricing them cheap for a reason. The edge has to be large enough to overcome the lower base rate. An 8pp edge at 30¢ is better math than a 12pp edge at 70¢, but only if the model's probability estimate is actually correct at that tail.
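
To make the comparison concrete, here is a sketch of expected value per $100 stake, assuming the model's probability is exactly right (the caveat in the paragraph above). Under that assumption the expression simplifies to stake × edge / price. The function name is illustrative.

```python
def ev_per_stake(edge_pp: float, entry_cents: float, stake: float = 100.0) -> float:
    """Expected profit per stake if the true win probability is price + edge."""
    price = entry_cents / 100.0
    model_prob = price + edge_pp / 100.0
    win_profit = stake * (1.0 - price) / price
    return model_prob * win_profit - (1.0 - model_prob) * stake


print(round(ev_per_stake(8, 30), 2))   # 8pp edge at 30 cents
print(round(ev_per_stake(12, 70), 2))  # 12pp edge at 70 cents
```

On these numbers the 8pp edge at 30¢ is worth about $26.67 per $100 staked against about $17.14 for the 12pp edge at 70¢, but the result is only as good as the probability estimate fed in.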