Modeling NFL Like an Analyst: Spread, Total, and Player Props from Scratch

Read the price, role, and market first

Build an NFL player prop model and matching spread + total models from scratch — features, calibration, vig removal, and Kelly sizing in your browser.
Shark Snip Editorial

The phrase "NFL model" gets thrown around loosely. To a tout it means "today's pick." To a system bettor it means "any team off a loss covers." To an analyst it means a numerical projection with a calibrated probability attached, evaluated against closing lines on out-of-sample data, and sized with a fractional Kelly stake. This guide walks the analyst path end to end (spread, total, and an NFL player prop model) using only public data and a browser. You can run every step in the Shark Snip Workshop while reading.

What a model actually is (and what it isn't)

A model is a function that maps inputs to a probability or expected value. It is not a pick, a system, or a vibe. The distinction matters because each of those three competing things gets called a "model" on betting Twitter, and they fail in different ways.

  • A pick is a single output without a probability. "Take the Eagles -3" tells you nothing about confidence, sample, or edge. Picks are unfalsifiable in any rigorous sense.
  • A system is a hard rule applied to every game that fits the criteria. "Home dogs after a road loss" is a system. Systems are testable but rigid — they ignore context that a continuous model captures naturally.
  • A model outputs a number — predicted margin, predicted total, predicted prop, or a probability. It can be evaluated on Brier score, log-loss, and realized return. It updates when new data arrives. It is wrong in measurable ways and improvable through measurable changes.

If you compare your output to the line and decide whether to bet, you have something model-like. If you also compute a probability, evaluate calibration, and size with Kelly, you have a model. That is the bar the rest of this article assumes. For market mechanics that the model has to beat, the spread mechanics primer covers the half-point key numbers, and the sharp vs public guide covers how lines move.

Building a baseline NFL spread model

Spread models predict either game margin or cover probability against the posted line. The cleanest target is margin — home final score minus away final score — because everything downstream (cover probability, expected value, Kelly stake) is a deterministic function of that one number plus a noise estimate.

Step 1: Set up the target and the split

Pull NFL play-by-play from the public nflverse data (2018-2024 gives you about 1,800 regular-season games, enough to train without overfitting). Aggregate to one row per game with home and away final scores, the closing spread, and the closing total. Set margin = home_score - away_score as your target.

Split walk-forward, never randomly. A typical split is 2018-2022 train, 2023 validate, 2024 test. Random splits leak future information into the training set because the league trends each season — pass rates, kicker accuracy, hash-mark rules — and a random shuffle gives the model peeks at outcomes it would not have at bet time.
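The season-based split described above could be sketched as follows. This is a minimal illustration, not a full loader: the row layout (one dict per game with `season`, `home_score`, `away_score`) and the function name `season_split` are assumptions made for the example.

```python
def season_split(games, train_end=2022, validate=2023, test=2024):
    """Split game rows by season so no later season leaks into training."""
    train = [g for g in games if g["season"] <= train_end]
    val = [g for g in games if g["season"] == validate]
    held_out = [g for g in games if g["season"] == test]
    return train, val, held_out


def add_margin(games):
    """Attach the target: home final score minus away final score."""
    for g in games:
        g["margin"] = g["home_score"] - g["away_score"]
    return games


# Hypothetical rows standing in for the aggregated play-by-play data.
games = add_margin([
    {"season": 2021, "home_score": 27, "away_score": 20},
    {"season": 2023, "home_score": 17, "away_score": 24},
    {"season": 2024, "home_score": 30, "away_score": 13},
])
train, val, held_out = season_split(games)
```

Splitting on the season column rather than shuffling rows is what keeps later seasons out of the training set.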

Step 2: Compute opponent-adjusted Elo and EPA

Three features carry most of the predictive weight beyond the closing line itself:

  1. Net yards per play differential. Offensive yards per play minus defensive yards per play allowed, last eight games, opponent-adjusted. Yards is noisier than EPA but more transparent and almost as predictive.
  2. Net EPA per play. Expected Points Added on offense minus EPA allowed on defense. Pull from nflverse pre-computed for free. Trailing eight games, opponent-adjusted, dropping garbage time (keep only plays with win probability inside 5-95%).
  3. Opponent-adjusted Elo. Start each team at 1500, add a home-field bump worth about 1.7 points of spread (about 42.5 Elo points at the 25-to-1 ratio), update after each game with K = 20. Convert the Elo gap to a point-spread expectation with the standard 25-point ratio (a 25 Elo gap implies a 1-point spread).
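A minimal Elo sketch using the constants from the list above. Two things here are assumptions rather than anything the guide specifies: the home-field bump is read as 1.7 points of spread (converted to Elo units with the 25-to-1 ratio), and the win expectation uses the conventional 400-point logistic scale.

```python
K = 20                    # update speed from the text
ELO_PER_POINT = 25        # 25 Elo points per 1 point of spread
HOME_FIELD_POINTS = 1.7   # assumed to be expressed in spread points


def expected_home_win(home_elo, away_elo):
    """Logistic Elo expectation with the home bump converted to Elo units."""
    diff = home_elo - away_elo + HOME_FIELD_POINTS * ELO_PER_POINT
    return 1.0 / (1.0 + 10 ** (-diff / 400))


def elo_spread(home_elo, away_elo):
    """Expected home margin in points implied by the Elo gap."""
    return (home_elo - away_elo) / ELO_PER_POINT + HOME_FIELD_POINTS


def update(home_elo, away_elo, home_result):
    """Return post-game ratings; home_result is 1.0 (win), 0.5 (tie), 0.0 (loss)."""
    delta = K * (home_result - expected_home_win(home_elo, away_elo))
    return home_elo + delta, away_elo - delta
```

The update is zero-sum, so the league average stays at 1500 unless you add a separate between-season regression step.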

These three plus the closing spread, starting QB availability, rest differential, and travel distance form a seven-feature baseline that closes most of the gap to a sharp number. The closing line itself is the dominant feature — it aggregates everyone else's models. Your job is to find the small marginal lift that comes from the other six. Open the Shark Snip builder to wire these features into a blueprint visually.

Step 3: Train and evaluate

Train any of: a ridge regression, a gradient-boosted tree, or a small neural net. For 1,800 games and seven features, ridge wins on simplicity and rarely loses on accuracy versus a properly regularized GBM. Evaluate on RMSE and on against-the-spread cover rate. Healthy numbers for an NFL margin model on a recent test season:

  • Validation RMSE: 12.5 to 14 points
  • ATS cover rate: 51% to 54% on a 200-game test season
  • Mean absolute error vs closing line: under 4 points

If your test cover rate is above 56% on a small sample, suspect data leakage before celebrating. The most common leak is rolling stats that include the current game in the average: use only the prior eight games, lagged by one. The injury impact study covers another classic leak: feeding the model inactives that were announced before the close but after the time you would have bet.
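The lagged rolling average mentioned above can be sketched in a few lines. The function name and the plain-list input are illustrative assumptions; the point is only that the window for game i ends at game i-1.

```python
def lagged_rolling_mean(values, window=8):
    """For each game i, average up to `window` games before i, excluding game i."""
    out = []
    for i in range(len(values)):
        prior = values[max(0, i - window):i]  # slice stops before index i
        out.append(sum(prior) / len(prior) if prior else None)
    return out


# Example with a short window: the first game has no history.
features = lagged_rolling_mean([1, 2, 3], window=2)  # [None, 1.0, 1.5]
```

If the value for a game ever changes when you edit that same game's stat, the window is not lagged and the feature is leaking.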

Building a totals model

Totals models look easier because both teams contribute, but they are sneakier. Pace, neutral pass rate, and weather drive totals more than raw scoring efficiency, and the closing total absorbs less information from the public than the closing spread does. That makes the totals market both more beatable and more punishing when you are wrong.

Pace and neutral pass rate

Pace is plays per minute when the game is competitive — exclude two-minute drills and garbage time. Neutral pass rate is the share of plays that are passes when win probability sits between 25% and 75%. These two features capture how much volume each offense will generate, separate from how efficient that volume is. The same offense at 70 plays per game produces dramatically more variance in totals than at 60.
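As a rough sketch, neutral pass rate can be computed from play rows like this. The field names `wp` and `play_type` are assumptions about how your play-by-play rows are keyed; adjust them to whatever your data source provides.

```python
def neutral_pass_rate(plays, wp_low=0.25, wp_high=0.75):
    """Share of run/pass plays that are passes while win probability is neutral."""
    neutral = [
        p for p in plays
        if wp_low <= p["wp"] <= wp_high and p["play_type"] in ("pass", "run")
    ]
    if not neutral:
        return None  # no neutral-script plays in the sample
    passes = sum(1 for p in neutral if p["play_type"] == "pass")
    return passes / len(neutral)
```

Filtering on play type first keeps punts, kicks, and kneel-downs from diluting the rate.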

Weather and venue

Weather is binary-ish: dome games are weatherproof, outdoor games face wind first, then rain, then temperature. Wind above 15 mph drops expected total points by roughly 3-5 depending on the matchup. Rain shifts run-pass mix and lowers efficiency. Cold under 25°F has a smaller measurable effect than the popular narrative suggests. Pull current conditions from a free weather API at bet time, not the historical average for the venue.

Divisional dampening

Divisional rematches in November and December tend to score under their projection by 1-3 points. The simple explanation is film: both coordinators have studied each other twice, surprise plays disappear, and execution gets cleaner on defense than on offense. Code this as a flag: is_divisional AND week >= 10, then let the model decide the magnitude. The totals deep dive walks through the historical splits in detail.
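The flag itself is a one-liner; a sketch, with the function name chosen for illustration:

```python
def late_divisional_flag(is_divisional, week):
    """1 when a divisional game falls in week 10 or later, else 0."""
    return int(bool(is_divisional) and week >= 10)
```

Encoding it as 0/1 leaves the size of the effect to the fitted coefficient rather than hard-coding a point adjustment.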

Building a player prop model

Props are where the modeling work pays off most. The book posts hundreds of lines per week, the soft ones get shaded but rarely killed, and a calibrated projection beats a feel pick by a wider margin than on game lines. The trick is building from features upward rather than guessing at outputs.

Features start with usage

Stat projections are usage times efficiency. Both halves matter, but usage is more stable week to week and is what you should anchor your model on. The four core usage features for skill-position props:

  • Snap percentage — share of offensive snaps the player is on the field. The ceiling on every counting stat.
  • Route share — share of pass plays where the player ran a route. For receivers and tight ends.
  • Target share — share of team targets the player drew. Combine with team pass volume for a target projection.
  • Carry share — for running backs, share of team rush attempts. Combine with red-zone carry share for TD modeling.

The target share vs air yards study walks through which of these stabilizes fastest after a role change — usage features can shift by 15+ percentage points in the first three games after an injury or trade, so freshness matters.

Layer aDOT and route mix on top

Average Depth of Target (aDOT) and route mix turn target volume into yardage. A 7-target receiver running deep posts (aDOT 18) projects to ~120 yards; a 7-target slot receiver running drags (aDOT 5) projects to ~50. The nflverse weekly receiver charts publish per-game aDOT and route distribution for free.

Build the projection

For receiving yards, the simple model is: projected_yards = team_pass_attempts * target_share * catch_rate * yards_per_reception. Train each multiplier separately on rolling samples — team pass attempts is a function of opponent pace and game script, target share is a player rate, catch rate and yards-per-reception are matchup-adjusted player rates. Then combine and add a residual normal distribution for variance. This separates which part of the model is wrong when the projection misses, instead of leaving you with a single opaque output.
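A sketch of the multiplicative projection and the residual-normal step. The input values and the residual standard deviation of 30 yards are made-up placeholders, not estimates from data; in practice each multiplier would come from its own fitted component.

```python
from statistics import NormalDist


def project_receiving_yards(team_pass_attempts, target_share,
                            catch_rate, yards_per_reception):
    """Usage times efficiency, following the formula in the text."""
    return team_pass_attempts * target_share * catch_rate * yards_per_reception


def over_probability(projection, line, residual_sd):
    """P(yards > line) under a normal residual centered on the projection."""
    return 1.0 - NormalDist(mu=projection, sigma=residual_sd).cdf(line)


# Placeholder inputs: 35 attempts, 25% target share, 65% catch rate, 12.0 Y/R.
proj = project_receiving_yards(35, 0.25, 0.65, 12.0)  # 68.25 yards
p_over = over_probability(proj, line=62.5, residual_sd=30.0)
```

A normal residual is a simplification: receiving yards are right-skewed and floored at roughly zero, so treat the tail probabilities with caution.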

From margin to cover probability — the math that matters

A margin prediction is not a bet recommendation until it becomes a probability. The standard move:

  1. Predict margin as a number, e.g., predicted_margin = +5.2 means home wins by 5.2 on average.
  2. Compute the residual: edge = predicted_margin - closing_spread. If the closing spread is +3 (home favored by 3) and you predict +5.2, your edge is 2.2 points.
  3. Treat that edge as the mean of a normal distribution with standard deviation matching the empirical residual RMSE — about 13.5 points for NFL game margins on a well-built model.
  4. Cover probability = 1 - normalCdf(0, mean=edge, sd=13.5). For an edge of 2.2 points: cover probability is 56.5%.
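The four steps above translate directly into code. This sketch uses Python's standard-library `NormalDist`; the function name `cover_probability` is an illustrative choice, and the 13.5-point standard deviation is the figure quoted in the text.

```python
from statistics import NormalDist


def cover_probability(predicted_margin, closing_spread, sd=13.5):
    """P(actual margin beats the spread), both on the home-margin scale."""
    edge = predicted_margin - closing_spread
    return 1.0 - NormalDist(mu=edge, sigma=sd).cdf(0.0)


p = cover_probability(5.2, 3.0)  # edge of 2.2 points, roughly 0.565
```

Both inputs must use the same sign convention (positive means the home team wins by that many), otherwise the edge flips sign.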

That feels small. The reason it feels small is that the public sees a 5.2 prediction against a 3 line and assumes a huge edge. The actual cover probability is 56.5%: meaningful, but not enough for a max bet. The standard deviation of NFL game margins is wide because football is high-variance. A two-point projection edge becomes a cover probability roughly 6 percentage points above a coin flip, and about 3.5 to 4 points above the 52.4% break-even at -110, not the 10+ that an unrigorous gut would assume.

Calibration: what 60% should mean

Calibration is the test of whether your probabilities mean what they say. A 60% confidence bet should win 60% of the time over a large sample. Three tools:

  • Brier score. Mean squared error between predicted probability and actual outcome (1 or 0). Lower is better. A coin flip scores 0.25; anything under 0.245 on NFL spreads is real signal.
  • Log-loss. Punishes confident wrong predictions harder than Brier. More sensitive to calibration drift at the extremes.
  • Reliability diagram. Bin predictions (e.g., 50-52%, 52-55%, 55-58%, 58-62%, 62%+) and plot the actual cover rate in each bucket. A perfectly calibrated model lies on the diagonal.
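The three tools above fit in a short sketch. The bucket edges follow the example bins in the text, with an upper edge of 1.01 assumed so that a probability of exactly 1.0 lands in the top bucket; function names are illustrative.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-12):
    """Average negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def reliability_table(probs, outcomes,
                      edges=(0.50, 0.52, 0.55, 0.58, 0.62, 1.01)):
    """Rows of (low, high, count, actual rate) for each non-empty bucket."""
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        hits = [y for p, y in zip(probs, outcomes) if lo <= p < hi]
        if hits:
            rows.append((lo, hi, len(hits), sum(hits) / len(hits)))
    return rows
```

Reporting the count alongside each bucket's rate matters: a bucket with a dozen bets can sit far off the diagonal by chance alone.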

A healthy spread model's reliability diagram shows predicted and actual cover rates within a percentage point at every bucket, with sample sizes large enough to justify the points. When the gap exceeds three points consistently in one direction, the model is biased; when it widens at one bucket only, the feature combination driving that bucket is broken. Either way, fix calibration before adding new features.

From model to bet: vig removal, Kelly, and line shopping

A calibrated probability is still not a bet. Three more steps separate a number from a sized wager.

Vig removal

Books post both sides with built-in margin. A standard NFL spread is -110/-110, which converts to implied probabilities of 52.38% and 52.38%. They sum to 104.76%, and the extra 4.76% is the overround (equivalent to a hold of about 4.5%). To get the no-vig probability, divide each implied probability by the sum: 52.38 / 104.76 = 50% on each side. The fair line is 50/50. Your edge over the market's fair estimate is your model's probability minus the no-vig probability. The raw implied probability plays a different role: it is the break-even you must clear at the posted price, so mixing the two up misstates your edge by roughly the book's margin on that side.
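A small sketch of the conversion and normalization described above. It uses the simple proportional method (divide by the sum), which is the one the text describes; other de-vigging methods exist and can give slightly different answers on lopsided prices.

```python
def implied_probability(american_odds):
    """Raw implied probability from American odds, vig included."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def remove_vig(odds_a, odds_b):
    """Return (fair_a, fair_b, overround) via proportional normalization."""
    pa, pb = implied_probability(odds_a), implied_probability(odds_b)
    total = pa + pb
    return pa / total, pb / total, total - 1.0


fair_a, fair_b, overround = remove_vig(-110, -110)  # 0.5, 0.5, about 0.0476
```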

Player props carry more vig — 4-8% on standard yardage and reception props, and over 10% on alternate lines. The sharp vs public guide walks through why prop hold is higher and how it varies by book.

Kelly fraction

The full Kelly formula for a -110 bet with probability p: stake_fraction = (p * 100 - (1 - p) * 110) / 100. For a 54% true probability, full Kelly is about 3.4% of bankroll. Almost no one bets full Kelly: variance is brutal, and any error in the probability estimate compounds. Quarter Kelly (about 0.85% of bankroll for a 54% bet) trades some long-run growth for much less drawdown.
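For reference, here is a sketch of Kelly sizing in its textbook form, f = (b*p - q) / b, where b is the net payout per unit staked (100/110 at -110) and q = 1 - p. The `multiplier` argument for fractional Kelly and the floor at zero for negative-edge bets are choices made for this example.

```python
def kelly_fraction(p, american_odds=-110, multiplier=1.0):
    """Kelly stake as a fraction of bankroll; multiplier=0.25 is quarter Kelly."""
    if american_odds < 0:
        b = 100 / -american_odds   # net win per unit staked on a favorite price
    else:
        b = american_odds / 100    # net win per unit staked on an underdog price
    full = (b * p - (1 - p)) / b
    return max(0.0, full) * multiplier  # never stake on a negative edge


full = kelly_fraction(0.54)
quarter = kelly_fraction(0.54, multiplier=0.25)
```

Feeding this function an uncalibrated probability is where sizing goes wrong, since the stake scales directly with the claimed edge.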

The discipline trap: a model that claims 60% on every bet but is actually 53% calibrated will recommend Kelly stakes that bankrupt the bettor. Calibration matters more than raw cover rate for sizing decisions; see the calibration section above.

Line shopping

The same NFL spread varies across legal books by a half to a full point. On key numbers (3, 7, 10) a half-point of line value is worth 1.5-2% on win rate. Three or four legal books in your state captures most available shopping value. ESPN's scoreboard shows consensus lines for free; the actual best price requires checking the books themselves at peak market hours (Wednesday afternoon and Saturday morning are typical inflection points).

Backtesting honestly

The most common modeling failure is overfitting to a backtest. Three rules to keep yourself honest:

  1. Walk-forward only. Train on weeks 1-8, predict weeks 9-16, slide one week, repeat. A random split leaks information.
  2. Hold out a final test set you never look at until the end. Pick the 2025 season, run it once. If you tune to it, it stops being a test set.
  3. Charge real juice. -110 standard, -115 to -120 on alt lines, 4-8% hold on props. A backtest at zero juice is a fantasy.
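To see what the third rule does to a backtest, here is a minimal settlement sketch for flat one-unit bets. It ignores pushes and assumes every bet is placed at the same price, both simplifications made for the example.

```python
def settle(stake, american_odds, won):
    """Profit in units for one bet at American odds (pushes not modeled)."""
    if not won:
        return -stake
    if american_odds < 0:
        return stake * 100 / -american_odds
    return stake * american_odds / 100


def backtest_roi(results, american_odds=-110):
    """results: one boolean per 1-unit bet. Returns profit per unit staked."""
    profit = sum(settle(1.0, american_odds, won) for won in results)
    return profit / len(results)


record = [True] * 53 + [False] * 47
roi_with_juice = backtest_roi(record)        # about +1.2% at -110
roi_no_juice = backtest_roi(record, 100)     # +6% at even money
```

The same 53-47 record looks five times better at even money, which is why a zero-juice backtest says little about live results.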

The sharp vs public piece is worth re-reading after your first backtest: sharp money tends to enter the market mid-week, and a model that beats the opening line often has less margin against Sunday's closing line than its raw cover rate suggests.

Building all of this in your browser

Every step above runs in the Shark Snip Workshop. The blueprint editor lets you drag the seven baseline spread features into a model, train with TensorFlow.js (no server, no GPU bill), backtest with walk-forward splits, view calibration with a reliability diagram, and publish to the live picks pages. Specific entry points:

  • Open the Workshop for the guided builder with topic presets for NFL spread, NFL total, and NFL player props.
  • Open /build for the lower-level brick editor where you can add custom features.
  • Compare your output to other published models on the leaderboards — same training data, same evaluation harness, transparent live cover rates.
  • Once you trust your model, list it on the marketplace so other users can subscribe to its picks.
  • For the live front-end of NFL picks generated by published models, check /gridiron for the current week's slate with model edges shown.

What good looks like after one season

A first NFL spread model trained on the seven baseline features should land near these numbers after a full season of live betting:

  • Live ATS cover rate: 51-53% over 200+ bets
  • Brier score under 0.245 on the same sample
  • Calibration drift under three percentage points at every reliability bucket
  • ROI per bet between about -2.6% and +1.2% at standard -110 juice, which is what a 51-53% cover rate implies

Those numbers feel modest because they are. Sustained 53% ATS at -110 is roughly a 1.2% edge per bet, which compounds to meaningful returns over thousands of bets but never produces the 60%+ headline results that tout services advertise. The honest number is the durable one. If your live cover rate diverges from the backtest by three or more percentage points after 100 bets, retrain on the most recent season — the market or the league has shifted under you.

Where to go next

Once a baseline NFL spread model is live, the most valuable extensions are:

  1. A totals model with the pace, neutral pass rate, weather, and divisional features described above. Cross-check it against your spread model — if both like the home team to score 30, your projection is consistent.
  2. A player prop model anchored on snap, route, target, and carry share. Player props are higher variance per bet but lower correlation across bets, so they diversify a portfolio that already has spread and total exposure.
  3. Fractional Kelly sizing applied to all three model outputs, with a portfolio-level cap so no single game exceeds a chosen percentage of bankroll.

The target share study is the right next read for prop modeling depth. The injury impact study is essential for handling late-week status changes that move markets and your model. The spread mechanics primer and the totals deep dive are companion pieces in this NFL markets cluster.

A note on responsibility

Modeling does not eliminate variance. A 53% true win rate can produce a 100-bet drawdown of 15+ units. Bankroll sizing matters more than feature engineering. Set a hard maximum stake, never chase, and treat any month where you exceed your limits as a loss regardless of P&L. The model is a tool; the bettor is the risk manager. Bet only what you can afford to lose, and use legal regulated books in your state.

Props and DFS example board

For props, DFS, and PrizePicks-style decisions, the names should reveal the input. Jokic assists, Shai points, Wembanyama blocks, Josh Allen rushing, Ja'Marr Chase receptions, and Christian McCaffrey touchdown equity all require different checks. Treat each player as a role-and-price puzzle rather than a logo on a pick card.

  • Fixed-line check: compare the app line to sportsbook consensus before calling it an edge.
  • Correlation check: do not pair legs that require opposite game scripts.
  • DFS check: salary, ownership, and late-swap flexibility can matter as much as median projection.
  • Tracking check: grade closing value and result separately so a lucky hit does not hide a bad line.

Use PrizePicks basics, NFL player props, and correlation math as the internal loop from projection to price to risk control.

Prop, DFS, and contest examples

Use names as evidence, not decoration. Josh Allen, Ja'Marr Chase, Bijan Robinson, and Puka Nacua, or the Eagles, Chiefs, Bills, and Lions, belong inside decisions and thresholds rather than in a list of names.

  • Prop EV example: if Amon-Ra St. Brown receptions are 6.5 at -120, a model median of 7.1 with a 56% over probability creates a fair threshold near -127; pass if the market jumps to 7.5 without a projection change.
  • DFS value example: projection divided by salary times 1,000 keeps the slate honest. A 20.4-point projection at $7,200 is 2.83x median value; tournaments need ceiling, leverage, and correlation on top of that.
  • Stack example: Patrick Mahomes with Travis Kelce and Xavier Worthy needs a bring-back plan from the opponent; Josh Allen with Keon Coleman and Dalton Kincaid needs rushing-TD cannibalization in the script notes.
  • PrizePicks example: Nikola Jokic rebounds, Devin Booker points, and Stephen Curry threes should not be treated as one generic “More” card; legs need hit rate, payout, and correlation checks.
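The arithmetic in the prop EV and DFS value examples above can be checked with two small helpers. The function names are illustrative.

```python
def fair_american_odds(p):
    """American odds at which a bet with win probability p breaks even."""
    if p >= 0.5:
        return -100 * p / (1 - p)
    return 100 * (1 - p) / p


def dfs_value(projection, salary):
    """Projected points per $1,000 of salary."""
    return projection / salary * 1000


threshold = fair_american_odds(0.56)   # about -127
value = dfs_value(20.4, 7200)          # about 2.83
```

Any posted price shorter than the fair threshold (for example -120 against a fair -127) has positive expected value if the 56% estimate is right.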

The next step should be a tool, not another opinion: compare the line on NFL player props, pressure-test salary in DFS tools, and log the close with bet tracking.

Research note board

Use this board before clicking a prop, DFS build, or same-game entry. The table is intentionally about thresholds, not fake certainty.

Step | Input | Example application | Cancel rule
Project the role | Snaps, routes, targets, carries, minutes, or usage | Josh Allen volume against the posted line | The player loses the role that created the projection
Price the market | Break-even odds, line shopping, hold, payout structure | Vig compared with sportsbook consensus | Juice or line movement removes the edge
Check correlation | Game script, teammate overlap, ownership, late news | Ja'Marr Chase paired with Eagles script notes | The legs need different games to happen

Model calibration: predicted vs observed

Predicted win probability bucket vs the empirical win rate inside that bucket on the test set. Points on the y=x reference line are perfectly calibrated; points below mean the model is overconfident in that bucket.

Prop OVER hit rate vs line distance from median

Empirical hit rate of OVER bets as the prop line moves away from the player projection median, measured in standard deviations. A line set 1sd below the median hits ~84% of the time — but books price the juice to match.
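The ~84% figure is what a normal distribution gives for a line one standard deviation below the median. A quick check, assuming (as a simplification) that the player's stat is normally distributed around the projection:

```python
from statistics import NormalDist


def over_hit_rate(sd_below_median):
    """P(stat > line) when the line sits this many SDs below the median."""
    return NormalDist().cdf(sd_below_median)


rate_at_one_sd = over_hit_rate(1.0)  # about 0.841
```

Real prop distributions are skewed, so empirical hit rates can differ from this curve, especially for touchdown and low-count markets.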

Frequently asked questions

What is an NFL player prop model and why build one?
An NFL player prop model predicts a stat — passing yards, receptions, rush attempts — from upstream features like target share, route share, snap percentage, and opponent strength. You build one because the prop market posts hundreds of lines per week, books shade the soft ones, and a calibrated projection plus a vig-removed line gives you a real edge instead of a feel pick.
Do I need to know Python to build a spread or total model?
No. The Shark Snip Workshop runs TensorFlow.js entirely in the browser, so you pick features from a catalog, train, backtest, and publish without writing code. Python helps if you want to scrape custom data — but for a baseline NFL margin model on nflverse features, the in-browser builder gets you to a published model in under an hour with full walk-forward validation.
How many features should a first NFL spread model use?
Five to seven. The closing spread carries roughly 60% of the predictive weight on its own. Net EPA per play, starting QB availability, rest differential, and travel distance add the marginal lift. Stacking 30 features on a 1,800-game training set produces noise, not signal — every extra feature needs to clear an out-of-sample improvement bar before it earns a slot.
What is vig removal and why does it matter for prop models?
Books post both sides of a prop with built-in margin (the vig or hold), typically 4-8% on NFL props versus ~4.5% on standard -110 spreads. To compare your projection against the market's fair estimate, divide each side's implied probability by the sum of both implied probabilities. A 53% projection against a 51% no-vig probability is a two-point edge over the market's fair estimate; the raw implied probability for that side, with vig baked in, sits higher than 51%, and confusing the two will mislead your bet sizing.
How do I turn a margin prediction into a cover probability?
Treat the residual (your predicted margin minus the closing spread) as the mean of a normal distribution with standard deviation around 13.5 points, the empirical RMSE of NFL game margins on 2018-2024 data. The cover probability is one minus the cumulative normal at zero. A two-point edge through that math comes out near 55.9%, not the 60%+ a beginner often assumes.
What does calibration mean for an NFL betting model?
Calibration asks whether your stated probabilities match reality. When the model says 55%, do those bets actually cover 55% of the time? Reliability diagrams plot predicted vs actual across buckets, while Brier score and log-loss aggregate the gap into a single number. A model with 53% raw cover rate but bad calibration will still bleed money — Kelly sizing on miscalibrated edges is the fastest way to ruin.
Should I bet every model edge or filter to the biggest ones?
Filter, but with a tested threshold. Sort historical bets by predicted edge and check cover rate per bucket — a healthy model shows monotonic lift (51% at 0-1 point edge, 57%+ at 5+ points). Bet the top three buckets where the lift is real. Betting every nominal edge inflates volume, increases vig drag, and dilutes the bankroll growth that fractional Kelly sizing is meant to capture.
How is NFL totals modeling different from spread modeling?
Totals weight pace, neutral pass rate, and weather more heavily and care less about clutch-time efficiency. Wind above 15 mph drags expected points by roughly 3-5 in outdoor games, depending on the matchup; rain shifts the run-pass mix; divisional rematches in December dampen totals because both coordinators have film. Re-fit your totals model with these features rather than reusing the spread feature set unchanged.
What is line shopping and how much edge does it add?
Line shopping means checking multiple sportsbooks for the same market and taking the most favorable price. On NFL spreads a half-point of line value is worth roughly 1.5-2% on win rate at the key numbers (3, 7, 10), and on player props the spread between books at peak market hours can hit a full unit. Three to four legal books in your state typically capture most of the available edge without operational overhead.
Where can I publish my NFL model and track its live performance?
Trained models in the Shark Snip Workshop publish straight to the picks pages and the leaderboards. Live cover rate, edge per bet, sample size, and rolling Brier score update after each game. If live performance diverges from the backtest by more than three percentage points across a 100-bet sample the market has likely caught up — retrain on the most recent season and audit which features have decayed.

NFL 2026 market context

NFL betting examples work best when quarterback, team, and market context stay attached: Chiefs/Bills/Ravens/Eagles/Lions angles should connect to price, schedule, injuries, and game environment.
Chart view of the article's core numbers. Source: inline-nfl-modeling-calibration.
