
Model Calibration Analysis

Active sport: nfl

This page analyzes the Power Rating Model's performance on historical NFL data (2024–2025 seasons) to check whether its predicted probabilities are well calibrated.

What is Calibration?

A model is well-calibrated if its predicted probabilities match actual outcomes. For example, if we look at all the times the model predicted a 70% chance of winning, the teams in that group should have actually won about 70% of the time.

Many models are good at ranking outcomes but produce poorly calibrated raw probabilities. Betting on uncalibrated probabilities means you are miscalculating your edge.
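A small worked example may make the cost concrete. The sketch below assumes decimal odds and a unit stake, so the expected profit is the win probability times the odds, minus one. The function name `expected_edge` and every number in it are hypothetical and chosen only for illustration; none of them come from the model.

```python
def expected_edge(win_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked at the given decimal odds."""
    return win_prob * decimal_odds - 1.0


# Hypothetical numbers for illustration only (not model output).
decimal_odds = 1.60      # break-even probability is 1 / 1.60 = 62.5%
raw_prob = 0.70          # what an overconfident model might report
calibrated_prob = 0.62   # what the same prediction might become after calibration

raw_edge = expected_edge(raw_prob, decimal_odds)
calibrated_edge = expected_edge(calibrated_prob, decimal_odds)
print(f"raw edge: {raw_edge:+.3f}, calibrated edge: {calibrated_edge:+.3f}")
```

With these made-up inputs the raw probability suggests a positive edge while the calibrated one sits just below break-even, which is the kind of sign flip that miscalibration can hide.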

This application uses Platt Scaling, which fits a logistic regression to the raw outputs of the main model, to correct for systematic bias and produce well-calibrated probabilities.
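The idea can be sketched in a few lines of dependency-free Python. This is not the application's implementation: the names `fit_platt` and `apply_platt`, the plain gradient-descent fit, and the sample scores are all assumptions made for illustration. It treats the raw output as a single score per game and learns two parameters, `a` and `b`, so that the calibrated probability is sigmoid(a * score + b). Platt's original method also smooths the 0/1 targets slightly, which is omitted here.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, outcomes, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, outcomes):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def apply_platt(score, a, b):
    """Map a raw score to a calibrated probability."""
    return sigmoid(a * score + b)


# Hypothetical raw scores (e.g. a rating differential) and 0/1 outcomes.
scores = [-6.0, -4.5, -3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0, 4.5, 6.0]
outcomes = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

a, b = fit_platt(scores, outcomes)
calibrated = [apply_platt(s, a, b) for s in scores]
print(f"a={a:.3f}, b={b:.3f}")
```

In practice the parameters should be fitted on games the main model did not train on, otherwise the calibration step inherits the main model's overfitting.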

Reliability Diagram (2024–2025 Track Record)

This chart plots the model's predicted probability (x-axis) against the actual win frequency (y-axis). A perfectly calibrated model would follow the dashed diagonal line.

Bin 30-40%: predicted 37.9%, actual 41.2%, n=34
Bin 40-50%: predicted 43.7%, actual 63.5%, n=252
Axes: Predicted Probability (x), Actual Frequency (y)
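The per-bin figures behind a chart like this can be computed as in the following sketch. It assumes equal-width bins and a list of 0/1 outcomes; the function name `reliability_bins` and the sample data are hypothetical and are not the site's own code or results.

```python
def reliability_bins(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins and summarize each non-empty bin."""
    # Per bin: [sum of predicted probabilities, sum of wins, count]
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    rows = []
    for idx, (p_sum, y_sum, count) in enumerate(sums):
        if count:
            rows.append({
                "bin": (idx / n_bins, (idx + 1) / n_bins),
                "predicted": p_sum / count,  # mean predicted probability (x-axis)
                "actual": y_sum / count,     # observed win frequency (y-axis)
                "n": count,
            })
    return rows


# Hypothetical predictions and outcomes, for illustration only.
probs = [0.35, 0.38, 0.36, 0.42, 0.45, 0.48, 0.44]
outcomes = [0, 1, 0, 1, 0, 1, 1]

rows = reliability_bins(probs, outcomes)
for row in rows:
    low, high = row["bin"]
    print(f"Bin {low:.0%}-{high:.0%}: predicted {row['predicted']:.1%}, "
          f"actual {row['actual']:.1%}, n={row['n']}")
```

Bins with a small n deserve caution, since the actual frequency in such a bin is a noisy estimate.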

Further Research: Isotonic Regression

While Platt Scaling is effective, especially when the calibration curve is sigmoidal, another powerful technique is Isotonic Regression.

Unlike Platt Scaling, which assumes a specific logistic function, Isotonic Regression is a non-parametric method. It finds the best-fitting monotonically non-decreasing function to map raw probabilities to calibrated ones. This allows it to fit more complex calibration curves without being constrained to a sigmoid shape.
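A minimal sketch of the fitting step, using the Pool Adjacent Violators algorithm, is shown below. The names `fit_isotonic` and `apply_isotonic` and the sample data are hypothetical. The sketch assumes distinct raw probabilities and returns a step function, whereas library implementations typically handle ties and interpolate between fitted points.

```python
def fit_isotonic(raw_probs, outcomes):
    """Pool Adjacent Violators: least-squares non-decreasing fit.

    Returns a list of (largest raw prob in block, fitted value) steps.
    """
    blocks = []  # each block: [sum of outcomes, count, largest raw prob]
    for x, y in sorted(zip(raw_probs, outcomes)):
        blocks.append([float(y), 1, x])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, right = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = right
    return [(right, total / count) for total, count, right in blocks]


def apply_isotonic(raw_prob, steps):
    """Step-function lookup; inputs past the last threshold get the last value."""
    for right, value in steps:
        if raw_prob <= right:
            return value
    return steps[-1][1]


# Hypothetical raw probabilities and 0/1 outcomes, for illustration only.
raw_probs = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
outcomes = [0, 1, 0, 0, 1, 0, 1, 1]

steps = fit_isotonic(raw_probs, outcomes)
fitted = [value for _, value in steps]
```

Because the fit is so flexible, it tends to need more data than Platt Scaling; on a sample this small the fitted values reach 0 and 1 at the extremes, which is a sign of overfitting rather than of genuine certainty.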

For more rigorous research, comparing Platt Scaling against Isotonic Regression on your specific dataset, with both scored on held-out games, would be a valuable exercise. Scikit-learn provides robust implementations of both, which makes such a comparison straightforward.
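One way to score such a comparison is with proper scoring rules evaluated on held-out games. The sketch below implements the Brier score and log loss without dependencies; scikit-learn offers equivalents as `brier_score_loss` and `log_loss` in `sklearn.metrics`. The outcome list and both probability lists are hypothetical placeholders, not results from either method.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-12):
    """Mean negative log-likelihood; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Hypothetical held-out outcomes and calibrated probabilities (placeholders only).
holdout_outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
platt_probs = [0.64, 0.41, 0.58, 0.71, 0.35, 0.52, 0.66, 0.44]
isotonic_probs = [0.67, 0.33, 0.50, 1.00, 0.33, 0.50, 0.67, 0.33]

for name, probs in [("platt", platt_probs), ("isotonic", isotonic_probs)]:
    print(name,
          round(brier_score(probs, holdout_outcomes), 4),
          round(log_loss(probs, holdout_outcomes), 4))
```

Lower is better for both metrics. Both calibrators should be fitted on one set of games and scored on another, since isotonic regression in particular can look better than it is when evaluated on its own training data.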
