As a lifelong fan of the classic trivia game show Jeopardy! and a data analyst with an eye for prediction and modeling, I see the Tournament of Champions as a unique opportunity in the Jeopardy! world: a chance to analyze the past performance of the season's best players and predict the outcomes when they go head-to-head.
Players in the tournament are seeded by the number of consecutive games won and the total money earned during their streak. These stats are certainly indicative of a player's skill, but the variance of the game led me to believe there was a better way to measure it. Jeopardy! keeps a detailed box score database with a number of key statistics, from which I was able to gather all the data needed to build my model. I initially intended to automate this with a scraping tool, but given the small amount of data and the modifications needed for my custom stats, I found it more effective to compile and clean the data manually.
I began by asking the question: what makes a Jeopardy! player great? The ideal Jeopardy! contestant starts with a wealth of knowledge across a variety of subjects. But it does not end there, of course; many other factors drive success: skill with the buzzer, the ability to find Daily Doubles and wager boldly on them, and the ability to nail the toughest questions in Final Jeopardy.
Taking these factors into consideration and looking at the available Jeopardy! stats, I compiled the list of stats used in the model. Most were immediately available from Jeopardy! box scores, some I modified to eliminate bias, and three are key stats I derived from the available data. The full list, with explanations of the modifications, is below:
- Games Won – Simply the total number of games won by the player.
- Winnings Per Game – The player's average total winnings for a game. I modified this from total winnings to create a rate-based stat, eliminating the bias of longer streaks versus shorter ones.
- Correct Answer Percentage – The percent of the time the player was correct when successfully buzzing in.
- Final Jeopardy Correct Rate – The percent of the time the player got the Final Jeopardy answer correct.
- Attempt Rate – The first of my three derivative stats, and the one I believe is the key to this model. My hypothesis is that any time the player attempted to buzz in, they likely knew the answer, regardless of whether or not they won the buzz. By converting the number of attempts listed in the box score to a percentage of the total number of clues and averaging it on a per-game basis, I was able to create an assessment of the player's overall knowledge (see the sketch just after this list).
- Buzz Percentage – The percent of the time the player’s buzzer attempt was successful, used as the key indicator of buzzer skill.
- Correct Answers Per Game – An average number of clues answered correctly per game, indicating a combination of knowledge and buzzer skill.
- Daily Double Percentage – The percentage of the time the player correctly answered Daily Doubles.
- Daily Double Leverage – Another derivative stat, this combines the player’s wagers and game situations to give an overall indicator of DD hunting and wagering skill.
- Runaway Rate – My final derivative stat, based on the scores going into Final Jeopardy. It calculates the percent of the time the player led by such a large margin that they could not be caught in Final Jeopardy, indicating an especially dominant performance.
- Lost to ToC Player – An indicator giving a small boost to a player whose streak was ended by another Tournament of Champions qualifier, on the assumption that a player defeated by another very skilled player is more skilled than someone who lost to a less-skilled player.
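To make the derivative stats concrete, here is a simplified sketch of how they can be computed from per-game box scores. The column names and toy numbers are illustrative stand-ins for the real data, and the Daily Double Leverage shown here (wager as a share of the player's score when the clue was found, one DD per game) is a simplified version of the full leverage calculation.
library(dplyr)
# Toy per-game box scores; real data has one row per player per game
game_stats <- tibble(
  player         = c("A", "A", "B"),
  attempts       = c(38, 41, 29),
  total_clues    = c(57, 57, 57),
  dd_wager       = c(5000, 2000, 1000),
  score_at_dd    = c(7200, 9800, 4600),
  score_pre_fj   = c(21000, 18400, 12000),
  opp_max_pre_fj = c(9800, 10200, 8000)
)
player_stats <- game_stats %>%
  mutate(
    attempt_rate = attempts / total_clues,            # share of clues attempted
    dd_leverage  = dd_wager / score_at_dd,            # wager boldness vs. current score
    runaway      = score_pre_fj > 2 * opp_max_pre_fj  # uncatchable going into FJ
  ) %>%
  group_by(player) %>%
  summarise(
    attempt_rate = mean(attempt_rate),
    dd_leverage  = mean(dd_leverage),
    runaway_rate = mean(runaway)  # TRUE/FALSE averages to a rate
  )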
Using R, I wrote a script to combine these stats into an overall Power Rank for each player. I then used that number to calculate each player's likelihood of winning their specific Tournament of Champions matchup. In addition, I ran 10,000 simulations of the tournament as a whole to determine each player's chances of winning the entire tournament.
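There are a number of ways to turn ratings into matchup probabilities; the sketch below shows the general idea with a softmax-style weighting and a Monte Carlo loop for a single three-player game. The weighting function, the exponent k, and the 30.0 placeholder rating are illustrative assumptions, not the exact formula from my script; the full tournament simulation repeats this game by game through the bracket.
# Softmax-style mapping from Power Ranks to game win probabilities;
# k controls how strongly a rating edge becomes a win-probability edge
game_win_prob <- function(ranks, k = 0.1) {
  w <- exp(k * ranks)
  w / sum(w)
}
# Draw one game winner according to those probabilities
simulate_game <- function(players, ranks, k = 0.1) {
  sample(players, size = 1, prob = game_win_prob(ranks, k))
}
# Example: a Game 1 style matchup (the third rating is a placeholder)
set.seed(42)
players <- c("Devlin", "Starnes", "Levine")
ranks   <- c(62.2, 45.3, 30.0)
winners <- replicate(10000, simulate_game(players, ranks))
prop.table(table(winners))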
Results:
Game 1 – This game offered a very intriguing matchup right off the bat. The Power Rankings had Aaron Levine as one of the weaker players in the tournament, but identified Tom Devlin as very skilled: 3rd overall with a score of 62.2, compared to Liam Starnes, who checked in 13th at 45.3. Starnes was the 5th seed overall in the tournament, with a 6-game streak to Devlin's 3 and significantly more total earnings, and thus would appear to be the favorite, but my model gave Devlin a whopping 85.7% chance to win. Devlin indeed made good, winning the game in a runaway.
Game 2 – We had another big mismatch here. Andrew Hayes came in as the top-seeded player in the game (4th seed overall) on the strength of his 6-game win streak and more than $118,000 in total earnings. However, 4-game winner Allegra Kuney had better underlying stats, and my model gave her a 49.1% chance to win to Hayes' 34.9%. Kuney was victorious, again with a dominating runaway victory.
Game 3 – This was the first game my model missed, but rather than variance, I believe this miss came from a flawed assumption that I can correct. Ben Ganger was the top-ranked player in the game by my model, 8th in my Power Ranks and carrying a 54.8% chance to win. He was also the favorite based on win streak and earnings. He did not win: Cameron Berry, the Second Chance Wildcard winner, emerged victorious. Berry was 14th in my Power Ranks, but I believe my model did not accurately capture his skill. Unlike every other player in the tourney, Berry qualified by winning the Second Chance Wildcard tournament. I treated the stats from his play in that tournament the same as the regular winners' win streaks; however, as die-hard Jeopardy! fans know, tournament play is a different beast from "regular season" Jeopardy!. Berry faced tougher competition and tougher questions during his run through the Wildcard tourney, and his victory here suggests it takes more skill to win a post-season tourney than to build a regular-season win streak. Future versions of this model will add a boost to stats accumulated during tournament play. I believe that if my model had accounted for this skill boost, this game would have been a coin flip between Berry and Ganger.
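As a sketch of what that boost could look like, slotted into the Power Rank script at the end of this post: qualified_via_tournament is a hypothetical 0/1 column flagging players whose stats came from tournament play, and the size of toc_bonus is an illustrative placeholder that would need tuning, not a fitted value.
library(dplyr)
toc_bonus <- 0.05  # flat bump to the raw power rank for tournament qualifiers
df_ranked <- df_ranked %>%
  mutate(
    power_rank_raw = power_rank_raw + toc_bonus * qualified_via_tournament
  )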
Game 4 – My model also missed this game, but this one was pure variance. Alex DeFrank was 4th in my overall Power Rankings and carried a 63.5% chance of winning. He was dominant for the vast majority of the game, but absolutely crumbled in the final moments, losing two huge Daily Double bets and flailing with a number of incorrect answers on the final clues. He then bet it all on an incorrect Final Jeopardy answer, while winner Ashley Chan simply stayed silent and wagered nothing on FJ to secure her victory. DeFrank was clearly the better player, as my model indicated, but a vicious combination of bad luck and a meltdown unlike any I've seen on the Alex Trebek stage cost him the victory.
Game 5 – The first game without a clear favorite. Three very evenly-matched players competed here, with Matt Massie getting a 37.8% chance of winning, Steven Olson with 36.6%, and Josh Weikert with 25.5%. Olson was the winner in a close game, and while technically my model gave Massie an extra 1.2% chance of winning, what it really predicted was a very close game with either Olson or Massie taking the win, which is exactly what happened.
Game 6 – This game featured the weakest field so far. TJ Fisher, 11th in my Power Ranks, came in as the favorite with a 64% chance to win from my model; the other two contestants were in the bottom 5 of my Power Ranks. Fisher also would have been a strong favorite based on the surface stats, so this prediction is not particularly impressive, but he won the game handily as expected.
Game 7 – The first game of the semi-final round featured Scott Riccardi, the clear overall favorite based on his massive 16-game win streak, which put him inside the top 20 on the all-time Jeopardy! money list. Riccardi won the entire tournament in just over half of my full-tournament simulations, and while he faced very strong competition from Tom Devlin and Allegra Kuney based on my Power Ranks, he carried a 56.3% chance to win into the game and delivered with the win.
Game 8 – Laura Faddah was the #2 seed overall in the tourney on the strength of her 8-game winning streak and received a first-round bye, but my model had her ranked last among the "real" players based on the underlying stats. The model gave her just a 12.9% chance to win, making it basically a toss-up between Steven Olson (44.9%) and TJ Fisher (42.3%). Fisher pulled off the victory, running slightly contrary to the model, but again, the model predicted a close game between Olson and Fisher with Faddah out of the running, which is exactly what happened.
Game 9 – Paolo Pasco was the 3rd overall seed, earning the final first round bye, and he was 2nd overall in my Power Rankings behind Riccardi. Pasco would have been the easy favorite based on surface stats, and my model also had him as a prohibitive favorite here, bringing a 63.8% chance to win into the game. He made good on his status, winning easily to send himself to the finals.
Finals – The ToC finals are a series of games between the three finalists, with the first player to win 3 games taking the title. My pre-tourney simulations had the top Power Ranked player, Scott Riccardi, winning it all 51% of the time, with Paolo Pasco winning 26.7% of the time. For each individual finals game, Riccardi had a 54.1% chance to win, compared to Pasco's 31.8% and Fisher's 14.1%. Whether Riccardi was rusty from the long layoff or Pasco was simply better prepared, Pasco was absolutely dominant, finishing the series in the minimum 3 games. While my model had Riccardi as the likely champ, it gave plenty of respect to Pasco as well, assigning him by far the best chance of any player to knock Riccardi off.
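For anyone who wants to check the series math, the sketch below simulates the first-to-three-wins format from the per-game probabilities above, assuming each game is independent. Note that this gives championship odds conditional on reaching the finals, so the numbers come out higher than the pre-tourney simulation figures, which also had to account for the earlier rounds.
set.seed(7)
players <- c("Riccardi", "Pasco", "Fisher")
p_game  <- c(0.541, 0.318, 0.141)  # per-game win chances from the model
# Play games until someone reaches three wins, then return the champion
simulate_series <- function() {
  wins <- setNames(c(0, 0, 0), players)
  while (max(wins) < 3) {
    w <- sample(players, 1, prob = p_game)
    wins[w] <- wins[w] + 1
  }
  names(which.max(wins))
}
champs <- replicate(10000, simulate_series())
prop.table(table(champs))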
Conclusions – This model was very accurate. Outside of the Finals, it predicted 9 games: 2 were toss-ups, and the model went 5-2 on the remaining 7, including calling multiple upsets. It was not perfect, but these are, of course, probabilities for outcomes determined by the specific questions, a bit of luck, and, as we learned from Alex DeFrank, a human being under enormous pressure from both the TV cameras and the large sum of cash at stake. Modeling can find trends and stats, but a player's ability to handle the bright lights will always be a wild card. Given the model's ability to correctly identify skilled players who might be overlooked or undervalued by Jeopardy!'s seeding methodology and the more rudimentary surface statistics, it has significant value in digging below the surface to find what really makes a Jeopardy! champion great.
See below for the full R Power Ranking script:
#############################
# Jeopardy Power Rank Script
#############################
# Load required libraries
library(dplyr)
library(readr)
#############################
# 1. READ INPUT DATA
#############################
df <- read_csv("player_stats.csv")  # one row per player with the raw stats above
#############################
# 2. HELPER FUNCTIONS
#############################
# Z-score normalization
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}
# Normalize percent columns (handles 0–1 or 0–100)
normalize_pct <- function(x) {
  if (max(x, na.rm = TRUE) > 1) x / 100 else x
}
#############################
# 3. NORMALIZE & STANDARDIZE
#############################
# Build the standardized stat columns used by the power rank below. The raw
# column names here are stand-ins for however the input sheet labels them.
df_norm <- df %>%
  mutate(
    games_won_n   = z_score(games_won),
    winnings_pg_n = z_score(winnings_pg),
    accuracy_n    = z_score(normalize_pct(accuracy)),
    fj_n          = z_score(normalize_pct(fj_rate)),
    attempt_n     = z_score(normalize_pct(attempt_rate)),
    buzz_n        = z_score(normalize_pct(buzz_pct)),
    correct_pg_n  = z_score(correct_pg),
    dd_accuracy_n = z_score(normalize_pct(dd_accuracy)),
    dd_leverage_n = z_score(dd_leverage),
    runaway_n     = z_score(normalize_pct(runaway_rate)),
    lost_to_toc_n = lost_to_toc  # 0/1 indicator, used as-is
  )
#############################
# 4. CALCULATE POWER RANK
#############################
df_ranked <- df_norm %>%
  mutate(
    # Weighted sum of the standardized stats (weights sum to 1.00)
    power_rank_raw =
      0.09 * games_won_n +
      0.13 * winnings_pg_n +
      0.15 * accuracy_n +
      0.08 * fj_n +
      0.15 * attempt_n +
      0.05 * buzz_n +
      0.10 * correct_pg_n +
      0.07 * dd_accuracy_n +
      0.09 * dd_leverage_n +
      0.06 * runaway_n +
      0.03 * lost_to_toc_n
  )
#############################
# 5. RESCALE TO A 50-CENTERED SCORE
#############################
df_ranked <- df_ranked %>%
  mutate(
    # Center at 50 with SD 10 (T-score style), matching the scores quoted above
    power_rank = 50 + 10 * z_score(power_rank_raw)
  )
#############################
# 6. FINAL SORTED OUTPUT
#############################
final_rankings <- df_ranked %>%
  select(player, power_rank) %>%
  arrange(desc(power_rank))
print(final_rankings)