The Virginia Tech CFP Selection Show
Only in the world of statistical modelling can these games happen
One of the reasons college football is so fun to follow and discuss is that some of its most basic, and interesting, questions are impossible to answer. At most, the National Champion in a given year plays 15 games. In years past, that number was lower. The College Football Playoff was a step in the right direction, but even when it expands, so many questions will remain unanswerable.
For example, can anyone name, with certainty, the best college football team of all-time? 2019 LSU looked unstoppable. So did 1995 Nebraska. Who would win a matchup of the two? We’ll never know!
But thanks to the wealth of data now tracked and easily accessible (with some programming skills), we can build models that predict how such games would go, at least among teams from the last 20 years.
As always in the world of analytics, the availability of data dictates the conclusions we can, and cannot, draw. For the purposes of this exercise, which aims to crown the best Virginia Tech football team since the 1999 National Championship team, data availability is sufficient to support an analysis of the last 20 Virginia Tech teams. That means, unfortunately, the Michael Vick-led 2000 team that went 11-1 and finished #6 in the final AP poll, as well as the 2001 and 2002 teams, will be excluded. Still, that leaves nine 10-win teams, and a few others that came close.
The Format
The Virginia Tech CFP will hew closely to the format the CFP is scheduled to adopt in 2024. That means 12 teams (or years of VT teams, in this case) will make it to the playoff. The top 4 teams will get a bye, and rounds one and two will be played on campus (with the home team getting the full Lane Stadium statistical advantage). The semi-finals and finals will be played at a neutral site.
The Big Reveal
I went into this exercise with a pretty good idea of which teams would make the cut. In fact, I thought about just making a subjective list. However, that would run counter to the mission of Hokie Analytics, which is to quantify and analyze all things Virginia Tech football. Therefore, I settled on a BCS-style set of computer rankings, each taking a slightly different perspective on what makes for a “great team”. Those metrics are:
Football Power Index (FPI) - a predictive rating system developed by ESPN that measures team strength and uses it to forecast game and season results (created in 2005)
SP+ - a tempo- and opponent-adjusted measure of college football efficiency created by Bill Connelly (ESPN)
Elo - a rating system originally developed for chess players that rates each team based on its wins and losses
Simple Rating System (SRS) - a rating that takes into account average point differential and strength of schedule; the rating is denominated in points above/below average, where zero is average
Each metric has its unique strengths and weaknesses, so for the sake of simplicity, I weighted the rank of each Virginia Tech team’s rating an equal 25% across the board. So, let’s say that the 2022 team’s ratings, when compared to the other 19 VT teams’, were: 20, 18, 18, and 20. That would result in an average ranking of 19. Wherever that average ranking fell in comparison to the other teams would be the seeding for the 2022 team, e.g., if every other team had a numerically lower average ranking, then the 2022 team would be seeded 20th, which in a 12-team field means they stay home for the holidays.
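The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the actual seedings: only the 2022 ranks come from the example in the text, the other two rows are made-up placeholders, and the function names and the tie-break rule (earlier year first) are assumptions, since the article does not say how ties were handled.

```python
from statistics import mean

# Per-metric ranks, where 1 is the best among the teams compared.
# Only the 2022 row comes from the article's example; the 2005 and 2013
# rows are hypothetical placeholders so the sort has something to order.
ranks = {
    2022: {"FPI": 20, "SP+": 18, "Elo": 18, "SRS": 20},
    2005: {"FPI": 1, "SP+": 2, "Elo": 1, "SRS": 2},
    2013: {"FPI": 13, "SP+": 12, "Elo": 14, "SRS": 13},
}


def average_rank(metric_ranks):
    """Equal-weight (25% each) average of a team's rank across the four metrics."""
    return mean(metric_ranks.values())


def seed_teams(all_ranks):
    """Map each team to a seed, lowest average rank first.

    Ties are broken by year (earlier first); that rule is an assumption.
    """
    ordered = sorted(all_ranks, key=lambda year: (average_rank(all_ranks[year]), year))
    return {year: seed for seed, year in enumerate(ordered, start=1)}


seeds = seed_teams(ranks)
print(average_rank(ranks[2022]))  # (20 + 18 + 18 + 20) / 4
print(seeds)
```

With all 20 teams loaded in place of the three shown here, the top 12 entries of the resulting ordering would form the playoff field.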
The top 4 Seeds (bye in the first round, home field advantage in the second round)
The following teams, listed in order of seeding, secured first round byes and will host quarterfinal matchups:
VT 2005 - ACC Coastal and Gator Bowl Champions, 11-2
VT 2009 - ACC Coastal Runners-up and Chick-fil-A Bowl Champions, 10-3
VT 2010 - ACC Champions, 11-3
VT 2004 - ACC Champions, 10-3
Seeds 5-8 (home field advantage in the first round)
VT 2007 - ACC Champions, 11-3
VT 2006 - ACC Coastal Runners-up, 10-3
VT 2016 - ACC Coastal and Belk Bowl Champions, 10-4
VT 2003 - 4th place in the Big East, 8-5
Seeds 9-12 (away team first round)
VT 2017 - ACC Coastal Runners-up, 9-4
VT 2011 - ACC Coastal Champions, 11-3
VT 2008 - ACC and Orange Bowl Champions, 10-4
VT 2015 - 4th place ACC Coastal and Independence Bowl Champions, 7-6
Teams that did not make the cut
VT 2013 - ACC Coastal Runners-up (tie), 8-5
VT 2020 - 8th place ACC, 5-6
VT 2014 - 5th place (tie) ACC Coastal and Military Bowl Champions, 7-6
VT 2012 - 4th place ACC Coastal and Russell Athletic Bowl Champions, 7-6
VT 2019 - ACC Coastal Runners-up, 8-5
VT 2018 - 3rd place (tie) ACC Coastal, 6-7
VT 2021 - 3rd place (tie) ACC Coastal, 6-7
VT 2022 - 6th place (tie) ACC Coastal, 3-8
Controversy
Fans of the 2013 team are already up in arms about their snub. Yes, they were ranked as high as 13th nationally at one point in the season, but they faded in the second half of the season and got blasted by UCLA in the bowl game. There was also a stunning lack of talent on offense, especially at wide receiver.
If any team’s outrage is justified, it has to be the 2019 Hokies. That team started slow, then almost came all the way back to win the Coastal, which would have merited an Orange Bowl berth because Clemson made the playoff that year. I thought they were a dark horse to make the VT CFP, and I could have seen it going either way, but 17th place out of 20? Ouch!
Finally, I do want to offer some comment about the most accomplished team in the field - the 2008 squad. That team, though (severely) offensively challenged, won both the ACC Championship and the Orange Bowl, making it the only VT team in history to win its conference outright and win an Alliance/BCS/New Year’s Six bowl (the 1995 team was a co-Big East Champion and received the Sugar Bowl berth because the other co-Champion, Miami, was ineligible for postseason play due to NCAA rules violations). That 2008 team barely snuck into the Virginia Tech CFP with an 11 seed. I know 2008 was a ridiculously down year for the ACC, and that team was young and at times struggled mightily to move the ball, but this seems extreme.
Ok, without further ado, here are the final playoff rankings from each of the metrics:
Looking Ahead to First Round Games
I am still building out what I believe to be a much more intricate model than the one I used during the 2022 regular season. Using a host of advanced statistics, I seek to accurately predict not only scores, but also standard statistics to help describe how the game went. Did one team rack up a bunch of rushing yards or live in the opposing team’s backfield? I want to know! The point of this exercise is to quantify not just what would happen, but also how it would likely happen. The date of play for first round games is TBD - perhaps next Friday, but if not, hopefully the week after. It depends on how quickly I can finish constructing the model and how crazy things get in the transfer portal (therefore necessitating some analysis and an interlude article). There are four rounds of games, and I am hoping to run them over the four-week bowl season as a fun distraction from this second postseason in three years without VT football.
Here is an early look at the first round matchups:
#12 VT 2015 (7-6) at #5 VT 2007 (11-3)
#11 VT 2008 (10-4) at #6 VT 2006 (10-3)
#10 VT 2011 (11-3) at #7 VT 2016 (10-4)
#9 VT 2017 (9-4) at #8 VT 2003 (8-5)
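The pairings above follow a simple rule: seeds 1 through 4 sit out, and each home seed from 5 through 8 hosts the seed that adds up to 17 with it. Here is a short Python sketch of that rule, using the seed order from this article; the function and variable names are my own for illustration.

```python
# Teams listed in seed order (index 0 is the #1 seed), taken from the seedings above.
SEEDED_TEAMS = [2005, 2009, 2010, 2004, 2007, 2006, 2016, 2003, 2017, 2011, 2008, 2015]


def first_round_matchups(seeded_teams):
    """Return (away_seed, away_team, home_seed, home_team) for each first-round game.

    Seeds 1-4 have byes; seed s (5 <= s <= 8) hosts seed 17 - s.
    """
    if len(seeded_teams) != 12:
        raise ValueError("expected exactly 12 seeded teams")
    matchups = []
    for home_seed in range(5, 9):
        away_seed = 17 - home_seed
        matchups.append(
            (away_seed, seeded_teams[away_seed - 1], home_seed, seeded_teams[home_seed - 1])
        )
    return matchups


for away_seed, away, home_seed, home in first_round_matchups(SEEDED_TEAMS):
    print(f"#{away_seed} VT {away} at #{home_seed} VT {home}")
```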
Alternates Policy
Since I have not yet finished building the model, I cannot say for sure that all of the seeded teams will have enough data to play. The 2003 team’s lack of standard season stats is an ominous sign. All data are sourced from collegefootballdata.com or espn.com via the R API wrapper cfbfastR. If the datasets used to power the model are not sufficient for any seeded team, that team will be replaced by the highest ranked team currently outside of the playoff. There is no reseeding. For example, if the 2003 team is a no-go, it would be replaced by the 2013 team, which would step into the #8 seed and play the #9 seeded 2017 team. In reality, I think the only teams at risk of not being able to play are 2003, 2004, and 2020 (there is a bit of weirdness in some datasets due to Covid).
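To make the no-reseeding rule concrete, here is a rough Python sketch of the replacement logic. The seed and alternate orders come from the lists in this article; the function name is hypothetical, and the handling of several missing teams at once (walk the bracket in seed order, give each open slot the best remaining alternate that has data) is an assumption, since the policy above only spells out the single-replacement case.

```python
# Seed order and alternate order as listed in this article.
SEEDED_TEAMS = [2005, 2009, 2010, 2004, 2007, 2006, 2016, 2003, 2017, 2011, 2008, 2015]
ALTERNATES = [2013, 2020, 2014, 2012, 2019, 2018, 2021, 2022]


def apply_alternates(seeded_teams, alternates, unavailable):
    """Swap unavailable teams for alternates while keeping every seed slot fixed.

    Assumption: open slots are filled in seed order, each taking the
    highest-ranked alternate that is itself available.
    """
    pool = [team for team in alternates if team not in unavailable]
    bracket = []
    for team in seeded_teams:
        if team in unavailable:
            if not pool:
                raise ValueError("not enough alternates with usable data")
            bracket.append(pool.pop(0))  # replacement inherits the vacated seed
        else:
            bracket.append(team)
    return bracket


# The article's example: 2003 drops out, 2013 steps into the #8 seed.
bracket = apply_alternates(SEEDED_TEAMS, ALTERNATES, unavailable={2003})
print(f"#8 seed: VT {bracket[7]}")
```

Because the other eleven slots are untouched, the #9 seeded 2017 team still travels to face whoever holds the #8 seed.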