An Interview with
Colley's Bias Free College Football
2005 (begun in 2002)
YL: What are the most important factors in rating
Wes: For the more typical team sports, such as
football and basketball, you can try to use score or something, or even
secondary stats such as scoring defense, but I tend to believe that winning is
the name of the game, and therefore stick with wins and losses as the basis of
Wins and losses alone work very well for rating
professional teams, because the number of games is large (relative to the number
of teams), and the level of the teams is quite similar. In college sports, those
two criteria are not nearly met, and a strength of schedule component MUST be
YL: Do you have any plans to rate the NFL this way?
Wes: I have done so, and essentially, it's only interesting in the early
part of the season. By the later part of the season, the records tell you
everything you need to know. The reason is that you have very strong
parity, and only 32 teams which play 16 games, as opposed to the college
situation in which you have very little parity (between, say, USC and Buffalo),
with 117 teams playing 11 games. In short, strength-of-schedule just isn't
that big a correction in the NFL.
I do run my rankings for college basketball, where strength-of-schedule is a big
factor. My system outpicked the committee in the recently completed NCAA
Tournament. There were 19 upsets by seed (a lower seed beating a higher seed), but only 15
by my rankings. (http://www.colleyrankings.com/hoops0405/bracket.html)
YL: Do you think the Strength of Schedule component
is a bit overplayed since ratings and rankings already include them and the
polls certainly consider how strong a team's opponents are?
Wes: The strength of schedule
"column" was removed from the BCS for the 2004 season. The system was
simply (AP+Coaches+[Computer Avg.])/3. It's slightly more complicated than that,
but that's the idea. No "tweak" columns were in place.
YL: I guess I lost track of what the BCS was doing.
They actually did something to improve it and it went right past me. I guess
I've gotten to the point where I try to ignore them. Now I'll have to watch them
before last, people blamed the ratings. Last season the redundant
schedule strength was removed and they blamed the polls. See how much better it is?
Wes: The AP pulled out, via a strange cease-and-desist letter to the
BCS that says, essentially, "in order for us to complain about the BCS with
impunity, please remove our rankings from the BCS." So now you're
down to the coaches and computers. I think the BCS folks meet this month,
so we'll see what happens.
Frankly, I thought the BCS ranking system worked perfectly last season. Southern
Cal was obviously the best team and won the title. Auburn got a Sugar Bowl
trophy. Until you start talking about a 4-team playoff or something, the
BCS could not have done any better in seeding the 2-team playoff the bowl system
YL: I guess we'll never all be happy until they all have to play each
other. Is your system predictive
Wes: Well, retrodictive. I've used the words
"it hindcasts rather than forecasts," but "retrodictive" is
YL: Retrodictive has somehow become the chosen term - I once
read that the guy who introduced it to sports ratings used it because he
couldn't come up with a better term at the moment.
When and why did you begin to rate teams?
Wes: When and why:
When: 1998, the first year of the BCS.
Why: a very strong interest in college football... In 1998
when the BCS came around, I thought most of the computer rankings were
unnecessarily complicated and clouded in mystery, so I decided to create a
simple, open system, which people could understand and check. I particularly
wanted one with no conference weighting factors, or pre-season information. I'm
delighted that some people have gone to the trouble of checking my system and
verifying its results.
YL: What is the future of computer ratings and the BCS?
Wes: Good question. I basically have no idea. Every
year there seems to be some misplaced pre-bowl outrage, then some hindsight
post-bowl outrage, frequently mutually exclusive to the former, but at the end
of the day, agreement that the correct national champion was crowned. As long as
that's the case, I doubt the BCS will go away.
YL: Do you have favorite teams or sports?
Wes: I'm an alumnus of UVa, so it's not surprising
that UVa is my favorite team. I go to all the home football games and
usually an away game or two. I also follow UGa football very closely.
My brother played football there, so my family naturally became involved with
YL: I really like the concept of your system. I
developed practically the same thing once and I thought it was great. Then one
day Todd Beck sent me an email and said my ratings were in the BCS. It was then
that I discovered that my great, new concept wasn't new, after all. I
know from my own almost-identical system that a good team beating a lousy team seems to be
penalized and a lousy team losing to a good team seems to get rewarded. Do
you use Strength of Schedule to counteract that or do you think that good teams
should play good teams? Also, do you think, as I do, that using your rating
system as the standard would push teams into playing more logical opponents so
that they could get a better rating, thereby earning their position?
Wes: It's true that if an already good team beats an already lousy
team, the good team may suffer a small ratings decline and the lousy team
may receive a small ratings bump. However, there are two important things
to keep in mind here:
1. That statement requires consideration of chronology, which my system is
explicitly designed to ignore. Namely, if the game in question had been
the first game for each team, that scenario would have been impossible, because
neither team would have been previously established as good or lousy; both would
have had an identical rating of 0.5, as do all teams before the first game.
My system looks at the season as a whole, irrespective of whether the game was
in August or November. Because of that, you can play a team that's so bad
that it makes a statement about your entire season's schedule strength, and
2. The teams have to be more than 0.5 apart in the previous week's
rankings, so that really only occurs when teams in the top 15 or so play teams
in the bottom 20 or so. And yes, you have to ask yourself why top 15 teams
are playing bottom 20 teams.
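The two points above can be illustrated with a minimal sketch. It assumes the publicly described Colley formulation, in which a team's rating is (1 + (wins - losses)/2 + sum of its opponents' ratings) / (2 + games played). The interview does not spell that formula out, and the function name `colley_ratings` and the simple repeated-sweep solver are illustrative choices, not Wes's actual code. Under this formula, one more win adds 0.5 plus the loser's rating to the numerator and 1 to the denominator, which is roughly why a win can lower a rating only when the loser sits more than 0.5 below the winner.

```python
from collections import defaultdict


def colley_ratings(teams, games, sweeps=500):
    """Rate teams from (winner, loser) pairs, using wins and losses only.

    Assumed formula (not quoted in the interview):
        rating = (1 + (wins - losses) / 2 + sum of opponents' ratings)
                 / (2 + games played)
    Every team appearing in `games` must also be listed in `teams`.
    """
    wins = defaultdict(int)
    losses = defaultdict(int)
    opponents = defaultdict(list)  # one entry per game played
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    # Every team starts at 0.5; with no games played it stays there.
    ratings = {team: 0.5 for team in teams}

    # Re-apply the formula until the ratings settle. Each rating depends on
    # the opponents' ratings, so the whole season is solved together and the
    # order in which games were played does not affect the result.
    for _ in range(sweeps):
        for team in teams:
            played = wins[team] + losses[team]
            opp_sum = sum(ratings[o] for o in opponents[team])
            net = (wins[team] - losses[team]) / 2
            ratings[team] = (1 + net + opp_sum) / (2 + played)
    return ratings


if __name__ == "__main__":
    # Before any games, both teams sit at 0.5.
    print(colley_ratings(["A", "B"], []))
    # After A beats B once: A is approximately 0.625, B approximately 0.375.
    print(colley_ratings(["A", "B"], [("A", "B")]))
```

Because the function takes only a list of results, reordering the games leaves the ratings unchanged, which matches the point that the system ignores whether a game was in August or November.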
YL: We can see that you are very thoughtful and analytical about
ranking teams. Is this something we would see in your profession? What keeps you
busy when you're not ranking teams?
Wes: I am a senior research scientist at the Virginia Modeling,
Analysis and Simulation Center of Old Dominion University, where I work mainly
on Department of Defense and Department of Homeland Security research.
YL: Where do you call home?
Wes: The Hampton Roads area of Virginia.
YL: What else can we get you to tell us about Wes Colley?
Wes: Come by my tailgate!