An Interview with Jeff Bihl
Jeff's Computer Ratings
NCAA Football and Basketball, NFL
2001-2002
YL:
Last year I tried to get Kenneth Massey to do a comparison page with college
basketball as he has been doing with college football. He wasn't sure he could
devote the time to it. Now I see that not only has he begun one but that you
have one, too. How long have you been tracking college basketball in this
format? Do you plan to expand to other sports?
Jeff: I
started the comparison earlier this year. I think only about a week or two
before Kenneth. I thought that a basketball comparison would be interesting and
I asked Kenneth if it was OK to use a similar format. I think that reminded him
he had been planning to set one up. I kept putting mine up because once the
parser is written it isn't hard to maintain. I might try another sport sometime.
The parser would be easy to convert. It's really only interesting if you have at
least 9 or 10 ratings and I think that college computer rankings are a lot more
interesting than pro sports for several reasons.
YL: I've
tried a different approach with my Superlists - ranking the teams with the
median rank and not an average. A lot of teams would be a little difficult for
me and wouldn't make enough difference to expend the effort. Different sports
team raters have different ideas of what the most important factors are. What do
you consider to be the most important factors?
Jeff: I
can see how one would say that average ranking is not the best way. I was going
to average team ratings instead but that leads to other problems such as how to
scale each system. I have an average of the average ranking of the teams in each
conference. That leads to even more problems. There may be more difference in
strength in Division 1 Basketball between team #1 and #30 than between team #90
and team #230 but the latter will have a much bigger effect on an average.
Rating would be a better option here if it were more feasible.
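The difference between averaging ranks and taking the median can be sketched in a few lines of Python. This is only an illustration: the team names and ranks are made up, and the `consensus` helper is hypothetical, not the code behind either comparison page.

```python
from statistics import mean, median

# Hypothetical ranks given to two teams by five rating systems (made-up data).
ranks = {
    "Team A": [1, 2, 2, 3, 40],  # four systems agree, one is an outlier
    "Team B": [4, 4, 5, 5, 6],
}

def consensus(ranks_by_team, combine):
    """Order teams by a combined rank; a lower combined rank is better."""
    combined = {team: combine(r) for team, r in ranks_by_team.items()}
    return sorted(combined, key=combined.get)

by_mean = consensus(ranks, mean)      # Team A averages 9.6, Team B averages 4.8
by_median = consensus(ranks, median)  # Team A's median is 2, Team B's is 5

print("Ordered by mean rank:  ", by_mean)
print("Ordered by median rank:", by_median)
```

A single outlying system is enough to drop Team A below Team B under the mean, while the median ignores it. The same sensitivity is what lets the gap between rank #90 and rank #230 dominate a conference average.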
I post results from two different systems with two different
purposes that work on vastly different criteria. One is a pure predictor and the
other is more of a retrodictor. The predictive criteria are not really
subjective but were achieved through testing to see which parameters resulted in
the smallest mean error on past data. The retrodictive system works by trying to
answer the question of how difficult a team's win/loss record was to achieve
against their schedule. The retrodictive system's criteria would be my main
criteria if I voted in a poll. It ranks teams based on which teams they defeated
and who defeated them instead of who would win an upcoming or hypothetical
matchup. These are two completely different things and that is why I have
two systems. The retrodictive system is a lot more complicated than the
predictor but it is not pure retrodiction
as its intention is not to reduce ranking violation percentage.
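The two yardsticks mentioned here, mean error for a predictor and ranking violation percentage for a retrodictor, can be sketched as follows. This is a rough illustration under stated assumptions: the games and ratings are invented, "mean error" is taken to be mean absolute error of the predicted margin, home-field advantage is ignored, and neither function is Jeff's actual code.

```python
# Each game is (winner, loser, margin of victory). All data is made up.
games = [("A", "B", 7), ("B", "C", 3), ("C", "A", 1)]

# Hypothetical ratings on a points scale, so that the difference between
# two ratings can be read as a predicted margin of victory.
rating = {"A": 10.0, "B": 5.0, "C": 2.0}

def mean_abs_error(games, rating):
    """Average absolute gap between predicted and actual margin."""
    errors = [abs((rating[w] - rating[l]) - margin) for w, l, margin in games]
    return sum(errors) / len(errors)

def violation_pct(games, rating):
    """Percentage of games in which the lower-rated team won."""
    violations = sum(1 for w, l, _ in games if rating[w] < rating[l])
    return 100.0 * violations / len(games)

mae = mean_abs_error(games, rating)
pct = violation_pct(games, rating)
print(f"mean absolute error: {mae:.2f} points")
print(f"ranking violations:  {pct:.1f}% of games")
```

A predictor would tune its parameters to shrink the first number on past data. A pure retrodictor would reorder teams to shrink the second, which is not the aim of the retrodictive system described above.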
YL: If
you had to narrow it down to one perfect system, what would be the most
important factors?
Jeff: I
can't really give a single answer to that. I don't think you have to say a system
is good or bad because it may be good at one thing and bad at another. If you
must judge a system you should judge it against the purpose for which it was
written. I scaled the rating of my retrodictive system so that it would best fit
a projected margin of victory but the system would make a very poor predictor if
it were actually used for that. For a system that I would use for
the purpose of the BCS, I would say wins and losses and strength of
schedule. I think strength of schedule for this purpose can only be evaluated on
a game-by-game basis. Averaging the strength of the teams on the schedule is
oversimplifying. Schedule strength can be relative with respect to the strength
of the team to whom the schedule belongs based on how the individual strengths
of the teams on the schedule break down.
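One way to see why an averaged schedule strength can mislead is to look at how hard a given record is to achieve game by game. The sketch below is an assumption-laden illustration, not Jeff's method: the per-game win probabilities are invented, games are treated as independent, and `prob_at_least` is a hypothetical helper.

```python
def prob_at_least(wins_needed, win_probs):
    """Probability of winning at least `wins_needed` games, given an
    independent win probability for each game on the schedule."""
    dist = [1.0]  # dist[k] = probability of exactly k wins so far
    for p in win_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1 - p)   # lose this game
            nxt[k + 1] += q * p     # win this game
        dist = nxt
    return sum(dist[wins_needed:])

# Two hypothetical schedules for the same team, both averaging a 0.70
# chance of winning per game.
balanced = [0.70, 0.70]  # two medium opponents
split = [0.95, 0.45]     # one weak opponent and one strong opponent

undefeated_balanced = prob_at_least(2, balanced)  # 0.70 * 0.70
undefeated_split = prob_at_least(2, split)        # 0.95 * 0.45
print(f"2-0 vs balanced: {undefeated_balanced:.4f}")
print(f"2-0 vs split:    {undefeated_split:.4f}")
print(f"at least 1-1 vs balanced: {prob_at_least(1, balanced):.4f}")
print(f"at least 1-1 vs split:    {prob_at_least(1, split):.4f}")
```

With these made-up numbers, going undefeated is harder against the split schedule, while going at least 1-1 is easier against it, even though the two schedules have the same average. Which schedule counts as tougher depends on the record being judged and on the strength of the team playing it.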
YL: What
do you see as the future of computer ratings?
Jeff: The
public still has not accepted computer rankings. For that to happen the press
will have to become friendlier to them and the public will have to stop applying
computer rankings for a purpose for which they were not written. I will use the
general predictive vs. retrodictive comparison as an example. There may be a
hard luck team with several close losses and several blowout wins that may
really be the "best team" even though there are teams out there with
better records against similar schedule strength. If someone approaches a
predictive ranking with the same mindset as they do the major media polls their
opinion of computer systems in general will be soured. They probably either
consciously or subconsciously recognize that "best team" and
"most deserving team" are not one and the same but if they approach
the ranking with the wrong mindset they won't accept it. I remember Florida
being favored to win the Sugar Bowl over Florida St. at the end of the 1996
season even though Florida St. was undefeated and had already defeated Florida.
Obviously, for Florida to be favored more than half the public believed they
would win. Nobody complained that was unjust. However, at the same time many
people would have found a ranking where Florida had a #1 next to their name
prior to the rematch totally unacceptable. The concept of #1 already has a
preset meaning. The reverse is true but less likely. If a retrodictive type
system were used to try to predict outcomes it would usually not work very well
and possibly sour a person on how well computers can predict games.
Unfortunately, it is probably up to the press to explain that. I don't think
most of the press understands that and if they did they would not want any
change. For some reason the accepted ranking standard in college is a poll of
the press. A press poll has its merits but I fail to see how anyone could say
that it is the only possible method. The press used their power to convince the
public of their own qualifications.
YL:
What
interests do you have outside computer ratings?
Jeff:
I like computers and sports so the computer rankings were a natural
interest. So are sports video games. Really, I just like games of all kinds.
YL: When
did you begin to rank teams?
Jeff: I
was pretty young when I first tried computer rankings. I first tried them in the
mid-1980's on my Commodore 64 mostly using hypothetical situations or sometimes
data from single high school conferences. I realized that the convergence of a
system with a large number of teams would take too long for my patience on a
Commodore 64. It was not until I got to college and had access to faster
computers (486 systems) that I really started thinking about running a computer
system on an entire 1A football schedule. The 1994 poll controversy piqued my
interest enough to buy a USA Today and tediously convert an entire 1A football
schedule and results into an entirely numeric data file. I posted weekly results
on my dorm room door during the 1995 season. I first put the results for both
the NFL and college on the internet for 1996 along with scores from a separate
predictive system. I started college basketball near the end of the 1999-2000
season.
YL: Do
you favor particular sports or teams?
Jeff: I still have some affinity
for the Reds and A's left over from when I was young.
YL: What
is your profession?
Jeff: Control
Systems Engineer.
YL: Where
do you call home?
Jeff:
Right now Ohio is home but that will likely change again soon.
YL: Good
luck. I hope it's a change you will find welcome. Is there anything else you
would like us to know about you?
We would like to thank Jeff for the
interview. You can visit his site at http://www.zoomnet.net/~sbihl/rankings.html