Elo rating (idea) by Professor Pi

The Elo rating system is a numerical rating system used in chess to compare the performance of individual players. It is a common misconception that the letters "ELO" in the Elo rating system are some sort of abbreviation; the system was named after the Hungarian-American physics professor Arpad Elo.

Chess was only one of the many hobbies of Dr. Elo, although he was quite a respected player at the Master Level. He won over forty tournaments, including eight Wisconsin State Championships. But Dr. Elo was also involved with the chess community in other ways; he was the president of the (old) American Chess Federation from 1935 to 1937, and he was a co-founder of the United States Chess Federation (USCF) in 1939.

Before the adoption of the Elo rating system, there were several other rating systems in use, but they were not considered to be very accurate. The USCF was using a rating system developed by Kenneth Harkness. In this system, 1500 points marked an average player; 2000 points a strong club player and 2500 points a grandmaster player. Dr. Elo more or less retained the existing level-range, but he provided a much sounder statistical basis for comparing the individual player scores.

The Elo rating system was adopted by the USCF in 1960, and in 1970 by the World Chess Federation, FIDE. Until 1980, Dr. Elo was in charge of calculating all the ratings for FIDE, using nothing more than a Hewlett-Packard calculator. The concept of the Elo ratings proved to be quite useful, and it has been adapted to other sports as well (e.g. tennis, golf).

The Elo rating is based on the statistical concept of win expectancy. The outcome of a chess game (or any sporting event) is not a constant, but it exhibits a certain distribution around an average (think of an athlete competing in the long jump; not every jump will be the same distance). The Elo rating number represents a certain probability for a player to win against another player. Or to be more precise, the difference in Elo ratings between two players is a measure of the expected outcome of a match between the two.

The concept is best explained with an example. Garry Kasparov's current Elo rating is 2838; Nigel Short's Elo rating is 2675. The difference in Elo ratings is 2838-2675=163 Elo points. This difference corresponds to a win expectancy of 72% for Kasparov. If Short and Kasparov were to play a match consisting of 10 games, the expected outcome of the match would be close to 7-3 in favor of Kasparov (7.2-2.8 to be exact).

The win expectancies for the Elo rating were designed to follow the Gaussian Distribution. Every rated player has an Elo number that represents an average playing strength, with an associated (but fixed) standard deviation. The win expectancies as a function of Elo difference points can be found in the following table.

Win expectancies (Exp.) as a function of Elo difference points (Diff.)
between two rated players
---------------------------------------------------------------
Diff.  Exp. |   Diff.   Exp. |  Diff.   Exp. |  Diff.   Exp.
---------------------------------------------------------------
 0-3   .50  |   92-98   .63  | 198-206  .76  | 345-357  .89
 4-10  .51  |   99-106  .64  | 207-215  .77  | 358-374  .90
11-17  .52  |  107-113  .65  | 216-225  .78  | 375-391  .91
18-25  .53  |  114-121  .66  | 226-235  .79  | 392-411  .92
26-32  .54  |  122-129  .67  | 236-245  .80  | 412-432  .93
33-39  .55  |  130-137  .68  | 246-256  .81  | 433-456  .94
40-46  .56  |  138-145  .69  | 257-267  .82  | 457-484  .95
47-53  .57  |  146-153  .70  | 268-278  .83  | 485-517  .96
54-61  .58  |  154-162  .71  | 279-290  .84  | 518-559  .97
62-68  .59  |  163-170  .72  | 291-302  .85  | 560-619  .98
69-76  .60  |  171-179  .73  | 303-315  .86  | 620-735  .99
77-83  .61  |  180-188  .74  | 316-328  .87  | > 735   1.0
84-91  .62  |  189-197  .75  | 329-344  .88  | 
--------------------------------------------------------------
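
The table can be approximated with a few lines of code. The following is only a minimal sketch in Python: it treats the win expectancy as the cumulative Gaussian distribution of the Elo difference, and it assumes a standard deviation of 2000/7 (about 286) points for that difference. This value is not given in the text above; it is chosen here only because it roughly reproduces the table, and results may differ from the table by one hundredth near the edges of a range. The names win_expectancy and SIGMA_DIFF are made up for this illustration.

```python
import math

# Assumed standard deviation of the rating difference (not stated in the
# text; picked because it approximately reproduces the table above).
SIGMA_DIFF = 2000.0 / 7.0


def win_expectancy(diff, sigma=SIGMA_DIFF):
    """Expected score for a player rated `diff` points above the opponent.

    Evaluates the cumulative normal distribution at diff / sigma,
    written in terms of the error function.
    """
    z = diff / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for diff in (0, 100, 163, 250, 400):
        print(diff, round(win_expectancy(diff), 2))
```

For the Kasparov-Short difference of 163 points, this sketch gives 0.72 after rounding, in agreement with the table. A negative difference gives the expectancy of the lower-rated player, so the two expectancies always add up to 1.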

Of course, chess tournaments and matches usually don't end exactly as the statistics would predict. Otherwise there wouldn't be any point in playing the matches in the first place! This is where the adjustments to the Elo ratings come into play: players are rated on the outcome of their matches against other players. Getting back to the example of the Kasparov-Short match: suppose these players finish the match with an outcome of 6-4 for Kasparov. Even though Kasparov won the match, he didn't score as high as was predicted by the Elo difference. As a result, Kasparov's Elo rating will drop. And even though Short has lost the match, his Elo rating will increase.

In a single match between two players, the rating change is:

ΔR = K (W - W_e)

ΔR is the rating change for each player. K is called the Development Coefficient; this factor determines how much an Elo rating is adjusted, based on the outcome of the match. The value of K is 25 for new players (those who have played fewer than 30 rated games in total), 15 for players with an Elo rating below 2400, and 10 for players with an Elo rating at or above 2400. W is the score achieved, and W_e is the expected score.

In the Kasparov-Short example, Kasparov scored only 6 points (W), where he was expected to score 7.2 points (W_e). Kasparov's rating changes by:

ΔR = 10 (6 - 7.2) = -12 points

Similarly, Short scored 4 points where he was expected to score only 2.8 points, thus his rating changes by:

ΔR = 10 (4 - 2.8) = +12 points
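
The two calculations above can be sketched in Python as follows. This is only an illustration of the formula ΔR = K (W - W_e) with the K values quoted in the text, not an implementation of the full FIDE regulations. The function names and the games_played parameter are made up for this example, and the figure of 1000 games for both players is a placeholder that simply marks them as established players.

```python
def development_coefficient(rating, games_played):
    """K as quoted in the text: 25 for new players (fewer than 30 games),
    15 for ratings below 2400, 10 for ratings at or above 2400."""
    if games_played < 30:
        return 25
    return 15 if rating < 2400 else 10


def rating_change(k, score, expected_score):
    """Rating change: K times (achieved score minus expected score)."""
    return k * (score - expected_score)


if __name__ == "__main__":
    games = 10
    kasparov, short = 2838, 2675

    # Win expectancy of 0.72 for a 163-point difference, from the table.
    expected_kasparov = 0.72 * games
    expected_short = games - expected_kasparov

    # games_played=1000 is a placeholder, not a real figure.
    k_kasparov = development_coefficient(kasparov, games_played=1000)
    k_short = development_coefficient(short, games_played=1000)

    new_kasparov = round(kasparov + rating_change(k_kasparov, 6, expected_kasparov))
    new_short = round(short + rating_change(k_short, 4, expected_short))
    print(new_kasparov, new_short)  # 2826 2687
```

The results are rounded to whole points because the floating-point arithmetic leaves tiny fractional residues (for example, 10 × (4 - 2.8) does not come out as exactly 12).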

So after the match, Kasparov's Elo rating drops to 2826 points, and Short's Elo rating increases to 2687 points. Please note that there are additional regulations, for instance for playing against unrated players, and for tournament play. Also note that FIDE updates player ratings every six months, so the outcome of one single match will not affect the Elo rating immediately.

Of course, the Elo ratings do not supply any information on the individual aspects of a chess player's capabilities; they don't rate the individual style of a player, or how good his defense and endgame are. It was in fact Dr. Arpad Elo himself who recognized the limitations of any rating system and the difficulty of objectively quantifying player strength:

Often people who are not familiar with the nature and limitations of statistical methods tend to expect too much of the rating system. Ratings provide merely a comparison of performances, no more and no less. The measurement of the performance of an individual is always made relative to the performance of his competitors and both the performance of the player and of his opponents are subject to much the same random fluctuations. The measurement of the rating of an individual might well be compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yard stick tied to a rope and which is swaying in the wind. -- Dr. Arpad Elo, Chess Life (1962).

Nevertheless, the Elo rating system has proved to be a relatively accurate measure for predicting the outcome of chess matches, based on a quantified figure of the strength of individual chess players.

factual sources:
http://handbook.fide.com/handbook.cgi?level=B&level=02&level=10& (official FIDE rules)
http://www.bio.vu.nl/vakgroepen/microb/reijnders/elo.html
http://www.chesslinks.org/chess/hof/elo.html
http://members.aye.net/~jbdiablo/chessmasters/rateexplain.htm
http://www.ping.be/dwarrelwind/kbsb/watiselo.html
