Glicko-2 is a rating system used in chess, but it can be used in many other situations. Glicko-2 is an improvement on Glicko-1, which addressed problems with the older Elo rating system.
What makes Glicko-2 special in comparison to version 1 is that it incorporates a higher rating deviation (RD) the longer someone has been inactive. It does this through a system constant that relates to time, measured in rating periods.
An example write-up from the author is found here: http://www.glicko/glicko/glicko2.pdf.
Within this document, he explains:
The Glicko-2 system works best when the number of games in a rating period is moderate to large, say an average of at least 10-15 games per player in a rating period. The length of time for a rating period is at the discretion of the administrator.
Assuming that a group of active chess players plays 10-15 games on average in a one-month period, the administrator would then update ratings at the end of every month.
I needed a PHP Implementation of the Glicko-2 rating system and came across the following:
Glicko-2 JavaScript Implementation
- The JavaScript had a small error that kept it from matching the example in the technical write-up. The author found it close enough and didn't bother to debug it.
Glicko-2 PHP Implementation
- The PHP implementation was plagued with many bugs, but that wasn't apparent unless you ran more than one rating period (for which the technical write-up never shows expected values).
Glicko-2 Calculator in Excel
- Finally, the Excel calculator seemed to be error-free and the most professional, done by someone in the chess community. Once the JavaScript bug was solved, the JavaScript and Excel calculators matched each other very closely (not perfectly, but possibly within rounding error).
I fixed the bugs I could find in the PHP and JavaScript versions (and submitted issues/patches to the authors) so that they match the Excel calculator as closely as possible.
Now I am 99% confident that I have an accurate Glicko-2 implementation (across the 3 of them) for analysis, and that is when I came across something strange, which is the topic of this discussion.
Given the suggested default for Glicko-2 for a new player:
Rating: 1500
RD: 350
Volatility: 0.06
If you face an average opponent of rating 1378 and RD 99 (Source) only once every rating period (1 month) and win each time, then after 12 periods (1 year) you will have accumulated a rating of 1852, which falls in National Class A (1800-1999), when in reality you have only beaten 12 average-rated players over a span of 12 months.
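| Month | Rating | RD | Volatility | Class |
|---|---|---|---|---|
| 1 | 1625 | 259 | 0.059999 | National Class B |
| 2 | 1682 | 225 | 0.059998 | 〃 |
| 3 | 1718 | 205 | 0.059997 | 〃 |
| 6 | 1784 | 174 | 0.059994 | 〃 |
| 12 | 1852 | 148 | 0.059988 | National Class A |
| 24 | 1922 | 127 | 0.059976 | 〃 |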
If you face 2 average opponents every rating period and beat them, you can get to National Class A in about 4-5 months, having faced only 8-10 average opponents.
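| Month | Rating | RD | Volatility | Class |
|---|---|---|---|---|
| 1 | 1672 | 215 | 0.059999 | National Class B |
| 2 | 1733 | 183 | 0.059997 | 〃 |
| 3 | 1770 | 166 | 0.059995 | 〃 |
| 4 | 1797 | 154 | 0.059993 | 〃 |
| 5 | 1819 | 146 | 0.059992 | National Class A |
| 6 | 1836 | 140 | 0.059991 | 〃 |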
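For anyone who wants to check these numbers independently, here is a minimal Python sketch of the Glicko-2 update as I read it from the technical write-up, followed by a loop that replays the scenario above (a new player who wins every game against a 1378/RD 99 opponent). It is not any of the three implementations mentioned earlier. The system constant `tau = 0.5`, the convergence tolerance, and the names `rate` and `simulate` are my own assumptions; the post does not say which system constant was used.

```python
import math

SCALE = 173.7178  # conversion factor between the Glicko and Glicko-2 scales


def g(phi):
    """Weight that shrinks an opponent's influence as their RD grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def new_volatility(phi, sigma, delta, v, tau, eps=1e-6):
    """Solve for the updated volatility with the iterative procedure
    (Illinois variant of regula falsi) described in the write-up."""
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        den = 2.0 * (phi ** 2 + v + ex) ** 2
        return num / den - (x - a) / tau ** 2

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    return math.exp(A / 2.0)


def rate(rating, rd, sigma, games, tau=0.5):
    """One rating period. `games` is a list of (opp_rating, opp_rd, score)
    with score 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    if not games:  # inactive period: only the RD grows
        return rating, SCALE * math.sqrt(phi ** 2 + sigma ** 2), sigma
    v_inv, surprise = 0.0, 0.0
    for opp_rating, opp_rd, score in games:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        e = 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))
        v_inv += g(phi_j) ** 2 * e * (1.0 - e)
        surprise += g(phi_j) * (score - e)
    v = 1.0 / v_inv
    delta = v * surprise
    sigma_new = new_volatility(phi, sigma, delta, v, tau)
    phi_star = math.sqrt(phi ** 2 + sigma_new ** 2)
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    mu_new = mu + phi_new ** 2 * surprise
    return 1500.0 + SCALE * mu_new, SCALE * phi_new, sigma_new


def simulate(periods, wins_per_period, opponent=(1378.0, 99.0)):
    """New player (1500 / 350 / 0.06) who wins every game each period."""
    state = (1500.0, 350.0, 0.06)
    history = []
    for _ in range(periods):
        games = [(opponent[0], opponent[1], 1.0)] * wins_per_period
        state = rate(*state, games)
        history.append(state)
    return history


if __name__ == "__main__":
    for month, (r, rd, vol) in enumerate(simulate(12, 1), start=1):
        print(f"{month:2d} {r:7.1f} {rd:6.1f} {vol:.6f}")
```

Calling `simulate(6, 2)` replays the two-opponents-per-period case. Working the first period by hand with these formulas gives roughly 1625 / 259 for one win and 1672 / 215 for two, which agrees with the first row of each table.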
Are these assumptions accurate? Is there a bug in my calculator?
If it is not a bug, what are some ways of countering this besides:
- Consider the "true rating" to be the lower bound of the deviation (Rating - RD)
- Do not show inactive users' ratings
- Do not show users with fewer than N games
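To make the first and third options concrete, here is a small Python sketch. The function name `displayed_rating`, the `min_games=10` threshold, and the multiplier `k` are hypothetical choices for illustration only; `k=1` is the "Rating - RD" rule from the list, while `k=2` corresponds roughly to the lower end of a 95% interval around the rating.

```python
def displayed_rating(rating, rd, games_played, min_games=10, k=2.0):
    """Return a conservative rating for display, or None to hide the player.

    min_games and k are illustrative assumptions, not values taken from
    the Glicko-2 write-up.
    """
    if games_played < min_games:
        return None  # too few games to show anything
    return rating - k * rd


# Month-12 row of the one-game-per-period table: rating 1852, RD 148
print(displayed_rating(1852, 148, games_played=12))         # 1556.0
print(displayed_rating(1852, 148, games_played=12, k=1.0))  # 1704.0
print(displayed_rating(1625, 259, games_played=1))          # None
```

With either multiplier the displayed value for the 12-month player drops below the 1800 threshold for National Class A, because the RD is still large after so few games.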
- As you are not asking an actual programming question, this would be better at math.stackexchange. – BlueRaja - Danny Pflughoeft Commented Aug 21, 2012 at 14:06
- It's possible that this is really a bug. – ParoX Commented Aug 21, 2012 at 15:09
- In which case, you could give us the expected outcome, and we might be able to help track down the bug. But determining if it really is a bug still involves no programming, only math, and thus is a better fit for that site. – BlueRaja - Danny Pflughoeft Commented Aug 21, 2012 at 15:10
1 Answer
It may seem counter-intuitive, but this is actually a correct result. If you continuously play average players but you always win, regardless of the time periods, you're demonstrating that you have a high ranking (not an average ranking, even though your opponents are average). A player who is average (has a 'true' average rank), playing opponents of exactly the same 'true' rank (average), should win and lose about 50% of the time. A player with a very high 'true' rank will win a larger percentage of the time when playing average players, with the exact percentage depending on how far apart their ranks are; let's say the rank is high enough that they should win 90% of the time. That means for every 10 games played against an average player, this highly ranked player should lose 1 of them.
What you've effectively modeled is a player with a rank high enough to win every single game against an average player (more than 12 or 24 games without a loss), which means their score will continue to go up unbounded as long as they continue to win, because they've never lost. They're demonstrating an ability that (until a loss happens) implies a rank separation large enough to approach an expected win ratio of 100%.
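The relationship between rank separation and expected win ratio can be sketched in Python using the expected-score formula from the Glicko-2 write-up. This is only an illustration: the function names are mine, and, like the formula used in the update step, it takes the opponent's RD into account but not the player's own.

```python
import math

SCALE = 173.7178  # conversion factor between the Glicko and Glicko-2 scales


def _g(opp_rd):
    """Weight applied to the rating gap, based on the opponent's RD."""
    phi_j = opp_rd / SCALE
    return 1.0 / math.sqrt(1.0 + 3.0 * phi_j ** 2 / math.pi ** 2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score (win probability, ignoring draws) against one opponent."""
    gap = (rating - opp_rating) / SCALE
    return 1.0 / (1.0 + math.exp(-_g(opp_rd) * gap))


def gap_for(win_rate, opp_rd):
    """Rating gap at which the expected score equals win_rate (0 < win_rate < 1)."""
    return SCALE * math.log(win_rate / (1.0 - win_rate)) / _g(opp_rd)


if __name__ == "__main__":
    for rating in (1378, 1500, 1625, 1852, 1922):
        print(rating, round(expected_score(rating, 1378, 99), 3))
    print("gap for a 90% expected score:", round(gap_for(0.9, 99)))
```

Against the 1378/RD 99 opponent from the question, an equally rated player has an expected score of exactly 0.5, and a 90% expected score needs a gap of about 400 points. The expected score approaches 1 as the gap grows but never reaches it, so every additional win is still a slightly better result than predicted and the rating keeps creeping up.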