http://www.husvankempen.de/nunn/aegtrating.htm
I see that Ufim, which finished last in the queen class, has a better rating than all the programs of the rook class.
That is absurd considering the fact that the queen class and the rook class are almost at the same level.
Movei has a rating of 2630 while Pharaon 2.62 has only 2460.
That is another absurdity, considering that the old Pharaon is playing in the premier division of Leo while Movei is not there.
I do not know how the ratings were calculated, but it seems to me that a difference of nearly 200 Elo between the class averages was assumed. It is better to assume a difference of 100 Elo between the class averages, which is more realistic, and the difference between the queen class and the rook class is less than 100 Elo.
Here are some ratings from WBEC:
King class:
Aristarch 2658
Ruffian 2713
Delfi 2559
List ----
Thinker 2583
Crafty 2648
Smarthink 2612
Tao 2602
Quark 2555
Anmon 2441
Gothmog 2518
Yace 2530
average almost 2600
Queen class:
GreenLight 2554
Elchinito ----
Wildcat 2528
Fruit ----
Amyan 2555
Slowchess 2451
Movei 2538
Jonny ----
Amy 2429
Dragon 2503
Kingofkings 2492
Ufim 2313
average almost 2500
rook class:
Baron 2425
Pepito 2494
Naum ----
Pharaon 2527
KnightDreamer 2411
Comet 2472
Arasan 2386
Terra 2347
Postmodernist 2404
Frenzee 2310
Amateur 2425
Crazybishop 2423
average>2400
bishop class:
DanChess 2444
spike 2455
snitch 2428
Bruja ----
Cerebro 2324
Boot 2468
Trace 2198
Djinn 2358
Harmann 2243
Alarm 2270
BlackBishop 2208
BigLion 2142
average>2300
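The rough averages quoted for each class can be checked with a few lines of Python. This is only a sketch: the numbers are copied from the list above, engines shown with "----" are skipped because no rating is given for them, and the names `classes` and `class_averages` are made up for this illustration.

```python
# WBEC ratings as quoted in the post; unrated engines ("----") are left out.
classes = {
    "King": [2658, 2713, 2559, 2583, 2648, 2612, 2602, 2555, 2441, 2518, 2530],
    "Queen": [2554, 2528, 2555, 2451, 2538, 2429, 2503, 2492, 2313],
    "Rook": [2425, 2494, 2527, 2411, 2472, 2386, 2347, 2404, 2310, 2425, 2423],
    "Bishop": [2444, 2455, 2428, 2324, 2468, 2198, 2358, 2243, 2270, 2208, 2142],
}


def class_averages(ratings_by_class):
    """Return the arithmetic mean rating of every class."""
    return {name: sum(r) / len(r) for name, r in ratings_by_class.items()}


averages = class_averages(classes)
for name, avg in averages.items():
    print(f"{name:7s} {avg:7.1f}")
```

On these numbers the gap between neighbouring classes stays under 100 Elo, and the queen/rook gap is the smallest of the three.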
Did Elostat use all the games?
Hi all,
The Bishop Class qualifier between Muse 0.998 and Averno 0.70 ended 6:3 in favour of Muse, so Muse will play in the Bishop Class and Averno in the Knight Class.
Games were played at 40/40 adapted to 2 GHz and are included in the updated rating list, as well as the Queen/Rook Class gauntlet with Abrok 5.0.
With every double round robin in AEGT 2 the rating list will be more precise.
http://www.husvankempen.de/nunn/
Best Regards
Heinz
I do not see a big difference between the rook and the queen class.
Abrok scored 13 against each class.
Patzer:
14.5 against the rook class
15 against the queen class
Nejmet:
12.5 against the rook class
8.5 against the queen class
Betsy:
6.5 against the rook class
4 against the queen class
Total result against the queen class:
13+15+8.5+4=40.5/96
Total result against the rook class:
13+14.5+12.5+6.5=46.5/96
The difference between the classes that is suggested by these results is clearly less than 100 Elo.
I do not understand how EloStat can get a different result, and it is better never to use that stupid program.
I think that it is better not to have a rating list at all than to support that stupid Elo program by publishing the results that it produces.
Uri
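As a rough check of the "less than 100 Elo" claim, the two gauntlet totals above can be converted into rating differences. The sketch below assumes the common logistic Elo formula; it is not known from the thread which model EloStat uses, and the function name `elo_diff_from_score` is made up for this example.

```python
import math


def elo_diff_from_score(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model, an assumption)."""
    if not 0.0 < score_fraction < 1.0:
        # A 0% or 100% score implies an unbounded rating difference.
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


# Totals of the four gauntlet engines as quoted in the post.
vs_queen = (13 + 15 + 8.5 + 4) / 96        # 40.5/96
vs_rook = (13 + 14.5 + 12.5 + 6.5) / 96    # 46.5/96

# How much better the gauntlet engines performed against the rook class.
gap = elo_diff_from_score(vs_rook) - elo_diff_from_score(vs_queen)
print(f"implied queen/rook class gap: {gap:.0f} Elo")
```

Under that assumption the gap comes out at roughly 40 to 50 Elo, which agrees with the estimate above. It treats the four gauntlet engines as one pooled opponent, so it is only a coarse figure.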
Hello all,
I do not think that EloStat is a stupid program, although it has weaknesses. A program can only deliver good rating results when the data allows connections. We had four different classes in AEGT 1.
I know there are not enough games, but the error that I find is not a statistical error.
Based on the results, the ratings should be different.
I could easily write the mathematical part of calculating ratings, but the main problem for me is to write a program that simply gets the results from a PGN file.
If somebody has a program in C that can read a PGN file and put the results from it into some array, then I guess that I can extend it into a better program that analyzes the results.
I also need a small function to calculate the expected result between programs based on the difference in rating, and after that the task seems to be easy.
Uri
I can add that the arrays that I want can assume maximal number of 1000 programs when I need only (number of programs not more than 1000)
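The two pieces asked for here, reading results out of a PGN file and computing an expected result from a rating difference, could look roughly like this. It is a sketch in Python rather than C, it only looks at the `White`, `Black` and `Result` tag pairs, and the names `read_results` and `expected_score` are invented for the example. The expected-score formula is the usual logistic Elo curve, which is an assumption and not necessarily what EloStat does.

```python
import re
from collections import defaultdict

TAG = re.compile(r'^\[(\w+)\s+"(.*)"\]$')
POINTS = {"1-0": (1.0, 0.0), "0-1": (0.0, 1.0), "1/2-1/2": (0.5, 0.5)}


def read_results(pgn_text):
    """Return {(white, black): [white_points, black_points, games]} from PGN tags."""
    table = defaultdict(lambda: [0.0, 0.0, 0])

    def flush(tags):
        # Unfinished games ("*") and games with missing names are skipped.
        points = POINTS.get(tags.get("Result"))
        if points and "White" in tags and "Black" in tags:
            entry = table[(tags["White"], tags["Black"])]
            entry[0] += points[0]
            entry[1] += points[1]
            entry[2] += 1

    tags, in_header = {}, False
    for line in pgn_text.splitlines():
        match = TAG.match(line.strip())
        if match:
            if not in_header:       # first tag after movetext: a new game starts
                flush(tags)
                tags = {}
            in_header = True
            tags[match.group(1)] = match.group(2)
        elif line.strip():
            in_header = False       # movetext line
    flush(tags)                     # last game in the file
    return dict(table)


def expected_score(rating_diff):
    """Expected score for the side that is rating_diff Elo higher (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))
```

A dictionary keyed by the two engine names avoids fixing the array size in advance, so the limit of 1000 programs is not needed in this form. To use it on a real file, pass the file contents, for example `read_results(open("games.pgn").read())`, where `games.pgn` is a placeholder name.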
Hello Uri,
I would like it if you can do that in some way and give us better values. Of course I am also not happy with the rating list so far, but based on my experience with 70 000 games and more for rating calculation, like for my Nunn Blitz rating list, it will be better with more games and connections.
Anyway it is of course not fair to compare this first small rating list with WBEC, because Leo has been running his tournaments for years already, with a lot of engine versions not changing over the past years and with promotion and demotion. I know that Leo is doing rating calculation in a different way. It would be interesting to know more about that.
Best Regards
Heinz
I can add that 'connections' between pools are of course more important than a big number of games. If two pools have no connection it is simply illogical to compare anything.
Of course connection between pools is important, but a missing connection between pools can be detected by the program, and when there is no connection between pools a program should not give one rating list but more than one rating list.
Well, EloStat warns you if there is no connection between the pools!
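Detecting pools that have no connection at all is a plain connected-components problem on the graph of who played whom. A minimal sketch with union-find follows. The function name `rating_pools` and the input format (pairs of engine names) are assumptions for this illustration, and nothing here is taken from EloStat.

```python
def rating_pools(games):
    """Group players into pools; games is an iterable of (player_a, player_b) pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in games:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b         # a game links the two pools

    pools = {}
    for player in list(parent):
        pools.setdefault(find(player), set()).add(player)
    # Sort the pools so that the output order is deterministic.
    return sorted(pools.values(), key=lambda pool: sorted(pool))
```

If the function returns more than one pool, each pool would get its own rating list, as suggested above. It says nothing about how strong a connection is: a single game between two groups is enough to merge them.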
There is a problem of what to do when the connection between pools is weak.
An extreme case is when there is only a single game between the programs in team A and the programs in team B, and there are no indirect connections between them through games that both played against programs in a team C.
If the game is not drawn, it is impossible to evaluate team A relative to team B
(if the result happens again and again the difference will be infinite), and even if the game is a draw there is a big statistical error in the assumption that the programs that drew are equal.
Not giving a rating to programs that won all their games (marking them as too good to get a rating) and not giving a rating to programs that lost all their games (marking them as too bad) is trivial, but it is not enough: even after repeating the process there may still be a team A that always beat team B with no indirect connection between the teams (not in AEGT, but a program should be general enough to analyze the results of any games).
Uri
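The "trivial" step described above, removing programs with a 100% or 0% score and repeating until nothing changes, can be sketched as follows. The function name `prune_unrateable` and the game format (a triple of two names and the points of the first player) are made up for this example.

```python
def prune_unrateable(games):
    """games: list of (player_a, player_b, points_for_a), points in {0, 0.5, 1}.

    Returns (rateable, too_good, too_bad) as sets of player names.
    """
    remaining = list(games)
    too_good, too_bad = set(), set()
    while True:
        scored, played = {}, {}
        for a, b, pts in remaining:
            for player, p in ((a, pts), (b, 1.0 - pts)):
                scored[player] = scored.get(player, 0.0) + p
                played[player] = played.get(player, 0) + 1
        perfect = {p for p in played if scored[p] == played[p]}
        winless = {p for p in played if scored[p] == 0.0}
        if not perfect and not winless:
            # Only players that still have games left are reported as rateable.
            return set(played), too_good, too_bad
        too_good |= perfect
        too_bad |= winless
        dropped = perfect | winless
        remaining = [g for g in remaining
                     if g[0] not in dropped and g[1] not in dropped]


# P beats everybody; once P is removed, Q has a perfect score as well.
games = [("P", "Q", 1.0), ("P", "X", 1.0), ("P", "Y", 1.0),
         ("Q", "X", 1.0), ("Q", "Y", 1.0), ("X", "Y", 0.5)]
print(prune_unrateable(games))
```

As the post says, this is not enough. With the games `[("A1", "A2", 0.5), ("B1", "B2", 0.5), ("A1", "B1", 1.0), ("A2", "B2", 1.0)]` nobody has a 100% or 0% score, so nothing is removed, yet team A won every game against team B.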