Wow - very exciting features!
Some questions regarding SLDB:
* What are "TrustedSkill" and "EstimatedSkill"?
* Which one is used for balancing?
* Where does the bot get its information from, and which matches are taken into account?
* In !status: is the value in <-x> the lobby-rank equivalent of the TS?
* Do chranks stay local to the autohost, or get propagated to SLDB?
* "inactivity penalties" and "particularities of RTS games" sound great!! I'm curious - can you briefly explain what they do?
A plugin interface for spads is awesome! Allowing more people to code on/for spads will make spring even more fun!
(Looking forward to balancing based on players' music taste and game speed control relative to medium bpm)
Some questions regarding SPADS 0.11, SLDB and TrueSkill
Re: SPADS AutoHost beta release
Ouch, lots of questions, this will be my longest post
dansan wrote:* What are "TrustedSkill" and "EstimatedSkill"?
* Which one is used for balancing?
"EstimatedSkill" is what is called "mu" in TrueSkill papers: the most probable skill of the player based on his results only. "TrustedSkill" is what is called "conservative skill" in TrueSkill papers: mu - 3 * sigma.
The estimated skill is the one used for balancing, because it is the one that adjusts quickly to players' results. Also, the results of games balanced using estimated skills make it possible to quickly correct those estimated skills in case of estimation errors.
The trusted skill, on the other hand, is the one used to build leaderboards, because it takes into account the uncertainty of the estimated skill, so it prevents lucky newbies from reaching the top 10 after a few matches, for example.
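To make the difference between the two values concrete, here is a minimal Python sketch of the relation described above (trusted skill = mu - 3 * sigma). The `Rating` class and the sample numbers are illustrative assumptions, not SLDB code or real player data.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    """Illustrative container; not an SLDB data structure."""
    mu: float      # "EstimatedSkill": most probable skill
    sigma: float   # uncertainty of that estimate

    @property
    def estimated_skill(self) -> float:
        return self.mu

    @property
    def trusted_skill(self) -> float:
        # "Conservative skill" from the TrueSkill papers: mu - 3 * sigma
        return self.mu - 3 * self.sigma


# Made-up example values: a lucky newcomer and an established player.
newcomer = Rating(mu=30.0, sigma=7.0)
veteran = Rating(mu=28.0, sigma=1.5)

print(newcomer.estimated_skill, newcomer.trusted_skill)
print(veteran.estimated_skill, veteran.trusted_skill)
```

With these made-up numbers the newcomer has the higher estimated skill but the lower trusted skill, which is why a leaderboard sorted by trusted skill keeps the established player ahead.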
dansan wrote:* Where does the bot get its information from, and which matches are taken into account?
All SPADS autohosts automatically send the results of all games played on the official lobby server that last longer than 2 minutes (iirc). Then, at the SLDB level, only games that aren't too uneven (3v1 will be filtered out, for instance) and that aren't drawn FFA or TeamFFA games are rated.
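The two filtering stages described above could be sketched roughly as follows. This is only an illustration: the `GameResult` fields, the function names, and especially the `MAX_TEAM_SIZE_RATIO` threshold are assumptions, since the post only says that games like 3v1 are "too uneven" without giving the actual rule.

```python
from dataclasses import dataclass
from typing import List

MIN_DURATION_SECONDS = 120   # "longer than 2 minutes iirc"
MAX_TEAM_SIZE_RATIO = 2.0    # hypothetical threshold; the real rule is not stated


@dataclass
class GameResult:
    """Hypothetical record of a finished game."""
    game_type: str            # "Duel", "Team", "FFA" or "TeamFFA"
    team_sizes: List[int]     # number of players in each team
    duration_seconds: int
    is_draw: bool
    on_official_lobby: bool = True


def should_report(game: GameResult) -> bool:
    """Autohost side: which results get sent at all."""
    return game.on_official_lobby and game.duration_seconds > MIN_DURATION_SECONDS


def is_rateable(game: GameResult) -> bool:
    """Rating side: which reported games are actually rated."""
    if game.is_draw and game.game_type in ("FFA", "TeamFFA"):
        return False
    if not game.team_sizes or min(game.team_sizes) <= 0:
        return False
    ratio = max(game.team_sizes) / min(game.team_sizes)
    return ratio <= MAX_TEAM_SIZE_RATIO
```

Under these assumptions a 3v1 game has a size ratio of 3 and is rejected, while a 4v4 game passes.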
dansan wrote:* In !status: is the value in <-x> the lobby-rank equivalent of the TS?
The <+-x> modifier was indeed the delta between the current lobby rank and the skill-rank equivalent of the player's estimated skill. However, I realized these ranks are too different, and all these <+-x> modifiers pollute the !status output more than they help in understanding the balance. So I changed this behavior in SPADS 0.11.2 by adding a dedicated "Skill" column. That way, the chrank and IP rank values are still visible in the Rank column of the !status output (even though they aren't used when skillMode is set to TrueSkill).
Skill values shown in this new "Skill" column are rounded to the nearest skill-rank if the user's privacy mode is enabled (it is enabled by default for now), and in that case a "~" is added to show that this is a rounded skill. Anyone can disable their privacy mode by saying "!set privacyMode 0" to the bot "SLDB". Their real skills will then appear in the !status output.
dansan wrote:* Do chranks stay local to the autohost, or get propagated to SLDB?
They stay local.
dansan wrote:* "inactivity penalties" and "particularities of RTS games" sound great!! I'm curious - can you briefly explain what they do?
Basically, the inactivity penalties are applied each month to all high-skill, low-uncertainty players who barely played during the month. These penalties are quite light, however, because the goal is not really to penalize these players but to model their loss of skill and the increase of their skill uncertainty. Players coming back from an inactivity period should be able to recover their original rating quite fast thanks to this uncertainty increase. This system also prevents old inactive players from staying in the top part of the leaderboards while not playing anymore.
What I called "particularities of RTS games" encompasses multiple problems that I found when I evaluated the results of the first rating engine runs. I don't think I will remember them all, but for example default TrueSkill gives way too much importance to the death order of losing players in FFA/TeamFFA, which isn't that important in RTS games. A side effect of this is visible here for example on your site, where "tritio33", who lost an FFA, just got +15.7 TrueSkill.
dansan wrote:A plugin interface for spads is awesome! Allowing more people to code on/for spads will make spring even more fun!
I found the motivation to implement this plugin interface mainly thanks to vbs' efforts to customize SPADS code, and your efforts to make this integration easy on the replay site side, so thanks
Re: SPADS AutoHost beta release
"EstimatedSkill" is what is called "mu" in TrueSkill papers: the most probable skill of the player based on his results only. "TrustedSkill" is what is called "conservative skill" in TrueSkill papers: mu - 3 * sigma
Would not the use of "trusted skill" be more suited for game balance? At least for players with no known record in the SLDB?
Let me explain. New players currently start with a mu of 25, which places them in the middle of the player pool, so in some games they are treated as the supposed second-best asset of their team. Taking the "trusted skill" into account instead of the average one would solve that problem.
Maybe you could add something like:
TS_t = trusted TS
TS_mu = average TS
tmax = 100 hours
t = ingame time
// new players start at the trusted TS and move linearly toward the average TS
if (t < tmax) {
    TS_used_for_balance = (TS_mu - TS_t) / tmax * t + TS_t;
}
else {
    TS_used_for_balance = TS_mu;
}
or whatever...
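For readers who want to try the idea, here is a runnable Python sketch of one reading of this proposal: a new player is balanced on trusted skill, and the value moves linearly toward the average (estimated) skill until `t_max` hours of play are reached. The function name and the clamping of negative play time are assumptions added for illustration; this is a forum suggestion, not something SPADS is stated to implement.

```python
def balance_skill(trusted: float, estimated: float,
                  hours_played: float, t_max: float = 100.0) -> float:
    """Blend trusted and estimated skill according to in-game time.

    Hypothetical helper: at 0 hours the trusted skill is returned,
    at t_max hours and beyond the estimated skill is returned.
    """
    if hours_played >= t_max:
        return estimated
    fraction = max(hours_played, 0.0) / t_max
    return trusted + (estimated - trusted) * fraction


# Made-up example: trusted skill 0, estimated skill 25, 100-hour ramp.
for hours in (0, 50, 100, 150):
    print(hours, balance_skill(0.0, 25.0, hours))
```

The ramp length of 100 hours comes from the post; any other value can be passed as `t_max`.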
Re: SPADS AutoHost beta release
albator wrote:Would not the use of "trusted skill" be more suited for game balance? At least for players with no known record in the SLDB?
For unrated accounts (players who haven't played a single rated game yet), the skill equivalent of their lobby rank is used to balance the first game (so a real newbie will be considered as having a TrueSkill of 10, for example, iirc). Then, for all following games, their estimated skill is used, because this skill adapts very fast anyway. Also, using the estimated skill for balancing precisely helps it adapt even faster.
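The selection rule described above can be sketched as follows. The function name and the `rank_to_skill` table are hypothetical; the only value taken from the post is the rough "10 for a real newbie (iirc)", and the actual rank-to-skill table is not given in this thread.

```python
from typing import Dict, Optional


def skill_for_balancing(estimated_skill: Optional[float],
                        lobby_rank: int,
                        rank_to_skill: Dict[int, float]) -> float:
    """Pick the skill value used to balance a player.

    estimated_skill is None for an unrated account (no rated game yet);
    in that case the lobby rank's skill equivalent is used instead.
    """
    if estimated_skill is None:
        return rank_to_skill[lobby_rank]
    return estimated_skill


# Placeholder table: only the newbie entry is mentioned in the post.
rank_table = {0: 10.0}

print(skill_for_balancing(None, 0, rank_table))   # unrated newbie
print(skill_for_balancing(27.3, 0, rank_table))   # rated player
```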
Re: SPADS AutoHost beta release
Thank you for the long explanations!
Re: SPADS AutoHost beta release
I've changed the privacy mode behavior a bit; there are now 3 privacy modes:
0: privacy disabled, your exact trueskill rating is shown to everyone in !status output
1: basic privacy enabled (default), your exact trueskill rating is only shown to privileged autohost users in !status output (other players only see a rough estimate)
2: full privacy enabled, only a rough estimate of your trueskill rating is shown to everyone in !status output
You can change your privacy mode by saying "!set privacyMode X" to the bot "SLDB" on the Spring lobby server (replacing "X" with 0, 1 or 2).
Also, remember that SPADS uses the ratings corresponding to the current game type (Duel, Team, FFA or TeamFFA). So, for example, if there are only 2 players in the battle, the skills shown in the !status output are the Duel ones.
The current game type is printed in the !status output, just after the Mod name.
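As a quick summary of the three privacy modes listed above, the visibility rule can be written as a small Python function. The function name and the `viewer_is_privileged` flag are illustrative assumptions, not SPADS or SLDB identifiers.

```python
def shows_exact_skill(privacy_mode: int, viewer_is_privileged: bool) -> bool:
    """Return True if the viewer sees the exact rating in !status output.

    False means the viewer only sees a rough estimate.
    """
    if privacy_mode == 0:      # privacy disabled: exact rating for everyone
        return True
    if privacy_mode == 1:      # basic privacy (default): privileged users only
        return viewer_is_privileged
    if privacy_mode == 2:      # full privacy: rough estimate for everyone
        return False
    raise ValueError(f"unknown privacy mode: {privacy_mode}")


for mode in (0, 1, 2):
    print(mode, shows_exact_skill(mode, False), shows_exact_skill(mode, True))
```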