Page 2 of 2

Re: Desync with table iterating using pairs() in synced code

Posted: 07 Oct 2015, 09:22
by jK
Oh I twisted the 512 with a result of a benchmark I did, the actual limit is 64 chars.
The responsible function is the following and has another issue:

Code: Select all

static inline lua_Hash calchash(const char *str, size_t l) {
  lua_Hash h = cast(unsigned int, l);  /* seed */
  size_t step = (l>>5)+1;  /* if string is too long, don't hash all its chars */
  size_t l1;
  for (l1=l; l1>=step; l1-=step) {  /* compute hash */
    h = h ^ ((h<<5)+(h>>2)+cast(unsigned char, str[l1-1]));
  }
  return h;
}
Till (including) 63 the iterator step is +1, with 64 it becomes +2. So with 63 all 63chars are hashed, with 64 only 32 chars are used!
I also added a time profiler and didn't noticed any massive time spend in it, so it seems to be a border case that lua wants to optimize here with their strange "only use every x'th char for hash".

abma wrote:1. don't use pairs
2. only use tables/pairs() with strings < 512 chars or numbers as keys (?)
3. change engine: make pairs() deterministic with string as keys
4. do nothing until you have desyncs
@1. no option
@3. not easy and likely no one wants to investigate the error in deep

@4. it's not that this desync happened often so far, so the question is if it really is worth it to spend x+ hours on fixing lua itself
@2. I can easily add an error for the 64 chars limit, but this will delude the user that <64char strings would be safe, but hash collisions can happen there too.


My plausible `solutions` would be look like this:
  • Switch lua hash function to HsieHash that is used elsewhere in the engine
  • Write an UnitTest that checks how many hash collisions HsieHash has in <256chars range (with human readable chars of ASCII range, unicode would be overkill)
  • Based on the results of that UT, add an error when using pairs() for strings >Nchars

Re: Desync with table iterating using pairs() in synced code

Posted: 07 Oct 2015, 09:39
by Silentwings
If you don't mind doing the work of discovering how many chars is ~safe with HsieHash from a practical point of view, that sounds great. I guess doing just 2 would be sufficient in almost all cases, ofc can't be sure without the test.

Re: Desync with table iterating using pairs() in synced code

Posted: 07 Oct 2015, 11:33
by abma
side note:

in lstring.cpp, luaS_newlstr() is modificated by us.

https://github.com/spring/spring/blob/d ... g.cpp#L109
vs
https://github.com/lua/lua/blob/5.1.5/src/lstring.c#L75

very likely to make lua sync-safe but in this case something is wrong / missing.

Re: Desync with table iterating using pairs() in synced code

Posted: 07 Oct 2015, 11:50
by hokomoko
I couldn't find an actual change other than moving all hash calculations to a single place.

Re: Desync with table iterating using pairs() in synced code

Posted: 08 Oct 2015, 07:56
by smoth
so if we have desyncs then we get to play bug hunt?

Re: Desync with table iterating using pairs() in synced code

Posted: 08 Oct 2015, 10:26
by Silentwings
No, the bug has been found and probably won't affect you unless you play ZK on specific maps.

Re: Desync with table iterating using pairs() in synced code

Posted: 08 Oct 2015, 10:34
by Anarchid
A few maps (which use automated featuremex placement) are also infected with the metal detector gadget. So is Evo, which uses it in exactly the same way ZK does.

The synced effect of this shouldn't be affecting sync at a guess (no difference if almost completely non-interactive features are placed in different order?), but i won't guarantee that.

Re: Desync with table iterating using pairs() in synced code

Posted: 08 Oct 2015, 22:28
by abma
great news, the desync seems to be caused by number -> string conversation which is fixed:

https://github.com/spring/spring/commit ... c0b3f1fbff

so, pairs with long strings (should) be sync-safe!

thanks a lot for your help + feedback!

Re: Desync with table iterating using pairs() in synced code

Posted: 09 Oct 2015, 07:46
by gajop
Great this is resolved. I didn't like the idea of key-length compromises, although I do think people should choose smaller keys usually (it's probably faster/cleaner).

PS: Is this patch something that can/should be fixed upstream?

Re: Desync with table iterating using pairs() in synced code

Posted: 09 Oct 2015, 09:35
by hokomoko
PS: Is this patch something that can/should be fixed upstream?
no.