ID | Project | Category | View Status | Date Submitted | Last Update | ||||
---|---|---|---|---|---|---|---|---|---|
0005955 | Spring engine | General | public | 2018-04-09 23:20 | 2018-05-12 17:44 | ||||
Reporter | Floris | ||||||||
Assigned To | Kloot | ||||||||
Priority | normal | Severity | minor | Reproducibility | have not tried | ||||
Status | resolved | Resolution | fixed | ||||||
Product Version | 104.0 +git | ||||||||
Target Version | Fixed in Version | ||||||||
Summary | 0005955: desyncing | ||||||||
Description | maintenance ...-358 desyncs http://replays.springrts.com/replay/52cbcb5a084e4c31da54045cbafd7536/ | ||||||||
Tags | No tags attached. | ||||||||
Checked infolog.txt for Errors | |||||||||
Attached Files |
Floris (reporter) 2018-04-09 23:55 |
[23:53:59] <[President]Trump> i saw replay of desync game
[23:54:11] <[PinK]triton> ok?
[23:54:23] <[PinK]triton> flow said he reported issue already
[23:54:30] <[PinK]triton> you think you found the reason?
[23:54:34] <[President]Trump> noticed my nukes and screamers didnt stockpile in replay, while they did in the game
[23:55:01] <[PinK]triton> I can share this information to flow
[23:55:05] <[PinK]triton> but nothing else much
[23:55:11] <[President]Trump> other player's nukes and mercury did stockpile
[23:55:16] <[PinK]triton> ok |
Kloot (developer) 2018-04-10 20:38 Last edited: 2018-04-10 20:40 |
running the demo "only" turned up a random memory corruption bug (also present in 151-g11de57d), nothing related to weapon stockpiling which is ancient code. since ZK has so far been desync-free using 358-gb58f0b, it would help to know if this can be reproduced with 378-g4eeb848. |
Kloot (developer) 2018-04-16 17:33 |
I can not find anything suspect in 378+. NB if you're not aware of it already: the "restore dead units" (?) trick you used as a spectator in another DSD game a few days ago does desync demos. |
Floris (reporter) 2018-04-17 22:43 Last edited: 2018-04-17 22:51 |
at ~ 28 mins in BanDolf starts desyncing on maintenance -409 http://replays.springrts.com/replay/3053d65ab14f41ee0185b0f9660f129a/ |
Floris (reporter) 2018-04-18 00:30 |
and another game full of desyncs: http://replays.springrts.com/replay/b068d65af6de2b2d64c3274c74181d11/ |
Floris (reporter) 2018-04-19 12:24 |
recorded moment of BanDolf desyncing: https://www.youtube.com/watch?v=RMYooZ8dIXs how can we help? |
Kloot (developer) 2018-04-19 12:53 |
afaics in both 3053d65ab14f41ee0185b0f9660f129a and b068d65af6de2b2d64c3274c74181d11 only one player diverged while the rest kept a consistent state (BanDolf even came back into sync before going out of it again), which suggests an unsynced origin and may be harder to reproduce. right now I have no idea about the cause, addrsan and signan builds show no errors and local checksum when replaying is always (tested 10 runs) constant. the best way to help would be to get as many people trying as many "unusual" things as early as possible (plus whatever BanDolf was doing) to narrow down what triggers it. assuming the source was introduced between 151 and 358 another option is to bisect, but that will take more time. |
Floris (reporter) 2018-04-19 12:55 |
we have desyncs on 151 as well, ...maybe slightly less often though |
Doo (reporter) 2018-04-19 13:21 |
As far as I can remember, we've had some (a few) desyncs on spring 103 as well. The information I can share is very thin, but I guess it's still worth reporting. In one of my desyncs, I spotted that a gadget had completely stopped working: unit_stomp.lua -- It prevents the krogoth stomp weapon (which fires on each footstep, causing damage all around the foot) from damaging units besides peewees/aks/scouts. The issue trump mentioned seems to be related to unit_mercscr_stockpile_limit.lua, which handles the stockpiling of screamers and mercuries (allows/disallows stockpiling to enforce a limit of 5 stockpiled missiles). I can't tell whether synced gadgets ceasing to work is just an effect of being desynced, or whether a synced gadget/LUS or a global Lua instability is the actual cause of the desyncs. Is it possible for a gadget or unit script to cause desyncs for players? What kind of code would be tricky / could possibly cause desyncs (as in, are there things that I should just avoid playing with)? That would help us check our gadgetry again to make sure nothing sticks out here. Is the use of math.random in synced code safe? One of the unit scripts uses this: for count, piece in pairs(piecetable) do randomnumber = math.random(1,2) [...] end Is this safe? Or is there a chance that the math.random() result for piece[count] differs from one player to another? |
Kloot (developer) 2018-04-19 13:32 Last edited: 2018-04-19 13:37 |
@Floris I only knew of the "Lua OOM while catching up" desync in 151, this is news to me. "Is it possible a gadget or unit script causes desyncs for players?" yes, for example if the game archive is corrupted (which happened more than once while ZK was using sdp) a gadget can crash or fail to load on one machine. another common mistake is to use tables as table keys in synced Lua, which will cause iteration order to diverge. "Is the use of math.random in synced code safe?" yes. "In one of my desyncs, i spotted that a gadget had completely stopped working: unit_stomp.lua" did your infolog mention anything special about that gadget? |
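The table-key pitfall mentioned above can be illustrated with a minimal sketch in plain Lua. It is not taken from the engine or from any BA gadget: `sortedKeys`, `stockpileById` and the sample ids are hypothetical names chosen for the example. The idea, under those assumptions, is to key synced state by a stable integer id rather than by a table, and to iterate in sorted order so that every client visits the entries in the same sequence.

```lua
-- Hypothetical sketch: plain Lua, no engine API involved.

-- Returns the keys of t in ascending order, so iteration over them does not
-- depend on how the table happens to be laid out in memory.
local function sortedKeys(t)
  local keys = {}
  for k in pairs(t) do
    keys[#keys + 1] = k
  end
  table.sort(keys)
  return keys
end

-- Risky in synced code: table-valued keys are hashed by identity (address),
-- which can differ between machines, so pairs() order may differ too.
--   local stockpileByUnit = { [unitTable] = 3 }

-- Safer: key by a stable integer id (sample values, not real unit ids).
local stockpileById = { [42] = 3, [7] = 5, [19] = 1 }

local visited = {}
for _, id in ipairs(sortedKeys(stockpileById)) do
  visited[#visited + 1] = id .. "=" .. stockpileById[id]
end

print(table.concat(visited, " "))  -- 7=5 19=1 42=3
```

Sorting on every pass costs extra time; where that matters, keeping a separate array of ids in insertion order and walking it with ipairs is one possible alternative.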
Doo (reporter) 2018-04-19 13:36 |
Balanced Annihilation is usually downloaded through SpringLobby as sdp, and rarely as an sd7 from springfiles or whatever direct download link. Especially for the test versions. Should I consider this as a possible source of this issue? (But then i'd ask why now, why not when we were playing BA 9.46 on spring 103?) |
Kloot (developer) 2018-04-19 13:51 Last edited: 2018-04-19 14:05 |
"Balanced Annihilation is usually downloaded through SpringLobby as sdp ... Should I consider this as a possible source of this issue?" sdp download corruption issues were common with ZK last year (just search mantis), so I would strongly recommend staying away from pool archives. "But then i'd ask why now, why not when we were playing BA 9.46 on spring 103?" I don't have statistics, but regular hosting of test versions distributed via sdp seems to be more popular now. it's also possible (but speculation) the downloader implementation used by springlobby was broken after 103, or engine filesystem changes might have snuck in a bug. |
Google_Frog (reporter) 2018-04-19 17:41 |
Zero-K is not necessarily desync-free on 358-gb58f0b. See these reports https://github.com/ZeroK-RTS/CrashReports/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+desync. I have not been paying too much attention to the desync logs recently. I recall glancing at each as it appeared and deciding either that it was some other issue (such as a bad download) or that there was little to do at the time. |
Kloot (developer) 2018-04-22 17:45 |
all of the 358 ZK desyncs can be traced to Lua OOMs or broken content. there might of course still be a "genuine" case lurking in engine code (especially after the recent refactors), but until the BA community rules out pool corruption by switching to sdz for tests this will go onto the UTR pile. |
Floris (reporter) 2018-04-26 21:29 |
At first the spec RPaulson desyncs for a while, and then (much) later player spike_spb desyncs (together with some others). My 2nd PC (Floris) desynced briefly as well; it has a manually downloaded and installed BA archive. http://replays.springrts.com/replay/7211e25a88e25d9942c23e368a21b655/ Will attach spike_spb's replay. |
Floris (reporter) 2018-04-26 21:34 |
added spike_spb's infolog as well |
Kloot (developer) 2018-04-27 18:42 |
RPaulson didn't desync after rejoining, so it's a non-deterministic bug with an unknown (but small) probability of being triggered. unfortunately 7211e25a88e25d9942c23e368a21b655 didn't reveal anything new yet. |
Kloot (developer) 2018-05-12 17:44 |
nuked |
Date Modified | Username | Field | Change |
---|---|---|---|
2018-04-09 23:20 | Floris | New Issue | |
2018-04-09 23:20 | Floris | File Added: infolog.txt | |
2018-04-09 23:55 | Floris | Note Added: 0018999 | |
2018-04-10 20:38 | Kloot | Note Added: 0019000 | |
2018-04-10 20:40 | Kloot | Note Edited: 0019000 | View Revisions |
2018-04-16 17:33 | Kloot | Status | new => closed |
2018-04-16 17:33 | Kloot | Resolution | open => unable to reproduce |
2018-04-16 17:33 | Kloot | Note Added: 0019031 | |
2018-04-17 22:43 | Floris | Status | closed => feedback |
2018-04-17 22:43 | Floris | Resolution | unable to reproduce => reopened |
2018-04-17 22:43 | Floris | Note Added: 0019032 | |
2018-04-17 22:51 | abma | Note Edited: 0019032 | View Revisions |
2018-04-18 00:30 | Floris | Note Added: 0019033 | |
2018-04-18 00:30 | Floris | Status | feedback => new |
2018-04-19 12:24 | Floris | Note Added: 0019035 | |
2018-04-19 12:53 | Kloot | Note Added: 0019036 | |
2018-04-19 12:55 | Floris | Note Added: 0019037 | |
2018-04-19 13:21 | Doo | Note Added: 0019038 | |
2018-04-19 13:32 | Kloot | Note Added: 0019039 | |
2018-04-19 13:33 | Kloot | Note Edited: 0019039 | View Revisions |
2018-04-19 13:36 | Doo | Note Added: 0019040 | |
2018-04-19 13:37 | Kloot | Note Edited: 0019039 | View Revisions |
2018-04-19 13:51 | Kloot | Note Added: 0019041 | |
2018-04-19 14:00 | Kloot | Note Edited: 0019041 | View Revisions |
2018-04-19 14:05 | Kloot | Note Edited: 0019041 | View Revisions |
2018-04-19 17:41 | Google_Frog | Note Added: 0019042 | |
2018-04-22 17:45 | Kloot | Status | new => closed |
2018-04-22 17:45 | Kloot | Note Added: 0019046 | |
2018-04-26 21:29 | Floris | Status | closed => feedback |
2018-04-26 21:29 | Floris | Note Added: 0019052 | |
2018-04-26 21:29 | Floris | File Added: 20180426_205048_DeltaSiegePrime_Ultimate_104.0.1-413-gd902a7b_maintenance.sdfz | |
2018-04-26 21:34 | Floris | File Added: infolog_spike_spb.txt | |
2018-04-26 21:34 | Floris | Note Added: 0019053 | |
2018-04-26 21:34 | Floris | Status | feedback => new |
2018-04-27 18:42 | Kloot | Note Added: 0019054 | |
2018-05-12 15:53 | Kloot | Assigned To | => Kloot |
2018-05-12 15:53 | Kloot | Status | new => assigned |
2018-05-12 17:44 | Kloot | Status | assigned => resolved |
2018-05-12 17:44 | Kloot | Resolution | reopened => fixed |
2018-05-12 17:44 | Kloot | Note Added: 0019105 |