View Issue Details | ||||||||
ID | Project | Category | View Status | Date Submitted | Last Update | ||||
---|---|---|---|---|---|---|---|---|---|
0003436 | Spring engine | General | public | 2013-02-02 21:01 | 2015-10-19 11:20 | ||||
Reporter | burp | ||||||||
Assigned To | hokomoko | ||||||||
Priority | normal | Severity | crash | Reproducibility | always | ||||
Status | resolved | Resolution | fixed | ||||||
Product Version | 100.0 | ||||||||
Target Version | 101.0 | Fixed in Version | |||||||
Summary | 0003436: Desync from different float printing on linux and windows | ||||||||
Description | When we play the map LongCat32 (which has no special Lua code) on Zero-K, Linux users and Windows users are always split after a few seconds. Usually all Linux users get the desync message because most players are on Windows. This only happens on this map. | ||||||||
Steps To Reproduce | Play zero-k (could not try other mods yet) on longcat32 map with windows and linux clients (linux x86_64). | ||||||||
Tags | No tags attached. | ||||||||
Checked infolog.txt for Errors | |||||||||
Attached Files |
|
Notes | |
Kloot (developer) 2013-02-02 22:55 |
if possible, please test if this also happens using a 92.0 test build |
abma (administrator) 2013-04-03 03:23 |
no feedback, we assume it's fixed. if not, please reopen |
hokomoko (developer) 2015-09-07 15:26 |
Apparently still occurs in 100.0 |
abma (administrator) 2015-09-09 20:39 |
infolog.txt? replay? |
burp (reporter) 2015-09-09 22:36 Last edited: 2015-09-09 22:40 |
Linux: 20150906_153744_LongCat32_100.infolog.txt (from replaying demo) 20150906_153744_LongCat32_100.sdf |
Rafal99 (reporter) 2015-09-09 22:42 |
Windows 7, 64-bit: 20150906_153744_LongCat32_100_Windows.sdf |
abma (administrator) 2015-09-10 00:12 Last edited: 2015-09-10 00:14 |
suspicious when run with spring-headless:
[f=0000037] Calling Garbage Collector on excessive LuaUI memory usage: 102.5 MB
[f=0000332] Calling Garbage Collector on excessive LuaUI memory usage: 104.8 MB
[f=0000665] Calling Garbage Collector on excessive LuaUI memory usage: 105.1 MB
[f=0000991] Calling Garbage Collector on excessive LuaUI memory usage: 105.3 MB |
abma (administrator) 2015-09-10 00:16 Last edited: 2015-09-10 00:16 |
couldn't find an issue with valgrind/signan enabled! |
hokomoko (developer) 2015-09-10 00:16 |
map itself doesn't have lua, I guess it's because it's huge? |
abma (administrator) 2015-09-10 00:19 Last edited: 2015-09-10 00:21 |
http://api.springfiles.com/?springname=LongCat32
no, contents are:
maps/LongCat32.smf
maps/.LongCat32.smd.swp
maps/LongCat32.smd
maps/LongCat32.smt
map is 32x6 |
abma (administrator) 2015-09-17 11:42 |
the usual cases don't seem to be the cause (no errors with valgrind/signan), so debugging with sync-debug builds is required: https://springrts.com/wiki/Debugging_sync_errors |
abma (administrator) 2015-09-23 20:48 Last edited: 2015-09-23 20:49 |
tested locally with linux64 as server and spring-headless on windows. instant desyncs when giving commander a mex build order. used 100.0.1-207-g313b5e5
syncdebug-server.log:
Server: #4 SpringApp::Run() [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/System/SpringApp.cpp:978]
Server: === Backtrace 18 ===
Server: #0 void Sync::AssertDebugger<unsigned int>(unsigned int const&, char const*) [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/System/Sync/SyncedPrimitiveBase.h:31]
Server: #1 CGame::ClientReadNet() [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/Net/NetCommands.cpp:511]
Server: #2 CGame::Update() [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/Game/Game.cpp:1005]
Server: #3 SpringApp::Update() [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/System/SpringApp.cpp:942]
Server: #4 SpringApp::Run() [/home/buildbot/slave/linux-static-x64/build/build/syncdebug/../../rts/System/SpringApp.cpp:978]
Server: Done! |
abma (administrator) 2015-09-23 22:09 |
another try with MAX_STACK=10:
Server: 0x42E52C94/ 1.14587067e+02 instead of 0x44CA0000/ 1.61600000e+03, frame 000381, backtrace 1 in "copyfloat"
Server: === Backtrace 1 ===
Server: #0 Sync::Assert(void const*, unsigned int, char const*) [/home/abma/dev/spring/develop/rts/System/Sync/SyncedPrimitiveBase.h:45]
Server: #1 void Sync::Assert<float>(float const&, char const*) [/home/abma/dev/spring/develop/rts/System/Sync/SyncedPrimitiveBase.h:60]
Server: #2 SyncedPrimitive<float>::Sync(char const*) [/home/abma/dev/spring/develop/rts/System/Sync/SyncedPrimitive.h:43]
Server: #3 SyncedPrimitive<float>::SyncedPrimitive(float) [/home/abma/dev/spring/develop/rts/System/Sync/SyncedPrimitive.h:84]
Server: #4 SyncedFloat3::SyncedFloat3(float3 const&) [/home/abma/dev/spring/develop/rts/System/Sync/SyncedFloat3.h:43]
Server: #5 CGroundMoveType::GetNewPath() [/home/abma/dev/spring/develop/rts/Sim/MoveTypes/GroundMoveType.cpp:1247]
Server: #6 CGroundMoveType::StartEngine(bool) [/home/abma/dev/spring/develop/rts/Sim/MoveTypes/GroundMoveType.cpp:1446]
Server: #7 CGroundMoveType::ReRequestPath(bool) [/home/abma/dev/spring/develop/rts/Sim/MoveTypes/GroundMoveType.cpp:1262]
Server: #8 CGroundMoveType::StartMoving(float3, float) [/home/abma/dev/spring/develop/rts/Sim/MoveTypes/GroundMoveType.cpp:430]
Server: #9 CMobileCAI::SetGoal(float3 const&, float3 const&, float) [/home/abma/dev/spring/develop/rts/Sim/Units/CommandAI/MobileCAI.cpp:898] |
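The sync debugger line in this note reports each mismatching value twice: as a raw 32-bit pattern and as the float that pattern encodes. A minimal C++ sketch of that decoding, assuming IEEE-754 single precision; `bitsToFloat` is a hypothetical helper name and not part of the Spring sources:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as a single-precision float.
// memcpy is used instead of a pointer cast to avoid undefined behaviour.
float bitsToFloat(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this helper, `bitsToFloat(0x44CA0000u)` is exactly 1616.0f and `bitsToFloat(0x42E52C94u)` is roughly 114.587, which matches the two values printed in the log line.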
abma (administrator) 2015-09-23 22:35 Last edited: 2015-09-23 22:36 |
can't reproduce in BA, very likely related to the zk lua code which sets mex positions on this map. not all mex positions cause the desync to happen! @zkdevs: can you please print all mex positions generated by the gadget (?) on this map on windows and linux? |
Rafal99 (reporter) 2015-09-26 18:33 |
I added a widget (dbg_print_metal_spots.lua) that prints position and metal value of all metal spots generated by ZK gadget, and also prints calculated income of each created mex. The output of the widget on my system (Win 7 x64) can be seen in widget_output_win7.txt |
Rafal99 (reporter) 2015-09-27 02:08 Last edited: 2015-09-27 02:44 |
Seems Lamer added the output for linux64. Mex incomes differ starting from the third one, and metal spots are different in the array and it is more than just their order. This means the issue is in ZK metal spot finder, but the question is how could the Lua code produce different results for Windows players and Linux players without giving different results for everyone? |
Google_Frog (reporter) 2015-09-27 03:29 |
My suggestion is to look for a bug in Spring.GetGroundInfo. |
hokomoko (developer) 2015-09-27 12:14 Last edited: 2015-09-27 12:14 |
Also check the gadget for any calculations involving NaNs, division by 0 and stuff with math.huge |
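The kind of check suggested here can be sketched in C++ as follows. This is an illustration only: `isFiniteNumber` and `safeDivide` are hypothetical names, the gadget itself is written in Lua, and Lua's `math.huge` corresponds to infinity in this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true only for ordinary finite values,
// i.e. neither NaN nor +/-infinity (Lua's math.huge).
bool isFiniteNumber(float value) {
    return std::isfinite(value);
}

// Hypothetical helper: divide, but report failure instead of
// letting inf or NaN leak into later calculations.
bool safeDivide(float numerator, float denominator, float& result) {
    if (denominator == 0.0f) {
        return false;  // would produce inf or NaN
    }
    const float quotient = numerator / denominator;
    if (!std::isfinite(quotient)) {
        return false;  // overflow, or NaN/inf inputs
    }
    result = quotient;
    return true;
}
```

Logging every value for which `isFiniteNumber` returns false on both platforms would show whether such values appear in the calculation at all.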
abma (administrator) 2015-09-27 23:33 Last edited: 2015-09-27 23:40 |
@hokomoko: https://springrts.com/mantis/view.php?id=3436#c15132
@others: https://github.com/ZeroK-RTS/Zero-K/blob/master/LuaRules/Gadgets/mex_spot_finder.lua#L386
https://springrts.com/wiki/Debugging_sync_errors : "If you're a game developer, please be aware that Lua may be a source of desyncs. E.g. table iteration using pairs when you have tables, coroutines, or functions as keys is not a sync-safe operation, see mantis 0001050 for example of such a desync."
uniqueGroups is a table! https://github.com/ZeroK-RTS/Zero-K/blob/master/LuaRules/Gadgets/mex_spot_finder.lua#L290 |
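The wiki's general warning is about iteration order that depends on object identity, which can differ between machines. A C++ analogy, not the Lua gadget code: iterate in an order derived from a stable id instead of from addresses. `Group` and `inSyncedOrder` are hypothetical names used only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a metal-spot group; 'id' is assumed to be
// assigned identically on every client.
struct Group {
    int id;
    float metal;
};

// Return the groups sorted by their stable id, so the result does not
// depend on memory addresses or on the order the caller collected them in.
std::vector<const Group*> inSyncedOrder(std::vector<const Group*> groups) {
    std::sort(groups.begin(), groups.end(),
              [](const Group* a, const Group* b) { return a->id < b->id; });
    return groups;
}
```

Two clients that collected the same groups in different orders get the same sequence back, which is the property synced code needs.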
abma (administrator) 2015-09-28 14:51 Last edited: 2015-09-28 18:12 |
for reference / further discussion: https://springrts.com/phpbb/viewtopic.php?f=23&t=33906 upstream bug report: https://github.com/ZeroK-RTS/Zero-K/issues/1069 |
hokomoko (developer) 2015-10-08 17:24 |
Apparently this is not due to pairs. https://github.com/spring/spring/blob/develop/rts/lib/lua/include/luaconf.h#L532 The conversion from a number to its string representation is based on sprintf, which can produce different strings on windows and linux for the same float value (e+14 vs. e+014). |
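The exponent-width difference can be illustrated with a small C++ sketch. It is not the committed fix: it only post-processes printf-style output so that `e+014` and `e+14` compare equal. `normalizeExponent` and `formatNumber` are hypothetical names, and the `"%.14g"` format is the stock Lua 5.1 default; that Spring's modified luaconf.h uses the same format is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: strip leading zeros from the exponent until at most
// two digits remain, so "1e+014" and "1e+14" end up as the same string.
std::string normalizeExponent(std::string text) {
    const std::size_t ePos = text.find_first_of("eE");
    if (ePos == std::string::npos) {
        return text;  // no exponent part, nothing to do
    }
    std::size_t digits = ePos + 1;
    if (digits < text.size() && (text[digits] == '+' || text[digits] == '-')) {
        ++digits;  // skip the exponent sign
    }
    while (text.size() - digits > 2 && text[digits] == '0') {
        text.erase(digits, 1);
    }
    return text;
}

// Hypothetical helper: format with "%.14g" (assumed format, see lead-in)
// and normalize the exponent width afterwards.
std::string formatNumber(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.14g", value);
    return normalizeExponent(buffer);
}
```

This only removes the exponent-width difference; any difference in how the two C libraries round the mantissa digits would still produce different strings.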
hokomoko (developer) 2015-10-08 19:39 |
Fix 2e57d0d3c0c55abbd6ec71b6fe8b2fc0b3f1fbff committed to develop branch: Added lexical_cast for number->string conversions Fix 0003436, repo: spring changeset id: 5684 |
hokomoko (developer) 2015-10-15 15:27 |
While the issue of e+014 vs. e+14 can be solved, I'm afraid other rounding issues may be a problem. I'm starting to think this isn't safely fixable and float->string conversion should be warned against in documentation |
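One way to sidestep decimal formatting entirely, when synced code needs a textual form of a float, is to print its bit pattern as an integer. A minimal sketch under the assumption of IEEE-754 single precision; `floatToHexKey` is a hypothetical name and not something the engine provides:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: encode a float as the 8-digit hex form of its bits.
// Integer formatting involves no decimal rounding and no exponent field,
// so bit-identical floats always yield identical strings.
std::string floatToHexKey(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    char buffer[9];  // 8 hex digits plus the terminating NUL
    std::snprintf(buffer, sizeof buffer, "%08" PRIX32, bits);
    return std::string(buffer);
}
```

The trade-off is readability, and values that compare equal but differ in bits (0.0f and -0.0f) get different keys.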
Issue History | |||
Date Modified | Username | Field | Change |
---|---|---|---|
2013-02-02 21:01 | burp | New Issue | |
2013-02-02 22:55 | Kloot | Note Added: 0009691 | |
2013-02-02 22:55 | Kloot | Assigned To | => abma |
2013-02-02 22:55 | Kloot | Status | new => feedback |
2013-02-02 22:55 | Kloot | Assigned To | abma => |
2013-04-03 03:23 | abma | Note Added: 0010374 | |
2013-04-03 03:23 | abma | Status | feedback => resolved |
2013-04-03 03:23 | abma | Resolution | open => fixed |
2013-04-03 03:23 | abma | Assigned To | => abma |
2015-09-07 15:26 | hokomoko | Assigned To | abma => |
2015-09-07 15:26 | hokomoko | Note Added: 0015124 | |
2015-09-07 15:26 | hokomoko | Status | resolved => feedback |
2015-09-07 15:26 | hokomoko | Resolution | fixed => reopened |
2015-09-07 15:26 | hokomoko | Assigned To | => hokomoko |
2015-09-07 15:26 | hokomoko | Status | feedback => new |
2015-09-07 15:27 | hokomoko | Severity | block => major |
2015-09-07 15:27 | hokomoko | Status | new => assigned |
2015-09-07 15:27 | hokomoko | Assigned To | hokomoko => |
2015-09-09 20:39 | abma | Note Added: 0015128 | |
2015-09-09 20:40 | abma | Status | assigned => feedback |
2015-09-09 22:34 | burp | File Added: 20150906_153744_LongCat32_100.infolog.txt | |
2015-09-09 22:35 | burp | File Added: 20150906_153744_LongCat32_100.sdf | |
2015-09-09 22:36 | burp | Note Added: 0015129 | |
2015-09-09 22:36 | burp | Status | feedback => new |
2015-09-09 22:40 | burp | Note Edited: 0015129 | View Revisions |
2015-09-09 22:40 | burp | Note Edited: 0015129 | View Revisions |
2015-09-09 22:41 | Rafal99 | File Added: 20150906_153744_LongCat32_100_Windows.sdf | |
2015-09-09 22:42 | Rafal99 | Note Added: 0015130 | |
2015-09-09 23:16 | abma | Severity | major => crash |
2015-09-09 23:16 | abma | Product Version | 91.0 => 100.0 |
2015-09-09 23:16 | abma | Target Version | => 101.0 |
2015-09-09 23:35 | abma | Priority | high => normal |
2015-09-10 00:12 | abma | Note Added: 0015131 | |
2015-09-10 00:14 | abma | Note Edited: 0015131 | View Revisions |
2015-09-10 00:16 | abma | Note Added: 0015132 | |
2015-09-10 00:16 | abma | Note Edited: 0015132 | View Revisions |
2015-09-10 00:16 | hokomoko | Note Added: 0015133 | |
2015-09-10 00:19 | abma | Note Added: 0015134 | |
2015-09-10 00:21 | abma | Note Edited: 0015134 | View Revisions |
2015-09-17 11:42 | abma | Note Added: 0015189 | |
2015-09-17 11:46 | abma | Summary | Desync between linux and windows players => Desync between linux and windows players on LongCat32 |
2015-09-23 20:48 | abma | Note Added: 0015214 | |
2015-09-23 20:49 | abma | Note Edited: 0015214 | View Revisions |
2015-09-23 20:52 | abma | File Added: trace1-client.log | |
2015-09-23 20:52 | abma | File Added: trace0.log | |
2015-09-23 20:53 | abma | File Added: 20150923_204703_LongCat32_100.0.1-207-g313b5e5 develop.sdf | |
2015-09-23 20:53 | abma | File Added: syncdebug-server.log | |
2015-09-23 22:09 | abma | Note Added: 0015215 | |
2015-09-23 22:35 | abma | Note Added: 0015216 | |
2015-09-23 22:36 | abma | Note Edited: 0015216 | View Revisions |
2015-09-23 22:45 | abma | Summary | Desync between linux and windows players on LongCat32 => Desync between linux and windows players on zk/LongCat32 when placing a mex on a specific spot |
2015-09-24 23:35 | abma | Status | new => feedback |
2015-09-26 18:33 | Rafal99 | Note Added: 0015247 | |
2015-09-26 18:34 | Rafal99 | File Added: dbg_print_metal_spots.lua | |
2015-09-26 18:34 | Rafal99 | File Added: widget_output_win7.txt | |
2015-09-27 01:31 | lamer | File Added: widget_output_linux64.txt | |
2015-09-27 02:08 | Rafal99 | Note Added: 0015248 | |
2015-09-27 02:17 | Rafal99 | Note Edited: 0015248 | View Revisions |
2015-09-27 02:34 | Rafal99 | Note Edited: 0015248 | View Revisions |
2015-09-27 02:35 | Rafal99 | Note Edited: 0015248 | View Revisions |
2015-09-27 02:42 | Rafal99 | Note Edited: 0015248 | View Revisions |
2015-09-27 02:44 | Rafal99 | Note Edited: 0015248 | View Revisions |
2015-09-27 03:29 | Google_Frog | Note Added: 0015249 | |
2015-09-27 12:14 | hokomoko | Note Added: 0015250 | |
2015-09-27 12:14 | hokomoko | Note Edited: 0015250 | View Revisions |
2015-09-27 23:33 | abma | Note Added: 0015251 | |
2015-09-27 23:33 | abma | Status | feedback => resolved |
2015-09-27 23:33 | abma | Resolution | reopened => no change required |
2015-09-27 23:33 | abma | Assigned To | => abma |
2015-09-27 23:34 | abma | Note Edited: 0015251 | View Revisions |
2015-09-27 23:40 | abma | Note Edited: 0015251 | View Revisions |
2015-09-28 01:28 | abma | Relationship added | related to 0001050 |
2015-09-28 14:51 | abma | Note Added: 0015252 | |
2015-09-28 18:12 | abma | Note Edited: 0015252 | View Revisions |
2015-10-08 17:24 | hokomoko | Assigned To | abma => hokomoko |
2015-10-08 17:24 | hokomoko | Note Added: 0015286 | |
2015-10-08 17:24 | hokomoko | Status | resolved => feedback |
2015-10-08 17:24 | hokomoko | Resolution | no change required => reopened |
2015-10-08 17:24 | hokomoko | Status | feedback => assigned |
2015-10-08 19:39 | hokomoko | Changeset attached | => spring develop 2e57d0d3 |
2015-10-08 19:39 | hokomoko | Note Added: 0015288 | |
2015-10-08 19:39 | hokomoko | Status | assigned => resolved |
2015-10-15 15:27 | hokomoko | Note Added: 0015302 | |
2015-10-15 15:27 | hokomoko | Status | resolved => feedback |
2015-10-16 18:50 | hokomoko | Changeset attached | => spring develop 6b27c464 |
2015-10-16 18:51 | hokomoko | Summary | Desync between linux and windows players on zk/LongCat32 when placing a mex on a specific spot => Desync from different float printing on linux and windows |
2015-10-19 11:20 | hokomoko | Status | feedback => resolved |
2015-10-19 11:20 | hokomoko | Resolution | reopened => fixed |