View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update | ||||
---|---|---|---|---|---|---|---|---|---|
0003339 | Spring engine | General | public | 2012-11-23 09:50 | 2012-11-29 20:22 | ||||
Reporter | abma | ||||||||
Assigned To | abma | ||||||||
Priority | normal | Severity | crash | Reproducibility | always | ||||
Status | resolved | Resolution | fixed | ||||||
Product Version | 91.0.1+git | ||||||||
Target Version | Fixed in Version | ||||||||
Summary | 0003339: desync in QTPFS with validation client sharing cache when no cache exists | ||||||||
Description | [f=0018960] Sync error for ValidationClient in frame 18645 (got 68a860d7, correct is 42180536) http://buildbot.springrts.com/builders/validationtests/builds/1956/steps/validation%20test_2/logs/stdio | ||||||||
Tags | No tags attached. | ||||||||
Checked infolog.txt for Errors | |||||||||
Attached Files | grep.log.gz, QTPFS-DesyncFix.diff | ||||||||
abma (administrator) 2012-11-23 09:52 |
valgrind run with RAI needed i guess... |
abma (administrator) 2012-11-27 14:24 Last edited: 2012-11-27 14:34 |
ok, grepped through the output: the first desync I found was in run 1556, the second in run 1880.
http://buildbot.springrts.com/builders/validationtests/builds/1556
https://github.com/spring/spring/commit/bf0cbf898b03219e8e9f8d730ab4eafae2710e3e
http://buildbot.springrts.com/builders/validationtests/builds/1880
https://github.com/spring/spring/commit/c14af2e31c3322fbed43e266ce934bd1a929b21f
it doesn't seem to desync every time, so the cause doesn't have to be one of these commits... (it could be this one or some commit before it...) |
abma (administrator) 2012-11-27 14:32 |
added the grep output, created with:
for i in $(ls |sort -h -r); do echo $i; ( if [ -n "$(echo $i | grep bz2)" ]; then bzcat $i; else cat $i; fi )| grep "Sync error"; done >/tmp/grep.lo |
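For readers who want structured results rather than raw grep output, the same scan can be sketched in Python. This is only an illustration, not the script used in the thread: the regular expression is modelled on the single sync-error line quoted in the Description, the function names are made up, compressed files are recognised by their `.bz2` suffix, and files are visited in reverse lexicographic order rather than the `sort -h -r` order of the shell loop.

```python
import bz2
import re
from pathlib import Path

# Modelled on the line quoted in the issue description, e.g.
# [f=0018960] Sync error for ValidationClient in frame 18645 (got 68a860d7, correct is 42180536)
SYNC_ERROR = re.compile(
    r"Sync error for (?P<player>\S+) in frame (?P<frame>\d+) "
    r"\(got (?P<got>[0-9a-f]+), correct is (?P<correct>[0-9a-f]+)\)"
)


def parse_sync_error(line):
    """Return (player, frame, got, correct) for a sync-error line, else None."""
    m = SYNC_ERROR.search(line)
    if m is None:
        return None
    return m["player"], int(m["frame"]), m["got"], m["correct"]


def open_log(path):
    """Open a plain or bz2-compressed log file in text mode."""
    path = Path(path)
    if path.suffix == ".bz2":
        return bz2.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def scan_logs(directory):
    """Yield (file name, parsed sync error) for every match found in a directory."""
    for path in sorted(Path(directory).iterdir(), reverse=True):
        if not path.is_file():
            continue
        with open_log(path) as log:
            for line in log:
                hit = parse_sync_error(line)
                if hit is not None:
                    yield path.name, hit
```

Calling `list(scan_logs("."))` in a directory of downloaded buildbot logs would give one tuple per reported sync error, which makes it easier to spot the first affected run.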
abma (administrator) 2012-11-27 15:14 |
hm, 1556 is pre 91.0... so that was the already fixed desync i guess. since 1880 it seems every validation run produced a desync. |
abma (administrator) 2012-11-27 16:51 Last edited: 2012-11-27 16:52 |
the demo files from this run:
http://buildbot.springrts.com/builders/validationtests/builds/1981/steps/validation%20test_2/logs/stdio
http://abma.de/tmp/20121127_155054_Altair_Crossing-V1_91.0.1-489-g3266337_develop.sdf
http://abma.de/tmp/20121127_155056_Altair_Crossing-V1_91.0.1-489-g3266337_develop.sdf |
abma (administrator) 2012-11-27 16:53 Last edited: 2012-11-27 17:05 |
hmm, QTPFS is used... that's one commit before 1556: https://github.com/spring/spring/commit/64f869da82cae7a7b2d1c21cf462de7fff35f0ff |
abma (administrator) 2012-11-27 18:13 |
ok, valgrinded 20121127_155054 .. no error found |
abma (administrator) 2012-11-27 23:06 |
fixed one cause: https://github.com/spring/spring/commit/7e3e5d05a8b1b3f73d4a792a6e713ce98c4746c9
it seems there are still other causes left... |
abma (administrator) 2012-11-27 23:11 |
possible desync causes: std::vector & std::map need a custom compare; both are used in QTPFS |
abma (administrator) 2012-11-28 00:04 Last edited: 2012-11-28 12:19 |
vectors / maps seem to be fine, but: boost::threads pulls cmath in... that could be the cause... QTPFS also seems to be always multithreaded... that's a possible cause, too |
abma (administrator) 2012-11-28 23:44 |
to reproduce the desync:
/cheat
/give 100 armflea
move the units
=> desync |
abma (administrator) 2012-11-28 23:49 Last edited: 2012-11-28 23:55 |
syncdebug leads to:
Server: #0 Assert<float> [/var/tmp/home/dev/spring/develop/rts/System/Sync/SyncedPrimitiveBase.h:48]
Server: #1 SyncedPrimitive<float>::Sync(char const*) [/var/tmp/home/dev/spring/develop/rts/System/Sync/SyncedPrimitive.h:45]
Server: #2 SyncedPrimitive<float>::SyncedPrimitive(float) [/var/tmp/home/dev/spring/develop/rts/System/Sync/SyncedPrimitive.h:86]
Server: #3 SyncedFloat3::SyncedFloat3(float3 const&) [/var/tmp/home/dev/spring/develop/rts/System/Sync/SyncedFloat3.h:43]
Server: #4 CGroundMoveType::GetNextWayPoint() [/var/tmp/home/dev/spring/develop/rts/Sim/MoveTypes/GroundMoveType.cpp:1323 (discriminator 1)]
https://github.com/spring/spring/blob/develop/rts/Sim/Path/QTPFS/PathManager.cpp#L924 |
abma (administrator) 2012-11-28 23:52 |
with the normal pathfinder and 100 armfleas it doesn't desync, so... 99.9% QTPFS I would say... |
abma (administrator) 2012-11-29 02:12 |
hmm:
cached & cached seems to sync (can't be 100% sure...)
cached & uncached seems to desync: http://buildbot.springrts.com/builders/validationtests/builds/1995/steps/validation%20test_2/logs/stdio
uncached & uncached seems to desync: http://buildbot.springrts.com/builders/validationtests/builds/1993/steps/validation%20test_2/logs/stdio |
abma (administrator) 2012-11-29 02:31 Last edited: 2012-11-29 02:32 |
ouch... I think I got it:
cached true: [f=0000000] [PathManager] pfs-checksum: 4998f8a1, mem-footprint: 125MB
vs
cached false: [f=0000000] [PathManager] pfs-checksum: 4998f8a1, mem-footprint: 126MB
cached/uncached seems to mostly desync (in maybe 80% of the runs). cached/cached seems to always sync (I didn't see a single desync in ~10 runs or so).
as uncached/uncached desyncs too, the cache generating code seems to be broken. sooo, the MT-code there is broken for sure :-) |
abma (administrator) 2012-11-29 04:28 |
many more errors:
uncached: [f=0000000] initialized node-layer 16 (6 MB, 1351 leafs, ratio 0.005154)
cached: [f=0000000] initialized node-layer 16 (6 MB, 1 leafs, ratio 0.000004)
eieiei: spring-headless: /home/buildbot/slave/full-linux/build/rts/Sim/Path/QTPFS/PathManager.cpp:528: void QTPFS::PathManager::Serialize(const string&): Assertion `nodeTrees[i]->IsLeaf()' failed.
http://buildbot.springrts.com/builders/validationtests/builds/2001/steps/validation%20test_1/logs/stdio |
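The pfs-checksum was identical in both runs, so the mismatch only showed up in the per-layer leaf counts. A small sketch of how that comparison could be automated follows; the regular expression is modelled on the two `initialized node-layer` lines quoted in this note, and the function names are hypothetical.

```python
import re

# Modelled on lines such as:
# [f=0000000] initialized node-layer 16 (6 MB, 1351 leafs, ratio 0.005154)
NODE_LAYER = re.compile(
    r"initialized node-layer (?P<layer>\d+) "
    r"\((?P<mb>\d+) MB, (?P<leafs>\d+) leafs, ratio (?P<ratio>[0-9.]+)\)"
)


def leaf_counts(log_lines):
    """Map node-layer index -> leaf count for every matching log line."""
    counts = {}
    for line in log_lines:
        m = NODE_LAYER.search(line)
        if m:
            counts[int(m["layer"])] = int(m["leafs"])
    return counts


def mismatched_layers(log_a, log_b):
    """Return {layer: (leafs_a, leafs_b)} for layers whose leaf counts differ."""
    a, b = leaf_counts(log_a), leaf_counts(log_b)
    return {
        layer: (a.get(layer), b.get(layer))
        for layer in sorted(a.keys() | b.keys())
        if a.get(layer) != b.get(layer)
    }
```

Fed with the uncached and cached lines quoted in this note, `mismatched_layers` reports layer 16 with 1351 leafs on one side and 1 on the other, which is the discrepancy that pointed at the cache code.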
Kloot (developer) 2012-11-29 17:44 Last edited: 2012-11-29 17:45 |
sorry about this, attached patch should fix it. (in PathDefines.hpp, better bump QTPFS_CACHE_VERSION to 5 as well) |
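Why bumping a cache version helps can be shown with a generic sketch. It is not how Spring stores or checks `QTPFS_CACHE_VERSION`; the header layout (a single little-endian 32-bit version number in front of the payload) and all names below are assumptions made for illustration. The point is only that caches written by older, buggy code are rejected and regenerated once the expected version changes.

```python
import struct

CACHE_VERSION = 5  # hypothetical stand-in for QTPFS_CACHE_VERSION
_HEADER = struct.Struct("<I")  # assumed layout: one little-endian uint32 version


def pack_cache(payload, version=CACHE_VERSION):
    """Prefix the payload bytes with the cache version."""
    return _HEADER.pack(version) + payload


def unpack_cache(blob, expected_version=CACHE_VERSION):
    """Return the payload, or None if the blob is truncated or has another version."""
    if len(blob) < _HEADER.size:
        return None
    (version,) = _HEADER.unpack_from(blob)
    if version != expected_version:
        return None  # stale cache: the caller should regenerate it
    return blob[_HEADER.size:]
```

A cache packed with version 4 yields `None` when read with an expected version of 5, so every client rebuilds its data with the fixed code instead of loading old files.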
abma (administrator) 2012-11-29 17:59 |
nothing to apologize for, thanks for your patch(es)! :)
applied: https://github.com/spring/spring/commit/fb72ada7c118ae4ee90e3123573a1b8db0f133ad
the desync seems to be fixed, but an assertion still fails:
spring-headless: /home/buildbot/slave/full-linux/build/rts/Sim/Path/QTPFS/PathManager.cpp:535: void QTPFS::PathManager::Serialize(const string&): Assertion `nodeTrees[i]->IsLeaf()' failed. |
Kloot (developer) 2012-11-29 20:19 |
can't help much there (I assume it triggers because the validation client starts before the server has fully finished generating/writing the cache files) |
abma (administrator) 2012-11-29 20:22 |
yep, I think/thought that's the problem, too... |
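If the remaining assertion really comes from one process reading cache files while another is still writing them (the thread only assumes this), a common general remedy is to write to a temporary file and rename it into place. The sketch below shows that technique; it is not taken from the Spring source, and the function name is hypothetical.

```python
import os
import tempfile


def write_cache_atomically(path, data):
    """Write bytes so that readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern a second process that starts early finds either no cache file (and generates its own) or a finished one, never a partially written file.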
Date Modified | Username | Field | Change |
---|---|---|---|
2012-11-23 09:50 | abma | New Issue | |
2012-11-23 09:51 | abma | Product Version | => 91.0.1+git |
2012-11-23 09:52 | abma | Note Added: 0009368 | |
2012-11-27 14:24 | abma | Note Added: 0009396 | |
2012-11-27 14:24 | abma | Note Edited: 0009396 | View Revisions |
2012-11-27 14:32 | abma | File Added: grep.log.gz | |
2012-11-27 14:32 | abma | Note Added: 0009397 | |
2012-11-27 14:34 | abma | Note Edited: 0009396 | View Revisions |
2012-11-27 15:14 | abma | Note Added: 0009398 | |
2012-11-27 16:51 | abma | Note Added: 0009399 | |
2012-11-27 16:52 | abma | Note Edited: 0009399 | View Revisions |
2012-11-27 16:53 | abma | Note Added: 0009400 | |
2012-11-27 17:05 | abma | Note Edited: 0009400 | View Revisions |
2012-11-27 18:13 | abma | Note Added: 0009401 | |
2012-11-27 22:49 | abma | Summary | desync in validation client => desync in QTPFS |
2012-11-27 23:06 | abma | Relationship added | related to 0003071 |
2012-11-27 23:06 | abma | Note Added: 0009405 | |
2012-11-27 23:11 | abma | Note Added: 0009406 | |
2012-11-28 00:04 | abma | Note Added: 0009407 | |
2012-11-28 12:19 | abma | Note Edited: 0009407 | View Revisions |
2012-11-28 23:44 | abma | Note Added: 0009409 | |
2012-11-28 23:49 | abma | Note Added: 0009410 | |
2012-11-28 23:52 | abma | Note Added: 0009411 | |
2012-11-28 23:55 | abma | Note Edited: 0009410 | View Revisions |
2012-11-29 02:12 | abma | Note Added: 0009412 | |
2012-11-29 02:31 | abma | Note Added: 0009413 | |
2012-11-29 02:32 | abma | Note Edited: 0009413 | View Revisions |
2012-11-29 02:33 | abma | Summary | desync in QTPFS => desync in QTPFS with validation client sharing cache when no cache exists |
2012-11-29 04:28 | abma | Note Added: 0009414 | |
2012-11-29 17:43 | Kloot | File Added: QTPFS-DesyncFix.diff | |
2012-11-29 17:44 | Kloot | Note Added: 0009415 | |
2012-11-29 17:45 | Kloot | Note Edited: 0009415 | View Revisions |
2012-11-29 17:59 | abma | Note Added: 0009416 | |
2012-11-29 20:19 | Kloot | Note Added: 0009417 | |
2012-11-29 20:22 | abma | Note Added: 0009418 | |
2012-11-29 20:22 | abma | Status | new => resolved |
2012-11-29 20:22 | abma | Resolution | open => fixed |
2012-11-29 20:22 | abma | Assigned To | => abma |