View Issue Details

ID: 0001050
Project: Spring engine
Category: Unit Scripting
View Status: public
Last Update: 2015-09-28 01:28
Reporter: quantum
Assigned To: Kloot
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: no change required
Product Version: 0.76b1
Summary: 0001050: Desync while using the jumpjets script.
Description:
Steps to reproduce:
 - Get CA r2757 (http://spring.jobjol.nl/show_file.php?id=1427),
 - start a game and give yourself a group of Pyros (corpyro),
 - queue a lot of jumps.
The game will desync pretty soon.
Additional Information: The other units with jumpjets (corfast, noruas) desync as well.
We tried CA revisions 2879, 2410 and 2575.
Tags: No tags attached.
Checked infolog.txt for Errors

Relationships

related to 0003436 (resolved, hokomoko): Desync from different float printing on linux and windows

Activities

quantum

2008-09-21 20:26

reporter   ~0002602

Fixed link: http://spring.jobjol.nl/show_file.php?id=1427

Kloot

2008-09-21 22:36

developer   ~0002603

Traced to the following portion of CUnit::Update() (probably moveType->Update(), I'll know more in a minute):

    if (beingBuilt) {
        return;
    }

    const bool oldInAir = inAir;
    const bool oldInWater = inWater;

    moveType->Update();

    inAir = ((pos.y + height) >= 0.0f);
    inWater = (pos.y <= 0.0f);

    if (inAir != oldInAir) {
        if (inAir) {
            eventHandler.UnitEnteredAir(this);
        } else {
            eventHandler.UnitLeftAir(this);
        }
    }
    if (inWater != oldInWater) {
        if (inWater) {
            eventHandler.UnitEnteredWater(this);
        } else {
            eventHandler.UnitLeftWater(this);
        }
    }

Kloot

2008-09-21 22:42

developer   ~0002604

moveType->Update() indeed causes the desync, which in this case means CScriptMoveType::Update() (the Pyros are under movectrl when jumping).

Kloot

2008-09-21 22:52

developer   ~0002605

... and to be more precise, this part of it:

    owner->UpdateMidPos();

    // don't need the rest if the pos hasn't changed
    if (oldPos == owner->pos) {
        CheckNotify();
        return;
    }

(the expression oldPos == owner->pos evaluates to true on one machine and to false on another)

Kloot

2008-09-21 23:17

developer   ~0002606

owner->pos seems to diverge somewhere prior to the movetype update; it is already desynced when the function gets called.

Kloot

2008-09-22 00:12

developer   ~0002607

I have now also had desyncs resulting from this part of CGame::SimFrame():

    if (luaRules) { luaRules->GameFrame(gs->frameNum); }

That suggests the jumpjet script itself is behaving non-deterministically, but either way it's strange.

Kloot

2008-09-22 00:22

developer   ~0002608

Last edited: 2008-09-22 00:26

Possible untested explanation: unit_jumpjets.lua maintains a table of coroutines indexed by coroutine objects, so the iteration order in UpdateCoroutines() will be different per machine since those objects have different addresses in their respective Lua VM instances ==> unit positions are updated differently across clients.
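
A minimal sketch of the effect described above, not taken from unit_jumpjets.lua (the names noop, byObject, created, visitOrder and arrayOrder are made up for illustration). It assumes the reference Lua implementation, where object keys are hashed by address; the Lua manual only says that the traversal order of pairs() is not specified.

```lua
-- Illustrative only: none of these names come from unit_jumpjets.lua.
local function noop() end

-- Variant A: a table keyed by the coroutine objects themselves.
local byObject = {}
local created = {}
for i = 1, 3 do
    local co = coroutine.create(noop)
    byObject[co] = i   -- the value records the creation order
    created[i] = co    -- array copy, used by variant B
end

-- pairs() visits all three entries, but in an unspecified order. With
-- object keys that order depends on where each object was allocated,
-- so it can differ between two clients running the same script.
local visitOrder = {}
for co, i in pairs(byObject) do
    visitOrder[#visitOrder + 1] = i
end

-- Variant B: an array walked with a numeric for loop always yields
-- creation order, whatever the object addresses are.
local arrayOrder = {}
for idx = 1, #created do
    arrayOrder[idx] = byObject[created[idx]]
end
```

Only variant B gives every client the same sequence of updates; variant A may happen to match on one run and differ on the next.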

Kloot

2008-09-22 00:44

developer   ~0002609

Last edited: 2008-09-22 01:01

I quickly rewrote unit_jumpjets.lua as follows:


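-- Coroutines live in an array indexed by a sequential ID, so they are
-- always visited in creation order, identically on every client.
local coroutines = {}
local coroutineID = 1

local function StartScript(fn)
    local co = coroutine.create(fn)
    
    -- each entry is {coroutine, number of updates left to sleep}
    coroutines[coroutineID] = {co, 0}
    coroutineID = coroutineID + 1
end

local function UpdateCoroutines()
    for idx = 1, #coroutines, 1 do
        if (coroutines[idx][1] ~= nil) then
            -- declared local so they do not leak into the global table
            local co = coroutines[idx][1]
            local sleepLeft = coroutines[idx][2]
    
            if (coroutine.status(co) == "dead") then
                -- finished: clear the coroutine but keep the slot
                coroutines[idx][1] = nil
            elseif (sleepLeft <= 0) then
                -- the yielded value is the number of updates to sleep
                local success, sleep = assert(coroutine.resume(co))
                coroutines[idx][2] = sleep or 0
            else
                coroutines[idx][2] = sleepLeft - 1
            end
        end
    end
end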


Then I ordered 50 corpyros to jump in unison repeatedly; no further desyncs happened. ;)

imbaczek

2008-09-22 01:06

reporter   ~0002610

good job. too bad this doesn't lead anywhere with other desyncs.

quantum

2008-09-22 01:45

reporter   ~0002611

Nice! Works now :) Wow, so iterating with pairs is unsafe in synced code when the table keys are tables, coroutines, functions or userdata. I'll need to check my other scripts.

Kloot

2008-09-22 09:18

developer   ~0002612

The execution order of synced code in general has to be well-defined for every client, since the engine can otherwise end up taking different code paths and then things will spiral out of control very quickly (as was the problem here).
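
For scripts that still need lookup by object, one possible approach is to keep a parallel array of keys and iterate over that instead of using pairs(). The helper below is a hypothetical sketch, not part of Spring or CA (NewOrderedRegistry, Add, Remove and Iterate are invented names). It assumes values are never nil, that entries are added in the same order on every client, and that nothing is removed while an iteration is in progress.

```lua
-- Hypothetical helper: keyed lookup plus a well-defined iteration order.
local function NewOrderedRegistry()
    local order = {}   -- array of keys in insertion order
    local data = {}    -- key -> value lookup (values must be non-nil)
    local reg = {}

    function reg.Add(key, value)
        if data[key] == nil then
            order[#order + 1] = key
        end
        data[key] = value
    end

    function reg.Remove(key)
        if data[key] == nil then return end
        data[key] = nil
        for i = 1, #order do
            if order[i] == key then
                table.remove(order, i)
                break
            end
        end
    end

    -- Iterator that visits entries in insertion order, never via pairs().
    function reg.Iterate()
        local i = 0
        return function()
            i = i + 1
            local key = order[i]
            if key ~= nil then
                return key, data[key]
            end
        end
    end

    return reg
end

-- Usage: the keys are plain tables standing in for coroutines or userdata.
local reg = NewOrderedRegistry()
local a, b, c = {}, {}, {}
reg.Add(a, "first")
reg.Add(b, "second")
reg.Add(c, "third")
reg.Remove(b)

local seen = {}
for key, value in reg.Iterate() do
    seen[#seen + 1] = value
end
-- seen now holds "first", "third" in that order
```

Removal is linear in the number of entries, which is likely acceptable for small sets such as active jumps but would need rethinking for large ones.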

imbaczek: in light of this the other desyncs might not even be Spring's fault, but without anything else to go on that's just speculation.

Issue History

Date Modified Username Field Change
2008-09-21 20:25 quantum New Issue
2008-09-21 20:26 quantum Note Added: 0002602
2008-09-21 22:06 Kloot Status new => assigned
2008-09-21 22:06 Kloot Assigned To => Kloot
2008-09-21 22:36 Kloot Note Added: 0002603
2008-09-21 22:42 Kloot Note Added: 0002604
2008-09-21 22:52 Kloot Note Added: 0002605
2008-09-21 23:17 Kloot Note Added: 0002606
2008-09-22 00:12 Kloot Note Added: 0002607
2008-09-22 00:22 Kloot Note Added: 0002608
2008-09-22 00:24 Kloot Note Edited: 0002608
2008-09-22 00:26 Kloot Note Edited: 0002608
2008-09-22 00:44 Kloot Note Added: 0002609
2008-09-22 00:45 Kloot Status assigned => feedback
2008-09-22 00:46 Kloot Note Edited: 0002609
2008-09-22 01:01 Kloot Note Edited: 0002609
2008-09-22 01:06 imbaczek Note Added: 0002610
2008-09-22 01:45 quantum Note Added: 0002611
2008-09-22 09:18 Kloot Note Added: 0002612
2008-09-22 09:20 Kloot Status feedback => resolved
2008-09-22 09:20 Kloot Resolution open => no change required
2015-09-28 01:28 abma Relationship added related to 0003436