2019-08-19 22:46 CEST

ID: 0001050
Project: Spring engine
Category: Unit Scripting
View Status: public
Last Update: 2015-09-28 01:28
Reporter: quantum
Assigned To: Kloot
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: no change required
Product Version: 0.76b1
Target Version:
Fixed in Version:
Summary: 0001050: Desync while using the jumpjets script.
Description:
Steps to reproduce:
 - Get CA r2757 (http://spring.jobjol.nl/show_file.php?id=1427),
 - start a game and give yourself a group of Pyros (corpyro),
 - queue a lot of jumps.
The game will desync pretty soon.
Additional Information: The other units with jumpjets (corfast, noruas) desync as well.
We tried CA revisions 2879, 2410 and 2575.
Tags: No tags attached.
Checked infolog.txt for lua Errors
Attached Files

Relationships
related to 0003436 (resolved, hokomoko): Desync from different float printing on linux and windows

Notes

~0002602

quantum (reporter)

Fixed link: http://spring.jobjol.nl/show_file.php?id=1427

~0002603

Kloot (developer)

Traced to the following portion of CUnit::Update() (probably moveType->Update(), I'll know more in a minute):

    if (beingBuilt) {
        return;
    }

    const bool oldInAir = inAir;
    const bool oldInWater = inWater;

    moveType->Update();

    inAir = ((pos.y + height) >= 0.0f);
    inWater = (pos.y <= 0.0f);

    if (inAir != oldInAir) {
        if (inAir) {
            eventHandler.UnitEnteredAir(this);
        } else {
            eventHandler.UnitLeftAir(this);
        }
    }
    if (inWater != oldInWater) {
        if (inWater) {
            eventHandler.UnitEnteredWater(this);
        } else {
            eventHandler.UnitLeftWater(this);
        }
    }

~0002604

Kloot (developer)

moveType->Update() does indeed cause the desync, which in this case means CScriptMoveType::Update() (the Pyros are under movectrl when jumping).

~0002605

Kloot (developer)

... and to be more precise, this part of it:

    owner->UpdateMidPos();

    // don't need the rest if the pos hasn't changed
    if (oldPos == owner->pos) {
        CheckNotify();
        return;
    }

(the expression oldPos == owner->pos evaluates to true on one machine and to false on another)

~0002606

Kloot (developer)

owner->pos seems to diverge somewhere prior to the movetype update; it is already desynced when the function gets called.

~0002607

Kloot (developer)

I have now also had desyncs resulting from this part of CGame::SimFrame():

    if (luaRules) { luaRules->GameFrame(gs->frameNum); }

That suggests the jumpjet script itself is behaving non-deterministically, but either way it's strange.

~0002608

Kloot (developer)

Last edited: 2008-09-22 00:26

Possible untested explanation: unit_jumpjets.lua maintains a table of coroutines indexed by the coroutine objects themselves, so the iteration order in UpdateCoroutines() will differ per machine, since those objects have different addresses in their respective Lua VM instances. As a result, unit positions are updated differently across clients.
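
The effect described above can be illustrated outside Lua. The following is only a rough C++ analogy, not code from the engine or from unit_jumpjets.lua: Job, JobList, OrderByAddress and OrderByID are hypothetical names introduced for the illustration. A Lua table hashes object keys rather than sorting them, but the consequence is the same: when the key is an object's address, the visit order depends on memory layout, whereas a sequential ID gives the same order everywhere.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for one running jump script (not an engine type).
struct Job {
    std::string name;
};

using JobList = std::vector<std::unique_ptr<Job>>;

// Jobs keyed by their own address: std::map visits keys in pointer order,
// which follows wherever the allocator happened to place each object.
std::vector<std::string> OrderByAddress(const JobList& jobs) {
    std::map<const Job*, bool> byAddress;
    for (const auto& job : jobs) {
        byAddress[job.get()] = true;
    }

    std::vector<std::string> order;
    for (const auto& entry : byAddress) {
        order.push_back(entry.first->name);
    }
    return order;
}

// Jobs keyed by a sequential ID: the visit order is the registration order,
// independent of memory layout.
std::vector<std::string> OrderByID(const JobList& jobs) {
    std::map<std::size_t, const Job*> byID;
    std::size_t nextID = 1;
    for (const auto& job : jobs) {
        byID[nextID++] = job.get();
    }

    std::vector<std::string> order;
    for (const auto& entry : byID) {
        order.push_back(entry.second->name);
    }
    return order;
}
```

OrderByID always returns the names in registration order. OrderByAddress returns the same set of names, but in an order that may change between machines or runs, which is enough to send synced code down different paths.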

~0002609

Kloot (developer)

Last edited: 2008-09-22 01:01

I quickly rewrote unit_jumpjets.lua as follows:


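-- coroutines live in an array indexed by a sequential ID,
-- so they are visited in the same order on every client
local coroutines = {}
local coroutineID = 1

local function StartScript(fn)
    local co = coroutine.create(fn)

    -- each entry is {coroutine, number of updates left to sleep}
    coroutines[coroutineID] = {co, 0}
    coroutineID = coroutineID + 1
end

local function UpdateCoroutines()
    for idx = 1, #coroutines, 1 do
        if (coroutines[idx][1] ~= nil) then
            -- declared local so they do not leak into the global table
            local co = coroutines[idx][1]
            local sleepLeft = coroutines[idx][2]

            if (coroutine.status(co) == "dead") then
                coroutines[idx][1] = nil
            elseif (sleepLeft <= 0) then
                -- the value passed to yield becomes the new sleep counter
                local success, sleep = assert(coroutine.resume(co))
                coroutines[idx][2] = sleep or 0
            else
                coroutines[idx][2] = sleepLeft - 1
            end
        end
    end
end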


Then I ordered 50 corpyros to jump in unison repeatedly; no further desyncs happened. ;)

~0002610

imbaczek (reporter)

good job. too bad this doesn't lead anywhere with other desyncs.

~0002611

quantum (reporter)

Nice! Works now :) Wow, so pairs is unsafe with tables, coroutines, functions and userdata as keys. I'll need to check my other scripts.

~0002612

Kloot (developer)

The execution order of synced code in general has to be well-defined for every client, since the engine can otherwise end up taking different code paths and then things will spiral out of control very quickly (as was the problem here).

imbaczek: in light of this the other desyncs might not even be Spring's fault, but without anything else to go on that's just speculation.

Issue History
Date Modified Username Field Change
2008-09-21 20:25 quantum New Issue
2008-09-21 20:26 quantum Note Added: 0002602
2008-09-21 22:06 Kloot Status new => assigned
2008-09-21 22:06 Kloot Assigned To => Kloot
2008-09-21 22:36 Kloot Note Added: 0002603
2008-09-21 22:42 Kloot Note Added: 0002604
2008-09-21 22:52 Kloot Note Added: 0002605
2008-09-21 23:17 Kloot Note Added: 0002606
2008-09-22 00:12 Kloot Note Added: 0002607
2008-09-22 00:22 Kloot Note Added: 0002608
2008-09-22 00:24 Kloot Note Edited: 0002608
2008-09-22 00:26 Kloot Note Edited: 0002608
2008-09-22 00:44 Kloot Note Added: 0002609
2008-09-22 00:45 Kloot Status assigned => feedback
2008-09-22 00:46 Kloot Note Edited: 0002609
2008-09-22 01:01 Kloot Note Edited: 0002609
2008-09-22 01:06 imbaczek Note Added: 0002610
2008-09-22 01:45 quantum Note Added: 0002611
2008-09-22 09:18 Kloot Note Added: 0002612
2008-09-22 09:20 Kloot Status feedback => resolved
2008-09-22 09:20 Kloot Resolution open => no change required
2015-09-28 01:28 abma Relationship added related to 0003436