Dev meeting minutes 2010-10-10 - Page 2

Dev meeting minutes 2010-10-10

Minutes of the meetings between Spring developers are archived here.
zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: Dev meeting minutes 2010-10-10

Post by zerver » 13 Oct 2010, 12:59

Pxtl absolutely has a point, but I think there is no way around it. If you are allowed to rely on changes taking effect immediately, it becomes impossible to thread the engine. One good thing about the event approach is that the necessary code changes could be applied to the engine gradually. If it were taken to the extreme, so that objects are not even allowed to modify themselves directly, it would also remove the need for data duplication.
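
A minimal sketch of the event approach, in Python and under stated assumptions: `Unit`, `Frame`, `post` and `commit` are hypothetical names for illustration, not engine code. Writers only record events during the frame; the events are applied in a fixed order at the end, so a change is not visible until the commit.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    unit_id: int
    energy: int = 100


@dataclass
class Frame:
    """Collects write events during a sim frame and applies them at the end."""
    units: dict
    pending: list = field(default_factory=list)

    def post(self, unit_id, attr, delta):
        # Writers never touch the unit directly; they only record an event.
        self.pending.append((unit_id, attr, delta))

    def commit(self):
        # Apply events in a fixed (sorted) order so the result does not
        # depend on which thread happened to post first.
        for unit_id, attr, delta in sorted(self.pending):
            unit = self.units[unit_id]
            setattr(unit, attr, getattr(unit, attr) + delta)
        self.pending.clear()


units = {1: Unit(1)}
frame = Frame(units)
frame.post(1, "energy", -30)
seen_during_frame = units[1].energy  # still the old value
frame.commit()
seen_after_commit = units[1].energy  # updated only after the commit
```

The cost is the one raised later in the thread: code that reads a value right after writing it sees the old state until the frame is committed.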

SpliFF
Posts: 1224
Joined: 28 Jul 2008, 06:51

Re: Dev meeting minutes 2010-10-10

Post by SpliFF » 13 Oct 2010, 13:54

Question: Is it proven we actually have a problem here that requires solving?

What I mean by that is it looks like most of the problems are related to splitting the Lua state across cores - but has it been shown yet that the Lua-related sim operations need more than one core?

So imagine, for argument's sake, that we are able to run LOS, path calculations, sound, AI, model loading, net code, GL and widgets in other threads. What does that leave us with in the main (synced) thread?

* simple movement
* COB/UnitScript
* collision detection
* synced and unsynced gadgets
* various other events

That isn't really all that much to tax any single CPU core.

LOS and path finding might or might not be hard to move, but the real question is: is that harder than splitting Lua? I suspect the answer is that they are easier to move because:

1.) LOS and Pathfinding are long-running computations with low-overhead results. That means they should theoretically be able to run difficult calculations and resync/lock only when they have results (or every slow update or whatever).

2.) Changes to either won't affect backwards-compatibility with games since gadgets never had direct access to the internals of these systems anyway.

So assuming for a moment this is a sensible approach, the question is how heavily Lua scripts tax the game. Presumably this shouldn't be too hard to determine if pathfinding, sound, GL and LOS can be temporarily disabled.

What do you think, is Lua really the problem here?

hoijui
Former Engine Dev
Posts: 4342
Joined: 22 Sep 2007, 09:51

Re: Dev meeting minutes 2010-10-10

Post by hoijui » 13 Oct 2010, 14:08

this is not primarily about Lua, but about making the simulation MT-able. LOS updates and path-finding are not long-running operations done over multiple frames. what you might be thinking of is the changes to the large-scale pre-calculated paths, which have to be recalculated on map deformation. this is not one of the main issues i guess, and it does not run over multiple frames either.
spring is mostly CPU limited. by today, the number of threads we have may be enough already in spring-MT (rendering, sim, sound, .. i don't know). but with more than 4 cores, this is just not gonna scale well. in a few years, with 16+ core machines being common, we have to be able to run sim in parallel to scale (and keep up with SC 5).
to me, undoubtedly needed.

zwzsg
Kernel Panic Co-Developer
Posts: 7017
Joined: 16 Nov 2004, 13:08

Re: Dev meeting minutes 2010-10-10

Post by zwzsg » 13 Oct 2010, 14:27

Ah. That seems more sensible... but doesn't it also mean that any existing code that depends on the results of an action being visible is invalidated?

That is, if a unit depletes N energy, they won't actually register the N is depleted until the next cycle, instead of the next line?
I have plenty of Lua gadget code that creates a unit then immediately sets its properties, queues, etc...

Will I be required to rewrite them all, implementing delayed task queues within the gadget, instead of simple consecutive commands, to merely stay compatible with the single threaded new version of the engine?

Because that would be mightily annoying. It would turn ten-line gadgets into 100-line monsters. All those easy gadget jobs would become a nightmarish mess. Instead of writing what you want the gadget to do, you'd first have to understand and develop a delayed-actions Lua framework before adding any relevant Lua command. The gadgets would be much harder to write, and much harder to reread.

aegis
Posts: 2456
Joined: 11 Jul 2007, 17:47

Re: Dev meeting minutes 2010-10-10

Post by aegis » 13 Oct 2010, 18:17

if all actions must be delayed, the default behavior can delay them.

zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: Dev meeting minutes 2010-10-10

Post by zerver » 13 Oct 2010, 19:05

That is, if a unit depletes N energy, they won't actually register the N is depleted until the next cycle, instead of the next line?
Many bugs like this one would arise, but they can all be worked around. For the resources, the solution might be to allow temporary borrowing, so that you can end up with a negative amount.
Units driving into each other because they both think the spot is free is a bigger issue.

Pxtl
Posts: 6112
Joined: 23 Oct 2004, 01:43

Re: Dev meeting minutes 2010-10-10

Post by Pxtl » 13 Oct 2010, 20:36

@ zwzsg

I think this is more about Lua unit-script rather than gadget code.

@ zerver

Wouldn't the algorithm I described actually work for making the change to Lua unit-script transparent to the user?

That is, start all the Lua unit-scripts with whatever parallelization desired. The moment a Lua unit-script hits any shared resource that would be ambiguous (writes are handled as messages, but reads are ambiguities) that script sleeps.

Then you process the "write" messages.

Then, you create a synced ordered list of all units. You go through each unit and check if it has a sleeping script, and if so, you let it run to completion before starting the next. You process the "write" messages in-place on the spot.

So, parallel where possible, sequential when necessary. Which is what I understood the original idea to be.

You might have to change the API of things to properly decouple reads from writes, but the *logic* of existing Lua scripts could remain as-is, since the parallelization is invisible to the developer.
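
The scheme above can be sketched in Python, assuming each unit script is modelled as a generator that yields `("write", key, delta)` or `("read", key)` requests. `run_frame`, `spender` and `cautious` are hypothetical names, and phase 1 is run in a plain loop here where a real engine would use threads.

```python
def run_frame(scripts, shared):
    """scripts maps unit_id -> generator yielding write/read requests."""
    buffered, sleeping = [], {}

    # Phase 1 (conceptually parallel): run each script until its first read.
    for unit_id, script in scripts.items():
        for request in script:
            if request[0] == "read":
                sleeping[unit_id] = (script, request[1])  # script now sleeps
                break
            buffered.append((unit_id, request[1], request[2]))

    # Phase 2: process the buffered "write" messages in a fixed order.
    for _, key, delta in sorted(buffered):
        shared[key] += delta

    # Phase 3 (sequential): resume sleepers in unit-ID order and run each
    # to completion, applying its writes on the spot.
    for unit_id in sorted(sleeping):
        script, key = sleeping[unit_id]
        try:
            request = script.send(shared[key])
            while True:
                if request[0] == "read":
                    request = script.send(shared[request[1]])
                else:
                    shared[request[1]] += request[2]
                    request = next(script)
        except StopIteration:
            pass


def spender(cost):
    # Writes only, so it never has to sleep.
    yield ("write", "energy", -cost)


def cautious(cost):
    # Reads shared state, so it sleeps until the sequential phase.
    energy = yield ("read", "energy")
    if energy >= cost:
        yield ("write", "energy", -cost)


shared = {"energy": 100}
run_frame({1: spender(60), 2: cautious(50), 3: cautious(30)}, shared)
```

Here unit 1's write is applied first (leaving 40), unit 2 then sees too little energy and does nothing, and unit 3 spends 30. The script bodies themselves read like ordinary sequential code, which is the point being argued.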

SirMaverick
Posts: 834
Joined: 19 May 2009, 21:10

Re: Dev meeting minutes 2010-10-10

Post by SirMaverick » 13 Oct 2010, 22:37

zwzsg wrote: I have plenty of Lua gadget code that creates a unit then immediately sets its properties, queues, etc...

Will I be required to rewrite them all, implementing delayed task queues within the gadget, instead of simple consecutive commands, to merely stay compatible with the single threaded new version of the engine?

Because that would be mightily annoying. It would turn ten-line gadgets into 100-line monsters. All those easy gadget jobs would become a nightmarish mess. Instead of writing what you want the gadget to do, you'd first have to understand and develop a delayed-actions Lua framework before adding any relevant Lua command. The gadgets would be much harder to write, and much harder to reread.
It won't work in the same way, so modifications will be necessary. Setting properties for created units could be done in UnitCreated(); you can select which ones by unitdefid. If that is not enough, something like tags could be introduced. When you call CreateUnit() you supply a tag and check for it in UnitCreated(). Of course you need additional bookkeeping for that (way less than 100 lines).
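
A rough sketch of the tag idea, written in Python and not against the real Lua API: the `Engine` class, the `tag` parameter and the callback signature are all assumptions made for illustration. Creation is deferred to the end of the frame, and the gadget applies its properties in the creation callback by looking up the tag.

```python
class Engine:
    """Toy stand-in for an engine that defers unit creation to frame end."""

    def __init__(self):
        self.queue = []
        self.units = {}
        self.next_id = 1
        self.callbacks = []

    def create_unit(self, def_name, tag=None):
        # Hypothetical: returns nothing, because the unit does not exist yet.
        self.queue.append((def_name, tag))

    def end_frame(self):
        # Units come into existence here, then the callbacks fire.
        for def_name, tag in self.queue:
            unit_id = self.next_id
            self.next_id += 1
            self.units[unit_id] = {"def": def_name}
            for callback in self.callbacks:
                callback(unit_id, def_name, tag)
        self.queue.clear()


engine = Engine()
pending = {}  # tag -> properties to apply once the unit exists


def unit_created(unit_id, def_name, tag):
    # The only bookkeeping the gadget needs: look up the tag, apply, forget.
    props = pending.pop(tag, None)
    if props is not None:
        engine.units[unit_id].update(props)


engine.callbacks.append(unit_created)

pending["scout-1"] = {"health": 50, "orders": ["patrol"]}
engine.create_unit("scout", tag="scout-1")
engine.create_unit("tank")  # untagged: left at its defaults
engine.end_frame()
```

The gadget-side bookkeeping in this sketch is one dictionary and a few lines in the callback.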

Not going sim-MT will in future result in a game running on one core even if 16+ are available (15 gml threads don't speed up your simulation). I don't think that's acceptable.
hoijui wrote: spring is mostly CPU limited. by today, the number of threads we have may be enough already in spring-MT (rendering, sim, sound, .. i don't know). but with more than 4 cores, this is just not gonna scale well. in a few years, with 16+ core machines being common, we have to be able to run sim in parallel to scale (and keep up with SC 5).
to me, undoubtedly needed.

Kloot
Spring Developer
Posts: 1865
Joined: 08 Oct 2006, 16:58

Re: Dev meeting minutes 2010-10-10

Post by Kloot » 13 Oct 2010, 23:11

Anyone who does not have hard experience debugging sync errors (which I personally will simply stop doing if this development direction is taken) in Spring is in no position to state what's "acceptable" or not for this project.

Of particular note: the OpenTTD project has consistently rejected any and all ideas about MT'ing simulation code, and Spring is certainly no less complex.

Argh
Posts: 10920
Joined: 21 Feb 2005, 03:38

Re: Dev meeting minutes 2010-10-10

Post by Argh » 14 Oct 2010, 01:00

Before people start threatening to leave and stuff... I've got a fairly straightforward question.

Has anybody profiled what the typical load is for sim by itself, all OpenGL operations subtracted from the picture?

I'd be pretty surprised if it was an overwhelming part of the total right now, other than peak events. If we have things like CEG (which needs to be rewritten anyhow, using POPS as the basis, frankly) and Unit geometry handling on another CPU, then I think that you're going to see more benefit in splitting rendering further than you will from splitting even one of the nastier Sim areas, such as LOS, to run on another processor atm.

I really think you folks need to get some numbers together before you go in that direction, basically.

Anyhow, that's what I think. Glad to see people are enthusiastic and wanting to get the problems solved, but when somebody like Kloot's saying, "I don't even want to have to bother trying to help debug this", perhaps a little bit of analysis might make the case either sell itself or be largely moot.

aegis
Posts: 2456
Joined: 11 Jul 2007, 17:47

Re: Dev meeting minutes 2010-10-10

Post by aegis » 14 Oct 2010, 02:03

Argh wrote: Has anybody profiled what the typical load is for sim by itself, all OpenGL operations subtracted from the picture?
run spring-headless with a replay. make sure interface drawing is disabled (with headless_setup.lua or equivalent... unless this was fixed in spring)

zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: Dev meeting minutes 2010-10-10

Post by zerver » 14 Oct 2010, 03:15

In the regular Spring version, the CPU load displayed incorporates 2 FPS of rendering, but still gives a fairly good estimate of the pure Sim load. In many games it can be in the ~80% region even with a decent PC. Spikes of 100% during large-scale air attacks are also common.

SpliFF
Posts: 1224
Joined: 28 Jul 2008, 06:51

Re: Dev meeting minutes 2010-10-10

Post by SpliFF » 14 Oct 2010, 03:34

The air attack example given by zerver sounds like a LOS/movement thing rather than anything related to Lua. I'm still not convinced of the need to change the default Lua setup. However, out of respect to those who have spent far more time on this problem than I have, I'm not going to comment further until I've had a chance to run some tests and benchmarks.

Argh
Posts: 10920
Joined: 21 Feb 2005, 03:38

Re: Dev meeting minutes 2010-10-10

Post by Argh » 14 Oct 2010, 03:36

Spikes of 100% during large scale air attacks are also common.
That says more about all the horrible things that are wrong with aircraft than anything else. That's an area of this engine that needs a total rethink, but most of us have known that for years now.

Nor is splitting it going to help much, imo. At best, it's a brute-force solution.

Aircraft bring up all the worst problems that haven't ever been resolved: sim running their movements, LOS handling of fast-moving objects, collisions.

In the area of LOS, why haven't we used a merged LOS yet? It would save so much CPU, and using something else for targeting (raytests with local objects / map collision mesh) would be far cheaper when looking at the global picture, since targeting events are relatively rare, but LOS is everywhere all the time. We talked about this, Tobi even agreed it would probably work better than what we've got, and nothing happened. That was quite a while ago; frankly, I think that before people split it into MT, this issue needs to be re-examined.

I'm not convinced that either of these areas presents a good area to split until the fundamentals have been examined again.

zwzsg
Kernel Panic Co-Developer
Posts: 7017
Joined: 16 Nov 2004, 13:08

Re: Dev meeting minutes 2010-10-10

Post by zwzsg » 14 Oct 2010, 10:11

Well, it's much more enticing for a dev to go "I'm gonna add multithreading!" than "I'm gonna revise boring old code". Remember, Spring being open source, the devs just work on what they wanna work on, not necessarily on what would benefit Spring most.

hoijui
Former Engine Dev
Posts: 4342
Joined: 22 Sep 2007, 09:51

Re: Dev meeting minutes 2010-10-10

Post by hoijui » 14 Oct 2010, 11:06

among spring engine devs, it seems to be accepted that making Sim MT would be important to scale well (in the future). Kloot and Tobi have said they still do not want it, because of the increased complexity in maintenance and (sync-)debugging.
Then of course there is also Argh, who has no idea but still doubts what we are saying due to ... some feeling, i guess.

Look:
MT Sim only makes sense if it needs less wall-clock time than single-threaded Sim. if it does so, it is positive to have*. Optimizing other code can not possibly be a counter-argument.

* When only watching performance; neglecting for a moment the downside mentioned by Kloot.


back on topic again:
With zerver's event approach, we would have a clean, well-defined set of changes for each frame, on each machine, plus definitive states. these two things would have to be equal on all machines. if they are, and there still is desync, it has to be hardware related. if they are not equal, we could run an algorithm checking for differences in the change-sets (events), so we would at least know which member of eg a unit is problematic.
so what would be possible problems regarding MT Sim desync?
i could imagine: two events get recorded, one doing +100, the other -50 on unitX.a, but unitX.a may not be higher than its current value +75. the problem in this case could be solved by applying range checks only after all events regarding a member are processed, or by having the events strictly ordered (eg, each source of events gets an ID).
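
The +100 / -50 example can be worked through in a few lines of Python. This is only an illustration of the two options, with hypothetical function names: clamping after every event makes the result depend on event order, while clamping once after all events are summed does not.

```python
def apply_clamp_each(value, deltas, max_gain):
    """Range-check after every single event (order-dependent)."""
    cap = value + max_gain
    for delta in deltas:
        value = min(value + delta, cap)
    return value


def apply_clamp_last(value, deltas, max_gain):
    """Range-check once, after all events for the member are processed."""
    return min(value + sum(deltas), value + max_gain)


# unitX.a starts at 0 and may rise by at most 75 this frame.
each_a = apply_clamp_each(0, [100, -50], 75)  # clamps to 75, then 25
each_b = apply_clamp_each(0, [-50, 100], 75)  # -50, then 50 (no clamp hit)
last_a = apply_clamp_last(0, [100, -50], 75)  # net +50
last_b = apply_clamp_last(0, [-50, 100], 75)  # net +50
```

With per-event checks, two machines that process the same events in a different order end up with 25 and 50; with the deferred check both get 50. Strictly ordering the events would also make the per-event variant deterministic.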

There surely are more, and uglier, problems... i am too dull at the moment. Kloot, could you please give some examples?

zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: Dev meeting minutes 2010-10-10

Post by zerver » 14 Oct 2010, 13:09

Integer changes can be handled like hoi said; floats, however, must always be sorted, because floating-point addition gives different results depending on the order of the operands. The simplest approach is probably to sort them by value, rather than by some ID.
If the changes were absolute rather than incremental, integers would of course have to be sorted as well.
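
A small Python illustration of why float changes need a fixed order. The helper names are hypothetical, and the values are chosen only to make the rounding visible: the same three deltas summed in two arrival orders give different totals, while sorting by value first gives the same total whatever the arrival order was.

```python
def ordered_sum(values):
    """Add floats strictly left to right, as events would be applied."""
    total = 0.0
    for value in values:
        total += value
    return total


def deterministic_sum(values):
    """Sort by value first, so arrival order no longer matters."""
    return ordered_sum(sorted(values))


arrival_a = [1e16, 1.0, -1e16]  # thread A's event happened to arrive first
arrival_b = [1e16, -1e16, 1.0]  # same events, different arrival order

naive_a = ordered_sum(arrival_a)  # the 1.0 is lost when added to 1e16
naive_b = ordered_sum(arrival_b)  # the 1.0 survives

sorted_a = deterministic_sum(arrival_a)
sorted_b = deterministic_sum(arrival_b)
```

Sorting does not make the sum more accurate; it only guarantees that every machine rounds in the same way, which is what sync requires.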

FLOZi
MC: Legacy & Spring 1944 Developer
Posts: 6109
Joined: 29 Apr 2005, 01:14

Re: Dev meeting minutes 2010-10-10

Post by FLOZi » 16 Oct 2010, 18:03

I notice the 'enhanced communication between content / engine devs' has been dropped from the agenda.

If you're still waiting on example questions from devs I say just go ahead and make the forum with whatever examples are already done; as we on the other side are still pretty much in the dark about what is even going on with this forum. :(

Forboding Angel
Evolution RTS Developer
Posts: 14599
Joined: 17 Nov 2005, 02:43

Re: Dev meeting minutes 2010-10-10

Post by Forboding Angel » 16 Oct 2010, 22:13

Yeah I noticed that too, and was rather unhappy about it, but I was holding my tongue and not bitching about it because I thought something else might be going on, but yeah guys... What gives?

Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Dev meeting minutes 2010-10-10

Post by Tobi » 17 Oct 2010, 17:20

We just skipped it because the right people were not present.

We have just discussed it again, I will create the forum soon.
