Spring-MT goes into the wrong direction (commit a628b043)

Discussion between Spring developers.
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Post by jK »

Seems discussing this topic in chat isn't thorough enough.
And zerver is still working in the wrong direction.

The commit: a628b043

First, this should be done in a branch, because it is a huge project and at the moment it just breaks all Lua code and even C++ code. Also, splitting the LuaHandle should be the last step of the whole project, because it breaks things without helping at all before the rest is done. So you should first make a duplicatable sync state; then a lot of unsynced code has to be reworked too (e.g. GetUnitDrawPos, which is unsynced but accesses the synced state), ...
In the end, this project is huge and unlikely to be finished at all.
Also, in my mind this thing is not even necessary, because it tries to solve the issue on the wrong level.

So the current problem in spring-mt is that it is possible to have great FPS with a very low SimFPS. This happens because the render-thread blocks the sim-thread very often.
Your (zerver's) idea is to decouple the render-thread (and especially the Lua in it) from the sim-thread, so the sim-thread can run more cleanly in parallel with the render-thread. In theory this is nice and noble, but as said, it might have little to do with the problem itself.

Okay, the sim-thread is blocked by the render-thread, but why?
So just look at what the render-thread does: it tries to render as many video frames as possible, so it tries to run continuously. The sim-thread, on the other hand, runs just 32 times per second and needs ~20-60% of one core, so it sleeps most of the time.
Note: the following is based on the assumption that boost::mutex yields if it has to wait for a lock! (Their docs say the behaviour of boost::mutex is unspecified, so this assumption is valid; still, I can't test it myself because of missing hardware.)
At runtime this means that when the sim-thread wakes up, it is very likely that the render-thread holds a mutex, so the sim-thread goes to sleep again and waits for the next tick. Then it is again very likely that the mutex is held, ... until by chance the mutex is free (during Lua draw call-ins this chance is zero, and during engine rendering it is just low). So the problem is not that the render-thread blocks the sim-thread; the problem is that the render-thread doesn't know whether the sim-thread is waiting. A quick solution is to set a bit when the sim-thread waits for a mutex and to check this in the render-thread each time it tries to create a new mutex (i.e. acquire a lock); if the bit is set, the render-thread should yield and give the sim-thread time to do its job.
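
To make the "set a bit and yield" idea concrete, here is a minimal sketch. It assumes std::mutex and std::atomic in place of boost::mutex, and every name in it (HintedMutex, LockSim, LockRender) is hypothetical, not actual Spring code. It uses a waiter counter instead of a single bit, and the check is only a hint: the render-thread can still grab the lock in the short window before the sim-thread announces itself.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical sketch, not Spring engine code.
struct HintedMutex {
    std::mutex mtx;
    std::atomic<int> simWaiters{0}; // > 0 while the sim-thread waits for mtx

    // Sim-thread: announce the wait, then block until the lock is acquired.
    void LockSim() {
        simWaiters.fetch_add(1);
        mtx.lock();
        simWaiters.fetch_sub(1);
    }

    // Render-thread: back off for as long as the sim-thread is waiting.
    void LockRender() {
        while (simWaiters.load() > 0)
            std::this_thread::yield();
        mtx.lock();
    }

    void Unlock() { mtx.unlock(); }
};
```

The render-thread would call LockRender()/Unlock() around each critical section and the sim-thread LockSim()/Unlock(). Whether this actually helps depends on how the underlying mutex hands over ownership, which is exactly the assumption stated in the note above.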


Btw, the project of separating the render- & sim-thread is still important and should be continued, but with other priorities, because this whole issue also reveals a different problem with the current threading model: since the sim-thread can work in parallel with the render-thread, the synced state can differ between the individual stages of a render frame (units can be alive in the shadow pass and dead in the DrawScreen pass etc.). This is a huge bug and crash trap. The only solution is to create a synced clone for the whole render frame, which gets updated only between render frames and not in the middle of one. In this context, splitting the LuaHandle gets a very low priority, especially because of all the work it causes.
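
A rough sketch of such a per-frame clone, to show the intent. The types are hypothetical (SyncedState and UnitState are made up for illustration, not engine classes), and a real engine would need something much cheaper than a full copy. The render-thread takes one snapshot at the start of a render frame and uses only that snapshot for every pass.

```cpp
#include <mutex>
#include <vector>

// Hypothetical, heavily reduced stand-in for the synced unit state.
struct UnitState {
    int id;
    bool alive;
};

class SyncedState {
public:
    // Sim-thread: mutate the live state under the lock.
    void AddUnit(int id) {
        std::lock_guard<std::mutex> lock(mtx);
        units.push_back(UnitState{id, true});
    }

    void KillUnit(int id) {
        std::lock_guard<std::mutex> lock(mtx);
        for (UnitState& u : units)
            if (u.id == id)
                u.alive = false;
    }

    // Render-thread: copy the state once, between render frames.
    std::vector<UnitState> Snapshot() const {
        std::lock_guard<std::mutex> lock(mtx);
        return units;
    }

private:
    mutable std::mutex mtx;
    std::vector<UnitState> units;
};
```

With this, a unit that dies while the frame is being drawn is still alive in both the shadow pass and the DrawScreen pass of that frame, and shows up as dead only in the next snapshot.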


Also, in general the goals of Spring-MT shouldn't be limited to having parallel sim- & render-threads. The goal should also be to split those into smaller jobs which can be further multithreaded, to make full use of all X cores. With a job queue, multiple worker threads could process sim & render jobs at the same time and increase performance massively. To reach such a goal it isn't necessary to rewrite the whole engine; a lot of things could already be moved into such jobs without much work (LOS/extra-texture update, shadowmap rendering, particle update, ...).
In 1 month I will get a new (triple-core) PC, and will be able to help, too.
zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: Spring-MT goes into the wrong direction (commit a628b043)

Post by zerver »

Nice essay formatting! :mrgreen: Really interesting discussion. I know this breaks stuff, and that is exactly why I decided to have a flag, DUAL_LUA_STATES, that disables all the changes. So there is no reason to be pissed off, if anyone happens to be. I wanted to commit this now, as merging later might be a lot of work.

The Lua mutexes are in fact the #1 reason why draw call-ins slow down the sim. The solution you are suggesting, "set a bit when the sim-thread waits for a mutex and to check this in the render-thread each time it tries to create a new mutex", does not solve the problem in any way. It cannot even be done, as your assumptions are totally wrong. Mutexes simply do not work like that. Threads that want to lock a mutex end up waiting in a queue-like fashion and will begin running immediately once the lock is released.

Bottom line: The sim frequently wants to execute Lua call-ins. A draw call-in that requires any significant execution time will act as gravel in the machinery and slow down the sim. Especially with high FPS the problem gets severe. My commit targets this specific problem. The Lua mutexes are in essence gone, now only needed for XCalls.

Also, the sim does not necessarily sleep most of the time. I have a decent system, and in many games it maxes out at 80%+. And that is without any draw call-ins.