exception handling, r7053

Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

exception handling, r7053

Post by Tobi »

I see that since r7053 we use setjmp/longjmp exception handling for LUA, and normal exception handling for the rest of the engine, and I see a comment that says exception handling crashes for GCC >= 4.2:
--- trunk/rts/lib/lua/include/luaconf.h 2008-11-16 15:42:39 UTC (rev 7052)
+++ trunk/rts/lib/lua/include/luaconf.h 2008-11-16 16:57:26 UTC (rev 7053)
@@ -615,7 +615,7 @@
** compiling as C++ code, with _longjmp/_setjmp when asked to use them,
** and with longjmp/setjmp otherwise.
*/
-#if defined(__cplusplus)
+#if defined(__cplusplus) && !(defined(__GNUC__) && (__GNUC__ == 4)) // FIXME: Some bug in GCC 4.2, 4.3, ... makes try/catch crash
/* C++ exceptions */
#define LUAI_THROW(L,c) throw(c)
#define LUAI_TRY(L,c,a) try { a } catch(...) \
Exception handling works fine, however, on SUSE / GCC 4.2.

What are details of this crash? stacktrace? test case? Link to GCC bug report?

Are we 100% sure that using longjmp/setjmp AND exception handling works correctly together? Do destructors get called when longjmp'ing out of functions? (Hint: no, they don't; longjmp'ing past objects with non-trivial destructors is undefined behaviour in C++.) If they aren't called, are we sure the crash isn't just memory corruption that shows up in some destructor?
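
To make the destructor question concrete, here is a minimal sketch of what normal C++ unwinding does for an object that sits between the throw and the catch. Tracker, engine_callback and destructor_ran_on_throw are hypothetical names for illustration, not Spring or Lua code. The longjmp case is only described in comments, since actually jumping past such an object is undefined behaviour and can't be demonstrated portably.

```cpp
#include <cassert>
#include <string>

// Hypothetical RAII helper (not from Spring or Lua): counts destructions.
static int g_destroyed = 0;

struct Tracker {
    std::string name;  // owns heap memory, like the std::string/std::vector in engine code
    explicit Tracker(const char* n) : name(n) {}
    ~Tracker() { ++g_destroyed; }
};

// Stands in for C++ engine code sitting between two Lua frames.
static void engine_callback() {
    Tracker t("temporary engine object");
    // With LUAI_THROW defined as throw(c), the stack is unwound and ~Tracker runs.
    // With LUAI_THROW defined as longjmp(...), control would leave this frame
    // without unwinding, which the C++ standard leaves undefined; in practice
    // the destructor is skipped and the string's memory is leaked.
    throw 1;
}

// Returns true if the destructor ran while the exception propagated.
static bool destructor_ran_on_throw() {
    g_destroyed = 0;
    try {
        engine_callback();
    } catch (int) {
        return g_destroyed == 1;
    }
    return false;
}
```

Calling destructor_ran_on_throw() reports whether the Tracker was cleaned up on the exception path; the same object on a longjmp path gets no such guarantee.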
zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: exception handling, r7053

Post by zerver »

Test cases are as follows.

1. At the top of luaV_gettable, insert code that calls the following after sim has been running for a minute.

luaG_typeerror(L, t, "index");
If it worked correctly, the errors should be spammed on the info console, but it will crash.

2. In luaD_rawrunprotected before LUAI_TRY, insert code that calls the following after sim has been running for a minute.

  try {
    throw 1;
  }
  catch(int) {
    while(1)
      ;
  }
The game should hang, but I'm afraid it will crash instead.

The purpose of delaying the crash till one minute ingame was to make sure the infoconsole has been properly initialized. I think #2 should crash even without any delay but I have not tried it yet.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

Thanks, I'm gonna take a look at this when I'm home.

(The last test case definitely doesn't crash on SUSE GCC 4.2.1 outside Spring, but I don't have Spring source nearby to test that here..)
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

Tobi: http://spring.clan-sy.com/mantis/view.php?id=1160

I believe this is an issue with mingw exception handling, didn't try dwarf2 exceptions (sjlj is the default on windows). I don't know if setjmp/longjmp semantics interfere with any spring code since I don't know if public Lua functions are allowed to use it (I'd hope not.) Either way, those crashes happen and are nearly impossible to reproduce without hacks like zerver's.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

Here are some sjlj and dwarf2 builds of MinGW. Would be interesting to compare those...

Does the cmake buildsystem pass '-mthreads' to the linker? (or '-mwindows'?)

Various sources seem to suggest that without this the generated code / runtime system itself isn't thread safe, so in that case, no wonder it crashes. (The scons buildsystem passes neither of those.)
imbaczek wrote: I don't know if setjmp/longjmp semantics interfere with any spring code since I don't know if public Lua functions are allowed to use it (I'd hope not.)
I suspect they can cause some memory leaks here and there. There is LUA code which calls C++ code which calls LUA code again. Now I don't know if this uses protected call or unprotected call, but if it uses unprotected call then any non-LUA objects in this C++ code won't be destructed (e.g. std::string, std::vector) in case an error occurs in the inner LUA code, AFAICS.

The other way around this applies too: if an engine exception propagates through LUA stack frames then LUA cannot clean up (because it only catches its own 'exceptions') and LUA may end up in an undefined state? Luckily engine exceptions are usually fatal anyway..
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

scons should add -mthreads everywhere, I remember checking and fixing this. It doesn't seem to help, though.

Previously forgot that Lua callins may trigger errors; lua_error family of functions does long jumps (not sure if it does so by throwing exceptions).
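
Whether an error raised inside Lua ends up as a long jump or as a C++ throw depends on which LUAI_THROW / LUAI_TRY pair luaconf.h selects at compile time. A minimal sketch of the two flavours of protected call, loosely modelled on luaD_rawrunprotected, might look like this. ErrorJmp and all function names below are simplified stand-ins for illustration, not the actual Lua source.

```cpp
#include <cassert>
#include <csetjmp>

// Simplified stand-in for Lua's error-recovery record; names are illustrative.
struct ErrorJmp {
    std::jmp_buf b;
    volatile int status;  // written before the jump, read after setjmp returns
};

typedef void (*ProtectedFn)(ErrorJmp*);

// Long-jump flavour (what the r7053 change selects when building with GCC 4.x).
static void throw_sjlj(ErrorJmp* ej, int code) {
    ej->status = code;
    std::longjmp(ej->b, 1);  // jumps back into run_protected_sjlj, no unwinding
}

static int run_protected_sjlj(ProtectedFn f) {
    ErrorJmp ej;
    ej.status = 0;
    if (setjmp(ej.b) == 0) {  // returns 0 on the direct call, 1 after longjmp
        f(&ej);
    }
    return ej.status;
}

// C++ exception flavour (the default when luaconf.h is compiled as C++).
static void throw_eh(ErrorJmp* ej, int code) {
    ej->status = code;
    throw ej;  // the stack is unwound and destructors run on the way out
}

static int run_protected_eh(ProtectedFn f) {
    ErrorJmp ej;
    ej.status = 0;
    try {
        f(&ej);
    } catch (...) {
        if (ej.status == 0) ej.status = -1;  // foreign exception: mark as failed
    }
    return ej.status;
}

// Example callees: two that raise error code 2, one that succeeds.
static void failing_sjlj(ErrorJmp* ej) { throw_sjlj(ej, 2); }
static void failing_eh(ErrorJmp* ej) { throw_eh(ej, 2); }
static void succeeding(ErrorJmp*) {}
```

Both run_protected_sjlj(failing_sjlj) and run_protected_eh(failing_eh) hand the error code back to the caller; the difference is only in what happens to the frames in between. Note that the catch(...) in the exception flavour also swallows exceptions that did not originate from Lua.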
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

oh right stupid me, I checked whether it was present in build log of a linux build :-)

Indeed in scons MinGW builds it is present.
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

gcc on linux doesn't even accept -mthreads as a valid option.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

r7061 wrote: Modified:
trunk/CMakeLists.txt
Log:
* CMake: compile with -mwindows in mingw
So it didn't use this nor -mthreads before?

Maybe someone could test normal exception handling again then, and see whether it still crashes?

(I didn't get proper MinGW GCC >= 4 dev env up yet...)
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

it always crashed in scons, which has provided proper options for quite some time IIRC. (always = started shortly before 77b1 for me, actually, if the vfs crash is the same thing...)
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

I looked at this but didn't get really far yet cause of huge regression in debug binary size. Dunno if it's cause Spring code got bigger or because of newer GCC (probably both), but I can barely link / debug it anymore on the windows box I used for this. (spring.exe 243 MB, ld.exe taking close to 400 MB RAM, and that on 512 MB RAM crappy old laptop ^^)

Might be continued if I can be bothered to wait or put up devenv on better (virtual) machine :-)
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

i can confirm HUEG debug symbols. i think it may be because -ggdb3, but haven't tested it.
Auswaschbar
Spring Developer
Posts: 1254
Joined: 24 Jun 2007, 08:34

Re: exception handling, r7053

Post by Auswaschbar »

never tried -ggdb3, but 270mb binary size seems completely reasonable. I normally have between 70 and 120 mb with ggdb2
hoijui
Former Engine Dev
Posts: 4344
Joined: 22 Sep 2007, 09:51

Re: exception handling, r7053

Post by hoijui »

i get around 160MB with dwarf
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

Got some actual testing done; I can NOT reproduce the broken behaviour reported by zerver.

Testcase 1 works fine, it creates epic LUA error spam (tested up to 10000 errors) - as expected - and doesn't crash.
(With one exception, that is if a new LuaParser is created, it will crash, but this is to be expected because the LuaParser fails to open the standard libraries and it assumes this always succeeds: in practice this occurs when a team dies, and Spring wants to parse messages.lua)

Testcase 2 works fine, it makes Spring run an infinite loop - as expected -, and doesn't crash.

I used GCC 4.3.2-tdm-1 with SJLJ exception unwinding from TDM's experimental GCC builds.

I tested this in both 'scons configure debug=yes' and 'scons configure optimize=1 debug=yes' builds, tests work fine in both. Will test optimize=2, optimize=3, optimize=s, and maybe dwarf2 unwinding and/or the official MinGW GCC build some other time.

May I ask in which GCC version(s) this error does happen?
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

Got it reproduced in an optimize=2 build now. Seems to work fine most of the time tho, only crashed once in 20000 errors or so...

Didn't try testcase 2 yet in optimize=2 build.

Still to me it seems like memory corruption that happens to show when handling huge amounts of exceptions, so I bet the original fix just hides a symptom and does not fix the issue itself. (IOW, it will probably keep producing random crashes / desyncs)

I'll see if I can run a few 100000 lua errors in valgrind sometime.
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: exception handling, r7053

Post by imbaczek »

I think it's corruption too, judging from the addresses to which LUAI_THROW jumps. The problem is pinpointing it, especially without a working valgrind on windows.
zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: exception handling, r7053

Post by zerver »

I'm joining the bug hunt.

When I tried this last, it seemed to crash "every time". I hope it is still like that :mrgreen:
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: exception handling, r7053

Post by Tobi »

Not for me... now tried a million errors, no crash. (But 168 MB infolog..)
So for me it's now about 1 crash in about 1,020,000 errors...

Maybe it depends on mod or settings?

I've been testing with XTA v9.53 on map nano arena.
zerver
Spring Developer
Posts: 1358
Joined: 16 Dec 2006, 20:59

Re: exception handling, r7053

Post by zerver »

I have found that it is the threading in the GML version that makes this crash *much* more probable. I'm suspecting either a bug in the libraries that are used for exception handling, or maybe we are using the wrong (single-threaded?) version of some lib. It could also be that we are missing yet another compiler flag.
Since spring uses threads not only for GML, it could explain the fact that it crashes also with a non-GML version. It could be that any concurrent try-catching potentially will cause a crash...
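
One way to check the "concurrent try-catching" theory outside Spring would be a small stress test that throws and catches on several threads at once. The sketch below is only an illustration under assumptions: it uses standard C++11 std::thread rather than whatever threading Spring or GML actually uses, and throw_catch_loop and stress_exceptions are made-up names. On MinGW it would presumably need to be built with -mthreads, as discussed earlier in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Throws and catches `iterations` exceptions on the calling thread.
static void throw_catch_loop(int iterations, std::atomic<long>& caught) {
    for (int i = 0; i < iterations; ++i) {
        try {
            throw i;
        } catch (int) {
            caught.fetch_add(1, std::memory_order_relaxed);
        }
    }
}

// Runs the loop on `threads` threads at once; returns the total number caught.
static long stress_exceptions(int threads, int iterations) {
    std::atomic<long> caught(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back(throw_catch_loop, iterations, std::ref(caught));
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return caught.load();
}
```

With a thread-safe exception runtime, stress_exceptions(4, 10000) should return 40000 without crashing; a crash or a short count in a test like this would point at the runtime or build flags rather than at Spring code.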

And there was a tremendous increase in stability for the GML version when I switched to longjmp exception handling :mrgreen:. Not a single crash this week.