Before I post a bug on mantis, I was wondering if one of the developers can give instructions on how to read/interpret the stack traces in infolog.txt. I was told by Kloot that these are for the most part meaningless. Is there a way to compile the engine in order to get more information out of a replay re: stack traces and exceptions?
For example, in some games there is a ~3 second stall that accompanies a stack trace that looks like the following infolog.txt excerpt:
[f=0068671] Error: [safe_element const] index 1 out of bounds! (size 1)
[f=0068671] Error: Stacktrace for Spring 96.0:
[f=0068671] Error: <0> /home/joya/dev/spring/rts/Game/UnsyncedActionExecutor.h:42
[f=0068671] Error: <1> /home/joya/dev/spring/rts/Game/ChatMessage.cpp:16
[f=0068671] Error: <2> /usr/include/c++/4.8.2/bits/basic_ios.h:276
[f=0068671] Error: <3> /usr/include/c++/4.8.2/bits/stl_tree.h:1323 (discriminator 2)
[f=0068671] Error: <4> /home/joya/dev/spring/rts/Game/GUI/PlayerRoster.cpp:303
[f=0068671] Error: <5> /usr/include/c++/4.8.2/bits/basic_string.h:583
[f=0068671] Error: <6> /home/joya/dev/spring/rts/Game/Game.cpp:1628
[f=0068671] Error: <7> /home/joya/dev/spring/rts/Net/NetCommands.cpp:517
[f=0068671] Error: <8> /home/joya/dev/spring/rts/Game/Game.cpp:996
[f=0068671] Error: <9> /usr/include/c++/4.8.2/bits/basic_string.h:583
[f=0068671] Error: <10> /home/joya/dev/spring/rts/Game/UnsyncedGameCommands.cpp:1946 (discriminator 58)
[f=0068671] Error: <11> /home/joya/dev/spring/rts/Game/WaitCommandsAI.cpp:67
[f=0068671] Error: <12> /usr/include/boost/exception/exception.hpp:215
[f=0068671] Error: <13> /home/joya/dev/spring/rts/System/Main.cpp:126
[f=0068671] Error: <14> ??:?
[f=0068671] Error: <15> ??:?
This will occur about 1-4 times in a row, causing the engine to freeze momentarily. I think (just guessing) that the server is not encountering the problem since packets continue to arrive smoothly during the freeze; the engine goes to 10x game speed afterward in order to replay / catch up with the simulation state.
The stack traces happen deterministically in the same points when you watch the replay of a game that caused them, so I'm wondering if there is a way for me to compile with some debugging switches that will reveal better / more accurate information.
Thanks,
Mjr.Boredom
Debugging and stack traces
Moderator: Moderators
Re: Debugging and stack traces
it would help if you posted your full infolog.
Re: Debugging and stack traces
It tells you what happened (trying to to access a cell outside an array), and where it happened (the file and line number where it occured, and who called it). What else would you need to know?MajBoredom wrote:these are for the most part meaningless
- Silentwings
- Posts: 3720
- Joined: 25 Oct 2008, 00:23
Re: Debugging and stack traces
Afaik infolog stacktraces are only meaningless when the bug is connected to Spring's stacktrace system itself - as it was on the mantis report where I assume you heard that comment.I was told by Kloot that these are for the most part meaningless.
Re: Debugging and stack traces
looks similar to: http://springrts.com/mantis/view.php?id=4309
also a demo would be good as such bugs are mostly reproduceable with demo.
also a demo would be good as such bugs are mostly reproduceable with demo.
-
- Posts: 17
- Joined: 24 Nov 2013, 09:14
Re: Debugging and stack traces
Thanks. Let's assume I understand the basic premise of a stack trace as presented in an undergraduate CS class.zwzsg wrote:It tells you what happened (trying to to access a cell outside an array), and where it happened (the file and line number where it occured, and who called it). What else would you need to know?MajBoredom wrote:these are for the most part meaningless
The function stack shown in the above message is invalid. For example, there is no call to any UnsyncedActionExecutor function from a ChatMessage function on line 16 of ChatMessage.cpp, nor any other line. Similarly, none of the code in PlayerRoster.cpp calls any stl_tree methods. So it looks to me like I have the wrong symbols or that the stack trace functionality is once again misbehaving.
To confirm, I recompiled the engine and re-ran the replay. I get the same error messages with the same line numbers as before. Any ideas?? The engine's stack trace mechanism -- is there any way to interact with / configure it? Use it with gdb?
abma: It looks similar in the NetCommands.cpp:675 line. However, this line is inside the handler for NETMSG_AICOMMAND_TRACKED. Is this called during normal (multiplayer) settings? This is also what prompts me to think that the stack trace is misleading. However, I also get this:
[f=0032807] Error: [safe_element const] index 1 out of bounds! (size 1)
I think I should file as a separate bug until there is better information. Infolog + replay to follow on Mantis.
Tanks for your support!
-
- Posts: 17
- Joined: 24 Nov 2013, 09:14
Re: Debugging and stack traces
After some playing with libunwind and posix thread signals, I managed to create a stub program that creates a worker thread, runs it, then allows you to suspend that thread from the main thread (sort of Java-like). The suspended thread is just a signal handler for the worker that waits on a mutex previously locked by the main thread.
On top of that, the signal handler fills out a stack context object for the worker that can be interpreted by the main thread. Using libunwind, you can get a backtrace. The output looks as follows:
Notice that libunwind gives different text descriptions (the name of the function rather than just the module). Alternatively, you could use external calls to the addr2line program to make it look like our current setup.
So, if we alter the watchdog thread to use this technique, it means we can get precise stack traces from foreign threads by having the watchdog suspend them just long enough to do the stack trace.
If any BSD experts would like to jump in and tell me if there is an analogous way of doing this on a Mac, please do so. :)
On top of that, the signal handler fills out a stack context object for the worker that can be interpreted by the main thread. Using libunwind, you can get a backtrace. The output looks as follows:
Code: Select all
[Main] Starting worker thread.
[Worker] installing signal handler...
[Worker] Doing important work [1/12]...
[Worker] Doing important work [2/12]...
[Worker] Doing important work [3/12]...
[Worker] Doing important work [4/12]...
[Worker] Doing important work [5/12]...
[Worker] Doing important work [6/12]...
[Main] The worker has been suspended.
[Handler] Caught SIGUSR1!
[Main] Simple backtrace using libunwind:
[Main] ip = 0x3deac0ef90, sp = 0x7f656f5e6800, __restore_rt()
[Main] ip = 0x3deac0e7fd, sp = 0x7f656f5e6da0, __nanosleep_nocancel()
[Main] ip = 0x000040104e, sp = 0x7f656f5e6db0, do_important_work()
[Main] ip = 0x0000400f39, sp = 0x7f656f5e6de0, worker_func()
[Main] ip = 0x3deac07c53, sp = 0x7f656f5e6f20, start_thread()
[Main] ip = 0x3dea4f5dbd, sp = 0x7f656f5e6fc0, clone()
[Main] Decoding stack ip values using backtrace_symbols():
[Main] /lib64/libpthread.so.0() [0x3deac0ef90]
[Main] /lib64/libpthread.so.0(__nanosleep+0x2d) [0x3deac0e7fd]
[Main] ./signaltest() [0x40104e]
[Main] ./signaltest() [0x400f39]
[Main] /lib64/libpthread.so.0() [0x3deac07c53]
[Main] /lib64/libc.so.6(clone+0x6d) [0x3dea4f5dbd]
[Main] Now resuming the worker.
[Worker] Doing important work [7/12]...
[Worker] Doing important work [8/12]...
[Worker] Doing important work [9/12]...
[Worker] Doing important work [10/12]...
[Worker] Doing important work [11/12]...
[Worker] Doing important work [12/12]...
[Main] Finished.
So, if we alter the watchdog thread to use this technique, it means we can get precise stack traces from foreign threads by having the watchdog suspend them just long enough to do the stack trace.
If any BSD experts would like to jump in and tell me if there is an analogous way of doing this on a Mac, please do so. :)
- Attachments
-
- signaltest.c
- C code, compiles with: gcc -g -o signaltest signaltest.c -lpthread -lunwind -lunwind-x86_64
- (6.16 KiB) Downloaded 20 times
Re: Debugging and stack traces
looks very promising!
https://github.com/spring/spring/blob/d ... andler.cpp
for osx... not sure if it will work out of the box. libunwind seems to be used there as well.
but i'm unsure if it will work with springs threadpool thing, jk knows this i think.
feeding the addresses to addr2line seems to work as well:
[Main] ip = 0x0000400f97, sp = 0x7f25efcd0d60, do_important_work()
[Main] ip = 0x0000400f35, sp = 0x7f25efcd0d90, worker_func()
addr2line -e signaltest
0x0000400f97
/tmp/9849178/signaltest.c:77
0x0000400f35
/tmp/9849178/signaltest.c:62 (discriminator 2)
code for that already exists, shouldn't be hard to integrate i guess:MajBoredom wrote:Notice that libunwind gives different text descriptions (the name of the function rather than just the module). Alternatively, you could use external calls to the addr2line program to make it look like our current setup.
https://github.com/spring/spring/blob/d ... andler.cpp
for osx... not sure if it will work out of the box. libunwind seems to be used there as well.
but i'm unsure if it will work with springs threadpool thing, jk knows this i think.
feeding the addresses to addr2line seems to work as well:
[Main] ip = 0x0000400f97, sp = 0x7f25efcd0d60, do_important_work()
[Main] ip = 0x0000400f35, sp = 0x7f25efcd0d90, worker_func()
addr2line -e signaltest
0x0000400f97
/tmp/9849178/signaltest.c:77
0x0000400f35
/tmp/9849178/signaltest.c:62 (discriminator 2)