View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0004152 | Spring engine | General | public | 2013-11-13 22:06 | 2013-11-30 23:52 |
| Reporter | Assigned To | Kloot | |||
| Priority | normal | Severity | crash | Reproducibility | N/A |
| Status | resolved | Resolution | fixed | ||
| Product Version | 95.0 | ||||
| Target Version | 96.0 | Fixed in Version | 95.0.1+git | ||
| Summary | 0004152: Crash with "Failed to allocate memory" (mod: XTA 9.731) | ||||
| Description | After 30-45min of playing, spring crashed. spring.exe closed and messagebox from windows "has stopped worked." Have not tested yet if demo crashes too. http://pastebin.com/t1nV0tFY | ||||
| Tags | No tags attached. | ||||
| Attached Files | |||||
| Checked infolog.txt for Errors | |||||
|
|
looks like same crash: http://springrts.com/phpbb/viewtopic.php?f=11&t=31178 |
|
|
hmz, stacktrace failed |
|
|
does the replay reproduce the crash for you? |
|
|
oh sorry 20131113_212307_DeltaSiege_Island_8_Way_95.sdf is actually wrong demo, from previous game. the crashed demo did not get saved.. |
|
|
this is correct demo: http://replays.springrts.com/replay/e6e083525d31ac471a8917dc79b8f899/ during playing it crashed at ~28min demo played ok until end. infolog: http://pastebin.com/YzwvtTks (no crash) spring.exe RAM usage in taskmanager steadily goes up, but maybe that is normal for game of such size. start 530mb 2min 560mb 10min 670mb 13min 720mb 20min 815mb 23min 856mb 26min 890mb 28min 920mb |
|
|
I got an endless loop in glSwapBuffers after ~ 31min |
|
|
That's interesting: at about 31 mins adv.playerlist crashes because it tries to gl.text a name that is nil. I had one crash before also, it was on exactly this map and it seemed to be a GPU timeout. But I played 50 other games with 95 without any crash. |
|
|
in that case, see if you can reproduce it by calling gl.Text(nil) from some widget |
|
|
That only crashes the widget, but while it's not the cause to the crash, it could be a symptom of it. I watched that replay some times now and APL XTA version always does crash, but it doesn't always crash at the same time. Sometimes it crashes when knorke drops 28min, sometimes it crashes when duckoman gets killed 30min. But maybe it doesn't mean anything: BA version of APL doesn't crash in this demo. But hey: why isn't APL removed by WH when it crashes? The test widget is. This is the output from test widget: [f=0000300] [widgets.lua] Error: false [f=0000300] [widgets.lua] Error: Error in DrawScreen(): [string "LuaUI\Widgets\dbg_test.lua"]:40: bad argument #1 to 'Text' (string expected, got nil) [f=0000300] [widgets.lua] Error: Removed widget: Test fields This is the output from APL: [f=0051936] Player [2up]knorke has finished resuming [f=0051975] Error: gl.CreateList: error(2) = [string "LuaUI/Widgets/gui_advplayerslist.lua"]:1227: bad argument #1 to 'gl_Text' (string expected, got nil) [f=0051978] > <[2up]knorke> think i was dead anyway [f=0052005] Error: gl.CreateList: error(2) = [string "LuaUI/Widgets/gui_advplayerslist.lua"]:1227: bad argument #1 to 'gl_Text' (string expected, got nil) |
|
|
because the APL crash happens inside a local function passed to gl.CreateList |
|
|
i do not use adv.playerlist widget, only /info list. [f=0051936] Player [2up]knorke has finished resuming Is that message from engine or lua? I never rejoined the game. Also see chat: [f=0051926] > <Thorgasm> knorke are u going to try and rejoin? [f=0051926] > <[2up]knorke> ah no |
|
|
lua, cmd_idle_players |
|
|
The mesg "[f=0051936] Player [2up]knorke has finished resuming" is from cmd_idle_players, as kloot says. The fact that it appeared for you there is a bug - fixed yesterday (without having read this thread!) by http://imolarpg.dyndns.org/trac/balatest/changeset/1852. Update XTAs cmd_idle_players from the BA repo. But I don't think that error has anything to do with this bug. For what it's worth, I also just got one of these crashes (first time ever) while watching a BA game. It was only me that crashed - no one else. The only abnormal thing I can think of about what I did was I spent much of the game alt-tabbed out and doing smth else. Added infolog + demo. The stacktrace translator failed, but with a "unable to find detailed version info" which makes me think maybe stacktrace translator is broken? |
|
|
"crash 2infolog.txt" and "crash 3infolog.txt" is two more infologs from similiar crashs. both in FFA on somewhat large maps. After rejoin could continue to play. |
|
|
I have watched resource monitor in a game where this happened and it climbed slowly but a at a constant rate. Accordinding to /debug, lua mem usage didn't increase at all. After about 30 mins in game it can be using 3-4Gb of ram. It seems like this bug seems to affect one of my computers all the time and my other one never. |
|
|
One way to test that would be to join games with both computers simultaneously. Which settings (if any) are different on the machine that is affected compared to the one that isn't? Are they running the same operating systems / driver versions (assuming both use graphics hardware from the same vendor) / widgets? |
|
|
Ok, I did just that. Here's results of laptops A and B Laptop A crashed after 20 minutes, having run out of ram, with spring.exe having a working set of 3.8Gb. It's an i7-3740QM, win7, 8Gb ram, Nvidia GeForce GT 650M, driver version 8.17.12.9601. Laptop B made it to the end and was using <650Mb or ram all the way through. It's a i7-M620, win7, 8Gb ram, ATI Radeon HD 6500M, driver version 8.982. The widget configs between the two are basically identical. On both machines lua mem usage (reported by /debug) never went above 10Mb. The spring configs are different, but mainly because the ATI machine needs some stuff disabled or it crashes in multiple ATI related ways. I'll attach both configs. |
|
|
Ironically B should be the one to crash because it doesn't force AllowDeferred*Rendering to 0 and 95 contains a bug related to those parameters which ATI's do not appreciate. Otherwise the only significant difference is that A's config defines these values and B's doesn't: MultiThreadCount=6 MultiThreadLua=1 WorkerThreadCount=6 The first two are no longer relevant (because they correspond to code which is now unmaintained and non-functional) so you can just remove them. OTOH the latter controls the size of the threadpool (new in 95.0) but should not affect Spring's memory footprint. Does setting it to 0 have any measurable impact? |
|
|
I followed pretty much the same train of thought and tried setting WorkerThreadCount=0 and -1 on the Nvidia machine - it didn't change anything. I think the ATI machine is forcing deferred rendering off, because its got advunitshading off (at least, that's my memory of what jK said to me a few nights ago). |
|
|
Adv*Shading=0 does cause both deferred passes to be skipped but the crashbug existed in code that was always reached. In any case, synced code can be safely ruled out as a factor. What happens if you issue /luaui disable on A? (important because widgets can direct *Spring* to allocate memory, not just the Lua VM which has garbage collection) |
|
|
Turning off lauui seems to fix it. With luaui off the rate at which it increases memusage is cut by a factor of, at a guess, 10-20x. Moreover, if I luaui enable, wait a few minutes (during which it quickly soaks up mem) and then /luaui disable, it looks as though the soaked up mem is instantly freed. |
|
|
Hmm, feel like bisecting the widget list? ie. start with the top half of all widgets enabled and the bottom half disabled, figure out if it leaks and if so split the top half into an enabled and a disabled part, etc. |
|
|
Here is Infolog and spring setting, replay to follow as it didn't save on players pc, but it's looks like the same mem leak but this time the whole game was played with luaui disabled http://pastebin.com/zrGSpjae http://pastebin.com/HNxeigeD |
|
|
Im leaving for 2 weeks holiday tonight and won't have time to bisect the widget list before that, but if the bug is still around when I get back I could do it then. Sorry! |
|
|
here replay from above game where player crashed due to mem leak https://dl.dropboxusercontent.com/u/200970/20131117_133241_Grts_Northernmountains_009_95.sdf |
|
|
So I tried bisecting the widget list but didn't find anything. It seems as though the rate at which mem is leaking increases as more widgets are enabled and the responsibility is shared across all widgets. I don't know at what rate spring is supposed to gradually increase mem usage during a game, but even with all luaui off it still increases at maybe 1Mb every 3 seconds. |
|
|
I also got the failed to allocate memory crash but with another mod: Tech Annihilation 2.23.4 |
|
|
Currently my assumption is that the Lua garbage collector cannot keep up in larger games (because it runs for a limited amount of time per invocation in 95) so the rate at which memory is consumed exceeds the rate at which it is freed. Both the amount of time (MaxLuaGarbageCollectionTime) spent and the number of steps (MaxLuaGarbageCollectionSteps) taken per collection call are configurable in 95+, testing if a combination of params exists to stop leakage over the range of expected game sizes and lengths without slowing Spring to a crawl is my main concern now. 1MB every 3 seconds sounds excessive though, does it leak that fast from game start (with all widgets enabled) or does the rate increase over time? |
|
|
I think the rate at which mem is leaked does increase somewhat as the game gets bigger, but the leak is quite big already at the game start (maybe 1Mb every 8 sec). I have noticed something else though - mem is leaked at a noticeably faster rate when I am fully zoomed out and seeing the whole map than when I am fully zoomed in on some small irrelevent piece of ground. When rejoining a game at high speed, mem is also leaked faster. |
|
|
[quote]I have noticed something else though - mem is leaked at a noticeably faster rate when I am fully zoomed out and seeing the whole map than when I am fully zoomed in on some small irrelevent piece of ground[/quote]Hm funny, my crashes were when I was moving camera a lot out of boredom/for fun. Hold shift and quickly scrolling around map or zooming around. But even from the demos I could not reproduce crash, even tried many programs in background (to use RAM) and wild camera movements. Is there a replay that always (or more reliable) crashes? |
|
|
It only affects one of my two laptops but I can crash reliably with all my RAM used up by spectating any BA game that is larger in size than say, 6v6, and lasts longer than say 35 mins. (And its not a bad laptop - i7, 8GB RAM, GT 650M) The trouble with replays is that with 95.0 they seem not to follow the same course as the game itself so I can't test easily (http://springrts.com/mantis/view.php?id=4140). |
|
|
What should I do to test the lua garbage collection settings? |
|
|
1) start with build 119-g70cf9b4 using the default* values 2) run a game on the affected machine, preferably at +20 speed to magnify the leak (since it leaks from the start you won't have to actually play much) 3) monitor what your OS has to say about the Spring process footprint 4) slowly increase one or both of MaxLuaGarbageCollection{Time,Steps} and go back to step 2 until the leak stabilizes (this of course assumes that Lua causes it *and* that the memory being leaked is even collectable, ie. garbage) * MaxLuaGarbageCollectionTime is in milliseconds and defaults to 33, MaxLuaGarbageCollectionSteps defaults to 10000 |
|
|
I haven't tried the test with the garbage collection time settings yet; BUT I found out which setting triggers the memleak. I had set WorkerThreadCount myself (because I wanted cores free for other things) and this triggers the memleak. I know iI said above that I had tested WorkerThreadCount etc, but this was during the period where SL was (unbeknowst to me) forcing spring to use a different config file to my 'proper' one (in /my games/spring) - so I suspect that the check I did above was flawed. Sorry. With WorkerThreadCount = -1, memleak doesn't happen. With WorkerThreadCount = 0, it memleaks. What shall I do now to test? |
|
|
Should be fixed as of 129-g0b1cbb8, can you (or anyone else) confirm that? (and now you know why running spring.exe directly is often the preferred test-route) |
|
|
For me -129 doesn't leak with WorkerThreadCount=0. |
|
|
good enough for me, thanks |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2013-11-13 22:06 |
|
New Issue | |
| 2013-11-13 22:06 |
|
File Added: 20131113_212307_DeltaSiege_Island_8_Way_95.sdf | |
| 2013-11-13 22:12 |
|
Note Added: 0012115 | |
| 2013-11-13 22:25 | Kloot | Note Added: 0012118 | |
| 2013-11-13 22:36 | jK | Note Added: 0012119 | |
| 2013-11-14 12:14 |
|
Note Added: 0012130 | |
| 2013-11-14 12:30 |
|
Note Added: 0012131 | |
| 2013-11-14 16:00 | jK | Note Added: 0012132 | |
| 2013-11-14 16:00 | jK | Note Edited: 0012132 | |
| 2013-11-14 16:00 | jK | Note Edited: 0012132 | |
| 2013-11-14 16:01 | jK | Note Edited: 0012132 | |
| 2013-11-14 16:35 | Jools | Note Added: 0012133 | |
| 2013-11-14 16:36 | Jools | Note Edited: 0012133 | |
| 2013-11-14 16:37 | Jools | Note Edited: 0012133 | |
| 2013-11-14 17:00 | Kloot | Note Added: 0012134 | |
| 2013-11-14 17:44 | Jools | Note Added: 0012135 | |
| 2013-11-14 18:25 | Kloot | Note Added: 0012136 | |
| 2013-11-14 18:27 | Kloot | Note Edited: 0012136 | |
| 2013-11-16 13:36 |
|
Note Added: 0012145 | |
| 2013-11-16 13:50 | Kloot | Note Added: 0012146 | |
| 2013-11-16 14:33 | silentwings | Note Added: 0012147 | |
| 2013-11-16 14:33 | silentwings | File Added: 20131116_125131_The_Road_Final_95.sdf | |
| 2013-11-16 14:33 | silentwings | File Added: infolog.txt | |
| 2013-11-16 14:35 | silentwings | Note Edited: 0012147 | |
| 2013-11-16 14:36 | silentwings | Note Edited: 0012147 | |
| 2013-11-16 14:47 | silentwings | Note Edited: 0012147 | |
| 2013-11-16 16:37 | silentwings | Note Edited: 0012147 | |
| 2013-11-17 00:43 |
|
File Added: crash 2infolog.txt | |
| 2013-11-17 00:43 |
|
File Added: crash 3infolog.txt | |
| 2013-11-17 00:45 |
|
Note Added: 0012148 | |
| 2013-11-17 01:08 | silentwings | Note Added: 0012149 | |
| 2013-11-17 01:21 | silentwings | Note Edited: 0012149 | |
| 2013-11-17 01:24 | silentwings | Note Edited: 0012149 | |
| 2013-11-17 01:46 | Kloot | Note Added: 0012150 | |
| 2013-11-17 01:51 | Kloot | Note Edited: 0012150 | |
| 2013-11-17 02:01 | Kloot | Note Edited: 0012150 | |
| 2013-11-17 02:16 | silentwings | Note Added: 0012151 | |
| 2013-11-17 02:17 | silentwings | File Added: springsettings_B.cfg | |
| 2013-11-17 02:17 | silentwings | File Added: springsettings_A.cfg | |
| 2013-11-17 02:19 | silentwings | Note Edited: 0012151 | |
| 2013-11-17 02:19 | silentwings | Note Edited: 0012151 | |
| 2013-11-17 02:59 | Kloot | Note Added: 0012153 | |
| 2013-11-17 03:02 | Kloot | Note Edited: 0012153 | |
| 2013-11-17 03:03 | Kloot | Note Edited: 0012153 | |
| 2013-11-17 03:05 | silentwings | Note Added: 0012154 | |
| 2013-11-17 03:05 | silentwings | Note Edited: 0012154 | |
| 2013-11-17 03:06 | silentwings | Note Edited: 0012154 | |
| 2013-11-17 03:07 | silentwings | Note Edited: 0012154 | |
| 2013-11-17 03:18 | Kloot | Note Added: 0012156 | |
| 2013-11-17 03:19 | Kloot | Note Edited: 0012156 | |
| 2013-11-17 03:21 | Kloot | Note Edited: 0012156 | |
| 2013-11-17 03:24 | Kloot | Note Edited: 0012156 | |
| 2013-11-17 14:14 | silentwings | Note Added: 0012161 | |
| 2013-11-17 14:15 | silentwings | Note Edited: 0012161 | |
| 2013-11-17 15:01 | Kloot | Note Edited: 0012153 | |
| 2013-11-17 15:03 | Kloot | Note Added: 0012162 | |
| 2013-11-17 15:04 | Kloot | Note Edited: 0012162 | |
| 2013-11-17 15:05 | Kloot | Note Edited: 0012162 | |
| 2013-11-17 15:33 | nixtux | Note Added: 0012163 | |
| 2013-11-17 18:24 | silentwings | Note Added: 0012168 | |
| 2013-11-17 19:10 | nixtux | Note Added: 0012171 | |
| 2013-11-21 01:52 | abma | Target Version | => 96.0 |
| 2013-11-26 15:45 | silentwings | Note Added: 0012212 | |
| 2013-11-26 18:34 | monohouse | File Added: infolog-manoa.txt | |
| 2013-11-26 18:34 | monohouse | File Added: 20131126_193517_Knockout_v2_95.sdf | |
| 2013-11-26 18:37 | monohouse | Note Added: 0012218 | |
| 2013-11-26 18:37 | monohouse | File Added: springsettings-nvidia.cfg | |
| 2013-11-27 22:39 | Kloot | Note Added: 0012228 | |
| 2013-11-27 22:39 | Kloot | Note Edited: 0012228 | |
| 2013-11-27 22:40 | Kloot | Note Edited: 0012228 | |
| 2013-11-28 02:08 | silentwings | Note Added: 0012235 | |
| 2013-11-28 03:33 |
|
Note Added: 0012236 | |
| 2013-11-28 04:22 | silentwings | Note Added: 0012237 | |
| 2013-11-28 04:23 | silentwings | Note Edited: 0012237 | |
| 2013-11-28 04:24 | silentwings | Note Edited: 0012237 | |
| 2013-11-28 16:11 | silentwings | Note Added: 0012253 | |
| 2013-11-28 17:58 | Kloot | Note Added: 0012256 | |
| 2013-11-28 18:04 | Kloot | Note Edited: 0012256 | |
| 2013-11-28 18:09 | Kloot | Note Edited: 0012256 | |
| 2013-11-30 16:05 | silentwings | Note Added: 0012276 | |
| 2013-11-30 16:06 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 16:06 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 16:08 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 16:09 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 16:11 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 16:12 | silentwings | Note Edited: 0012276 | |
| 2013-11-30 20:11 | Kloot | Note Added: 0012278 | |
| 2013-11-30 20:32 | Kloot | Note Edited: 0012278 | |
| 2013-11-30 23:00 | silentwings | Note Added: 0012279 | |
| 2013-11-30 23:00 | silentwings | Note Edited: 0012279 | |
| 2013-11-30 23:52 | Kloot | Note Added: 0012280 | |
| 2013-11-30 23:52 | Kloot | Status | new => resolved |
| 2013-11-30 23:52 | Kloot | Fixed in Version | => 95.0.1+git |
| 2013-11-30 23:52 | Kloot | Resolution | open => fixed |
| 2013-11-30 23:52 | Kloot | Assigned To | => Kloot |