Notes about performance, 105.0 vs 106.0 vs bar105 engines

raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Post by raaar »

I've done some basic testing to compare performance of 105.0 with the latest official engine release and the latest BAR105 engine release.

MF v1.80
comet catcher redux v2
disable luaui, disable luarules
(use tab hotkey for zoom out top view)
vsync off


--- spring_105.0_windows-64
spawn 5000 aven_magnum a bit north from the center (they take up most of the map)
zoomed out top view (icons) : ~100 fps, ~0.9 speed
zoomed in a bit until the models appear (but with as many models visible as possible) : 4 fps, ~0.3 speed
zoom out top view (icons) then order them to move south : freezes for a second then drops to ~10 fps for a few seconds then to ~90 fps, ~0.15 speed
~3.05 GB ram usage

--- BAR105.105.1.1-784_windows-64
spawn 5000 aven_magnum a bit north from the center (they take up most of the map)
zoomed out top view (icons) : ~100 fps, 1.0 speed
zoomed in a bit until the models appear (but with as many models visible as possible) : 22 fps, 1.0 speed
zoom out top view (icons) then order them to move south : freezes for a second then drops to ~10 fps for a few seconds then to ~90fps, ~0.15 speed
~3.75 GB ram usage

NOTE: a few days ago I tested BAR105.105.1.1-769 and it had about 2/3 as much fps when zoomed out and showing icons. That issue has since been fixed by Ivand, and it now gets about the same fps on my PC.

--- spring_106.0_windows-64
spawn 5000 aven_magnum a bit north from the center (they take up most of the map)
zoomed out top view (icons) : ~160 fps, ~0.9 speed
zoomed in a bit until the models appear (but with as many models visible as possible) : 30 fps, ~0.82 speed
zoom out top view (icons) then order them to move south : freezes for a second then drops to ~10 fps for a few seconds then to ~120fps, ~0.15 speed
~3.35 GB ram usage

NOTE: I wrote "freezes" for when I order the units to move, but it's more like a ~1s delay between issuing the order and the units starting to move.



There are some differences: 106.0 seems faster than BAR105, but it breaks compatibility. The performance differences between BAR105 and 106.0 may be due to feature differences.

A key difference is that I get 5x FPS or more from either 106.0 or BAR105 relative to 105.0 when just looking at a scene with hundreds of units/features, nice!
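
The "5x or more" figure can be checked against the zoomed-in numbers above. A minimal Python sketch of that arithmetic, using only the fps values reported in this post (the dictionary and function names are illustrative, not part of any engine tooling):

```python
# Zoomed-in (models visible) fps reported above for 5000 aven_magnum.
ZOOMED_IN_FPS = {
    "105.0": 4,
    "BAR105-784": 22,
    "106.0": 30,
}


def speedup_vs_baseline(fps_by_engine, baseline="105.0"):
    """Return each engine's fps as a multiple of the baseline engine's fps."""
    base = fps_by_engine[baseline]
    return {name: fps / base for name, fps in fps_by_engine.items()}


if __name__ == "__main__":
    for name, ratio in speedup_vs_baseline(ZOOMED_IN_FPS).items():
        print(f"{name}: {ratio:.1f}x")
```

Note that game speed also differed between these runs (~0.3 on 105.0, 1.0 on BAR105, ~0.82 on 106.0), so the fps ratio alone does not capture the whole difference.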
Last edited by raaar on 19 Jan 2022, 22:45, edited 1 time in total.
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by Beherith »

Did you run it with /luaui disable, out of curiosity?
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

Yes, otherwise I'd be unable to compare with 106.0, as it breaks with my Lua code.
disable luaui, disable luarules
Shruggoth
Posts: 1
Joined: 13 Apr 2021, 04:52

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by Shruggoth »

I ran a similar test.
As an extra note, in 106 I ran /grounddetail 140, as MF auto-sets that on initialization via a widget.

--- spring_105.0_windows-64
zoomed out tab view (icons) : 45 fps
zoomed in until models just filled the screen: 3 fps
zoom out tab view (icons) then order them to move south : 1s freeze, few seconds of ~1fps, then ~10fps

--- spring_bar_.BAR105.105.1.1-769-g56f63fc_windows-64
zoomed out tab view (icons) : 32 fps
zoomed in until models just filled the screen: 15 fps
zoom out tab view (icons) then order them to move south : ~10fps (not jumpy like 105)

--- spring_106.0_windows-64
zoomed out tab view (icons) : 43 fps
zoomed in until models just filled the screen: ~23 fps
zoom out tab view (icons) then order them to move south : ~9 fps (not jumpy like 105)

Similar-looking performance, but BAR105 and 106 lack the hitching on movement that raaar experienced.
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

Another test, checking ram usage

MF v1.80
comet catcher redux v2
disable luaui, disable luarules

--- 105.0
2.75 GB

--- BAR105-769
3.42 GB

--- 106.0
3.06 GB


The new engines seem to consume more RAM than 105.0, at least when the game starts, and BAR105 consumes the most.

EDIT: another player tested and got different results, but still showed more memory usage on the newer engines:
[19:20:07] <Shruggoth> ima check what my RAM does
[19:20:58] <Shruggoth> 2.5GiB on CCR, spring 105
[19:21:21] <Shruggoth> how much RAM do you have, ximes?
[19:21:37] <Shruggoth> ooh
[19:21:39] <Shruggoth> 3.1GiB on 769
[19:21:50] <Shruggoth> so it demands an extra 600MiB for me
[19:22:35] <Shruggoth> Somehow
[19:22:41] <Shruggoth> 106 uses 4.2GiB
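
To put the two sets of readings side by side, here is a small Python sketch that computes how much more memory each newer engine used than 105.0 for each tester. It uses only the numbers quoted in this post; raaar's values are in GB and Shruggoth's in GiB, as reported, so the two testers' figures are not directly comparable with each other. The names are illustrative.

```python
# Start-of-game memory readings from this post, per tester.
# raaar's figures are in GB, Shruggoth's in GiB, as reported.
READINGS = {
    "raaar": {"105.0": 2.75, "BAR105-769": 3.42, "106.0": 3.06},
    "Shruggoth": {"105.0": 2.5, "BAR105-769": 3.1, "106.0": 4.2},
}


def extra_vs_baseline(usage, baseline="105.0"):
    """Return how much more memory each engine used than the baseline."""
    base = usage[baseline]
    return {
        engine: round(value - base, 2)
        for engine, value in usage.items()
        if engine != baseline
    }


if __name__ == "__main__":
    for tester, usage in READINGS.items():
        print(tester, extra_vs_baseline(usage))
```

Both testers see the newer engines using more memory, but they disagree on which of BAR105 and 106.0 uses the most.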
Last edited by raaar on 16 Jan 2022, 21:06, edited 2 times in total.
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by Beherith »

The new engines are x64, which naturally consumes more RAM :)
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

Hmm, but isn't 105.0 x64 as well?

Another thing I failed to note in my first test is how much Spring slowed down the game. I recorded fps, but not game speed.
SeanHeron
Engines Of War Developer
Posts: 614
Joined: 09 Jun 2005, 23:39

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by SeanHeron »

Thanks for doing this and describing your results!

Given what you describe, at least for me it'll probably be things other than performance that I'd mostly be considering (as I've said to some people in private already) if I need to make the choice: for example compatibility issues, as you mentioned, or my guesstimate of support and future development, etc. (though that's obviously a whole different can of worms).

Still, definitely nice to have some semi-objective numbers out there :).
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

Thanks, these are basic tests though. It'd be nice to do proper benchmarking.

I've updated the tests to use "vsync off", show the speed multipliers and ram usage and to use the latest BAR105 release which has a fix for a performance issue with rendering icons.
TarnishedKnight
Posts: 9
Joined: 13 Jun 2022, 17:39

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by TarnishedKnight »

It has been about a year since you ran the tests, raaar. I would be interested to see what your performance tests show about how the engines are doing now.
ivand
Posts: 310
Joined: 27 Jun 2007, 17:05

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by ivand »

I doubt rendering performance has changed that much. BAR105 is probably bottlenecked by the terrain rendering, which no one has touched since the 105 days. This can be looked at, but we are not in the "highest FPS wins" business:

No one (hopefully) runs a game without LuaUI and additional effects. Once these are in place, BAR105 should see a much smaller impact on FPS than its competitors, simply because of the way we architected things: engine model buffers and transformation matrices are always exposed on the GPU, available for use from Lua shaders and basically "free" to use; all UI geometry is uploaded once and reused afterwards.

Just have a look at the number of effects BAR as a game has and the FPS it shows with all these effects in place. They are all made in Lua and cost peanuts to execute. For example, unit headlights https://youtu.be/GEQmm4UhLAg?t=21 rely on choosing the light's position and direction solely inside a shader. No need to run expensive GetUnitPiecePosition/GetUnitPieceDirection/GetUnitPosition/etc. per unit.

Because we stress the PCIe bus by uploading all matrix data to the GPU, we may never reach the "1800 FPS" level of 106.0, but we will always do better once a game grows a good amount of Lua "meat" around it. This assumes, of course, that the game dev puts effort into using the modern GL4 API rather than sticking to the old ways of doing things (which still exist, but perform on the same level as on the 105.0 engine).

As far as sim is concerned, BAR105 should blow anything else out of the water. Multi-threaded pathfinding is new in BAR105, as is multi-threaded collision handling. These two items had the highest computation cost in late-game scenarios.
Teifion
Posts: 22
Joined: 10 Mar 2021, 00:10

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by Teifion »

How representative of engine performance is it to just spawn in a bunch of units and look at them vs running a replay of big battles?
ivand
Posts: 310
Joined: 27 Jun 2007, 17:05

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by ivand »

There's nothing to look at, in the sense that a 106 game doesn't exist (at least nothing I would call a game). And even if it existed, it wouldn't work on BAR105 or 105.
So the least common denominator is to spawn a bunch of units with LuaUI and LuaRules disabled. If their unit scripts are in COB/BOS, they can even move and shoot.

This is best described as baseline performance. But it won't give a single clue about how fast performance is going to deteriorate once gfx & UI widgets/gadgets are added. This is as far as rendering is concerned.

Sim should be better with BAR105 because two performance-critical pieces of code were made multi-threaded.
TarnishedKnight
Posts: 9
Joined: 13 Jun 2022, 17:39

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by TarnishedKnight »

It's just a bit of fun. Relative figures are always nice to see. Who knows? It may show a regression, which would be good to know about.
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

I've done some testing after setting up Spring on a desktop PC nearby:
Ubuntu 22.04
GPU: NVIDIA GeForce GT 1030 (GL 4.3 compatibility)
8 GB RAM
CPU: AMD A10-7800 Radeon R7


Kind of a toaster by today's standards. With 1920x1080 fullscreen and spawning 100 aven_magnum on MF v2.00 on DSD, I get 15-20 fps on all of 105.0, BAR105 1478 and 106.0, even with details set to low and LuaUI disabled.

In MF terms, "low" settings means

Spring.SendCommands("disticon 130")
Spring.SendCommands("water 0")
Spring.SendCommands("shadows 0")
Spring.SetConfigInt("ShadowMapSize",0,false)
Spring.SendCommands("softparticles 0")
Spring.SetConfigInt("MaxParticles",20000,false)
Spring.SetConfigInt("MaxNanoParticles",10000,false)
Spring.SendCommands("grounddetail 100")
Spring.SetConfigInt("GroundDecals",0,false)
Spring.SetConfigInt("GroundScarAlphaFade",0,false)
Spring.SetConfigFloat("snd_airAbsorption",0.0,false)
Spring.SetConfigInt("UseSDLAudio",1,false)
Spring.SetConfigInt("UseEFX",0,false)
Spring.SetConfigInt("DynamicSky",0,false)
Spring.SetConfigInt("GrassDetail",0,false)
Spring.SetConfigInt("3DTrees",0,false)
Spring.SetConfigInt("AdvMapShading",0,false)
Spring.SetConfigInt("AdvUnitShading",0,false)
Spring.SetConfigInt("CompressTextures",1,false)
Spring.SetConfigInt("HighResInfoTexture",0,false)
Spring.SetConfigInt("LuaShaders",1,false)
Spring.SetConfigInt("ROAM",2,false)
Spring.SetConfigInt("MSAALevel",0,false)


It seems the new versions don't do much to help this toaster :/
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

I've had people tell me the new version runs slow on linux.

Ximes (old laptop with 2 CPU cores (4 logical), 4 GB RAM and Gentoo Linux) built BAR105 1478 locally and is able to play, although it sometimes runs out of memory.

He noticed a strange issue:

[.....:32] <ximes> <raaar> so what settings did you use when you got the first one, https://imgur.com/a/WNzSBp3 <<< here I didn't modify any setCoreAffinity setting
[17:17:18] <ximes> then
[17:17:21] <ximes> <ximes> https://imgur.com/a/7933MIy <<< after I set the mask manually with the command "taskset -a -p 0f 14944"
[17:18:26] <ximes> the only settings that may be interfering with it are:
[17:18:26] <ximes> PathingThreadCount = 4
[17:18:27] <ximes> WorkerThreadCount = 4
[17:18:35] <ximes> I would have to test it again to be sure
[17:18:54] <ximes> 1 min
[17:20:42] <ximes> [14:18:27] <ximes> PathingThreadCount = 4
[17:20:42] <ximes> [14:18:27] <ximes> WorkerThreadCount = 4
[17:21:04] <ximes> <<< nope, I get the same problem.... mask set to 0b1000.... all threads running on cpu #3
[17:22:41] <ximes> tbh... nowadays no app sets core affinity without a very strong reason... spring engine may have done it in the past, but it got somehow broken now

Regardless of whether or not he set the "setCoreAffinity" setting, all threads would end up running on one CPU. He said that after changing the CPU affinity to allow the other cores, performance improved a bit (something like +10-50% fps in a sandbox watching 150 units).
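
For readers less used to affinity masks: each bit of the mask stands for one logical CPU, so the hex value "0f" passed to taskset allows CPUs 0 to 3, while 0b1000 (hex "8") allows only CPU #3. A minimal Python sketch of that decoding (the function name is illustrative):

```python
def cpus_from_mask(mask_hex):
    """Decode a hexadecimal affinity mask into the list of allowed CPU indices."""
    mask = int(mask_hex, 16)
    # Bit N set means logical CPU N is allowed.
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


if __name__ == "__main__":
    for m in ("0f", "8", "c", "1"):
        print(m, "->", cpus_from_mask(m))
```

With the masks from the log above, "0f" decodes to CPUs 0, 1, 2 and 3, and "8" decodes to CPU 3 only, which matches "all threads running on cpu #3".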

I couldn't reproduce the issue on the Ubuntu computer I mentioned in the previous post. There I get affinity masks 1, 2, 4 and f on the Spring threads, mostly f.

EDIT: it seems the issue had already been discussed on the BAR Discord and there was an attempted fix on 8 Jan (https://github.com/beyond-all-reason/spring/issues/575)

EDIT: after some testing it seems springlobby is to blame: I get a wider variety of affinity masks across threads when running either the bar105 or the 105 binary directly, and twice got all "8" and one "f" when running from springlobby, but it's inconsistent.
TarnishedKnight
Posts: 9
Joined: 13 Jun 2022, 17:39

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by TarnishedKnight »

If you send me a copy of an infolog where all the threads were on one core, I'll take a look and see if there's anything obvious engine side.
raaar
Metal Factions Developer
Posts: 1095
Joined: 20 Feb 2010, 12:17

Re: Notes about performance, 105.0 vs 106.0 vs bar105 engines

Post by raaar »

I tested starting spring 105 and bar105 1478 from skylobby, from springlobby and directly, 5 times each, and checked thread info using:
ps --pid PID -O tid,lwp,nlwp,%cpu,psr -L
taskset --all-tasks -p PID
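
The same per-thread information can also be collected from a script. The sketch below is an alternative to the two commands above, not what was used for these tests: it assumes a Linux system where /proc/PID/task/TID/status contains a "Cpus_allowed:" line with the hex mask, and the function names are made up for illustration.

```python
import os
from collections import Counter


def parse_cpus_allowed(status_text):
    """Extract the affinity mask (as an int) from the text of a /proc status file."""
    for line in status_text.splitlines():
        # The trailing colon keeps "Cpus_allowed_list:" from matching.
        if line.startswith("Cpus_allowed:"):
            # Wide masks are printed in comma-separated groups.
            return int(line.split(":", 1)[1].strip().replace(",", ""), 16)
    raise ValueError("no Cpus_allowed line found")


def thread_masks(pid):
    """Map each thread id of `pid` to its affinity mask (Linux only)."""
    task_dir = f"/proc/{pid}/task"
    masks = {}
    for tid in os.listdir(task_dir):
        with open(os.path.join(task_dir, tid, "status")) as f:
            masks[int(tid)] = parse_cpus_allowed(f.read())
    return masks


def mask_histogram(masks):
    """Count how many threads have each mask, keyed by hex string."""
    return Counter(format(mask, "x") for mask in masks.values())
```

Calling mask_histogram(thread_masks(PID)) on a running engine process would give a summary in the same form as the results below, for example mostly "f" in the normal case versus almost all "8" in the problem case.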

mf v2.02, lonely oasis v1.1 map

----- skylobby 0.9.26
skylobby itself has mask "f" and its threads get assigned to cpus 0 to 3

- 105.0
affinity masks 1,2,c,f (mostly f) threads assigned to various cores (1 core at 90%)

- bar105 1478
affinity masks 1,2,4,f (mostly f) threads assigned to various cores (1 core at 50%)

----- springlobby 0.274
springlobby's threads have masks 8 and f and are assigned cpus 0 to 3
apparently when i run spring it adds a thread with mask 8

- 105.0
affinity masks 1,2,c,f (mostly f) threads assigned to various cores (1 core at 90%)

- bar105 1478
affinity masks all 8 except f on 3rd, all threads on core 3

----- running the spring binary directly

- 105.0
affinity masks 1,2,c,f threads assigned to various cores

- bar105 1478
affinity masks 1,2,4,f (mostly f) threads assigned to various cores


So indeed it seems that starting bar105 1478 from springlobby has the "all in 1 core" issue but 105.0 doesn't.
infolog in attachment
Attachments
infolog_starting_from_springlobby.txt
(57.67 KiB) Downloaded 144 times