just another oprofile run

jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

just another oprofile run

Post by jK »

I also marked some very interesting time eaters (in total they make ~20%)

Notes:
  • LocalModelPiece::Draw fixing needs advanced opengl knowledge (vbo, ubo, tbo, shaders)
  • CMinimap::DrawUnit is quite easy to fix (it wastes all its time on glBindTexture)
  • CSolidObject::Block is slow because it is managed via many std::maps instead of a single container
  • AllowUnitBuildStep & QueryNanoPiece could likely be merged somehow or even skipped for most units (e.g. by adding a new unitdef tag/UnitScript function for using a fixed NanoPiece)
  • QTPFS was used; the legacy one is twice as slow
Sim was a 15mins ZK 3v3 CAI on crossing4final with `gameplay zoom` (not full map visible) and ~40fps.
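For the CMinimap::DrawUnit note, one common way to cut glBindTexture calls is to sort the icons by texture and bind only when the texture changes. The following is a minimal sketch of that idea, not the engine's code: `IconDraw`, `bindTexture`, `drawNaive` and `drawBatched` are hypothetical names, and `bindTexture` merely counts calls in place of a real glBindTexture.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one minimap icon; not the engine's actual type.
struct IconDraw {
    unsigned textureId; // texture the icon needs bound
    float x, y;         // minimap position (unused in this sketch)
};

static int bindCount = 0;

// Stand-in for a glBindTexture call; it only counts how often it is invoked.
static void bindTexture(unsigned /*textureId*/) { ++bindCount; }

// Naive path: one bind per icon, regardless of what is already bound.
int drawNaive(const std::vector<IconDraw>& icons) {
    bindCount = 0;
    for (const IconDraw& icon : icons) {
        bindTexture(icon.textureId);
        // ... the icon quad would be emitted here ...
    }
    return bindCount;
}

// Batched path: sort a copy by texture, then bind only on texture changes.
int drawBatched(std::vector<IconDraw> icons) {
    bindCount = 0;
    std::stable_sort(icons.begin(), icons.end(),
        [](const IconDraw& a, const IconDraw& b) {
            return a.textureId < b.textureId;
        });
    bool haveBound = false;
    unsigned bound = 0;
    for (const IconDraw& icon : icons) {
        if (!haveBound || icon.textureId != bound) {
            bindTexture(icon.textureId);
            bound = icon.textureId;
            haveBound = true;
        }
        // ... the icon quad would be emitted here ...
    }
    return bindCount;
}
```

With this approach the number of binds depends on the number of distinct icon textures rather than the number of units. Whether sorting changes the visible draw order in an unacceptable way would have to be checked against the real renderer.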
Attachments
profile_.png
(2.83 MiB) Downloaded 5 times
smoth
Posts: 22309
Joined: 13 Jan 2005, 00:46

Re: just another oprofile run

Post by smoth »

nice
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: just another oprofile run

Post by Beherith »

I loooove profiles. Could you run one for BA if I gave you a replay?
Kloot
Spring Developer
Posts: 1867
Joined: 08 Oct 2006, 16:58

Re: just another oprofile run

Post by Kloot »

the subtree labeled QTPFS is actually filled with legacy-PFS functions!
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

Beherith wrote:I loooove profiles. Could you run one for BA if I gave you a replay?
np
Kloot wrote:the subtree labeled QTPFS is actually filled with legacy-PFS functions!
Damn my fault. The mod option tag is case-sensitive ...
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: just another oprofile run

Post by Beherith »

Thanks man, a large dworld FFA replay:
Attachments
20120917_221156_Dworld_V1_91.7z
(1.05 MiB) Downloaded 27 times
Google_Frog
Moderator
Posts: 2464
Joined: 12 Oct 2007, 09:24

Re: just another oprofile run

Post by Google_Frog »

ZK uses AllowUnitBuildStep in a few places and it can't be removed. Also, at a glance there is little optimisation to do Lua-side.

What is the cost of AllowWeaponTarget? It is now used in ZK to completely rewrite weapon target priorities, so it would be nice to see its impact.
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

Beherith wrote:Thanks man, a large dworld FFA replay:
CSolidObject::Block + CMinimap::DrawUnit ~= 15%
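
To illustrate what "a single container" could look like for the blocking data, here is a hedged sketch in which every map cell owns a fixed number of contiguous slots in one flat array, so registering an object touches no tree nodes and allocates nothing. This is not how the engine stores it: `BlockingGrid`, `kSlotsPerCell`, `kEmptySlot` and the per-cell capacity of 4 are all assumptions made for the example, and there is no bounds checking.

```cpp
#include <cstddef>
#include <vector>

constexpr int kSlotsPerCell = 4;  // assumed per-cell capacity (hypothetical)
constexpr int kEmptySlot = -1;    // marks an unused slot

// Hypothetical blocking grid backed by one flat std::vector<int>.
class BlockingGrid {
public:
    BlockingGrid(int width, int height)
        : width_(width),
          slots_(static_cast<std::size_t>(width) * height * kSlotsPerCell,
                 kEmptySlot) {}

    // Registers objectId in cell (x, z). Returns true if the id is stored
    // (or was already stored), false if the cell has no free slot left.
    bool block(int objectId, int x, int z) {
        int* cell = cellSlots(x, z);
        for (int i = 0; i < kSlotsPerCell; ++i) {
            if (cell[i] == objectId) return true;
        }
        for (int i = 0; i < kSlotsPerCell; ++i) {
            if (cell[i] == kEmptySlot) {
                cell[i] = objectId;
                return true;
            }
        }
        return false;
    }

    // Removes objectId from cell (x, z) if it is present.
    void unblock(int objectId, int x, int z) {
        int* cell = cellSlots(x, z);
        for (int i = 0; i < kSlotsPerCell; ++i) {
            if (cell[i] == objectId) cell[i] = kEmptySlot;
        }
    }

    // Number of objects currently registered in cell (x, z).
    int occupants(int x, int z) const {
        const int* cell = cellSlots(x, z);
        int n = 0;
        for (int i = 0; i < kSlotsPerCell; ++i) {
            if (cell[i] != kEmptySlot) ++n;
        }
        return n;
    }

private:
    // Pointer to the first slot of cell (x, z) in the flat array.
    int* cellSlots(int x, int z) {
        return &slots_[(static_cast<std::size_t>(z) * width_ + x) * kSlotsPerCell];
    }
    const int* cellSlots(int x, int z) const {
        return &slots_[(static_cast<std::size_t>(z) * width_ + x) * kSlotsPerCell];
    }

    int width_;
    std::vector<int> slots_;
};
```

The trade-off is a hard cap on objects per cell and memory reserved for empty cells; a real replacement would need an overflow path for crowded cells.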
Attachments
profile.png
(2.51 MiB) Downloaded 3 times
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

Kloot wrote:the subtree labeled QTPFS is actually filled with legacy-PFS functions!
k QTPFS has a (massive) problem with my setup

>80% cpu usage after a few minutes & 100% reproduce rate with the script.txt
Attachments
script_benchmark.txt
(3.1 KiB) Downloaded 34 times
profile.png
(1.32 MiB) Downloaded 3 times
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: just another oprofile run

Post by Beherith »

Thanks jK! It seems we are even more expensive in AllowUnitBuildStep than ZK. BA does use a lot of nanos....
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

QTPFS update

Some things I noticed:
1. QTPFS causes CPU spikes at the beginning (it locks the CPU for >100ms). Maybe it is possible to spread the work across multiple gameframes somehow, e.g. the path iterations could be calculated in stages?
2. Is it possible that stuck units cause massive CPU load?
3. If the 2nd issue is not the cause: later the spikes accumulate into a constant lag, so the total CPU usage needs to be decreased too.

edit:
I recompiled with RELEASE and noticed it runs MUCH smoother there (note, I cannot create profile graphs with such a build). Seems something in DEBUG2 & PROFILE multiplies the load (some massive recursive functions?). Still QTPFS makes 50% of SimLoad, esp. the spikes are still there. So (1) & (2) are still valid.
Attachments
profile.png
(711.97 KiB) Downloaded 3 times
SirMaverick
Posts: 834
Joined: 19 May 2009, 21:10

Re: just another oprofile run

Post by SirMaverick »

BA 7.72, 15 vehicle constructors.
Every time you place a new building there is a small spike after the building is created. This does not happen when using just a single builder.
Attachments
qtpfs.sdf
(26.55 KiB) Downloaded 22 times
Kloot
Spring Developer
Posts: 1867
Joined: 08 Oct 2006, 16:58

Re: just another oprofile run

Post by Kloot »

jK wrote:QTPFS update

Some things I noticed:
1. QTPFS causes CPU spikes at the beginning (it locks the CPU for >100ms). Maybe it is possible to spread the work across multiple gameframes somehow, e.g. the path iterations could be calculated in stages?
2. Is it possible that stuck units cause massive CPU load?
3. If the 2nd issue is not the cause: later the spikes accumulate into a constant lag, so the total CPU usage needs to be decreased too.
1. I considered that, but it's hard to do because each search would need its own context (plus memory) and terrain deformations could invalidate any partial results. In any case the searches themselves are not so expensive, but the extra work being done during each search (updating neighbor references for every explored node) was, which should now be fixed by d51db40a9f0. Might need a new oprofile run though.
2. Unlikely, but units trying to find a path *to* unreachable nodes could cause spikes, and AIs often spam such orders.
edit:
I recompiled with RELEASE and noticed it runs MUCH smoother there (note, I cannot create profile graphs with such a build). Seems something in DEBUG2 & PROFILE multiplies the load (some massive recursive functions?). Still QTPFS makes 50% of SimLoad, esp. the spikes are still there. So (1) & (2) are still valid.
QTPFS has its own special debugging #definitions (DEBUG(2) only enables some trivial asserts), so I assume the code is just very cache-inefficient without compiler optimizations.
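
The "search in stages" idea from point 1 can be sketched as a resumable search that expands at most a fixed number of nodes per call and keeps its state in a per-search context. This is only an illustration under assumptions: a plain breadth-first search on a 4-connected grid stands in for the real QTPFS search, the names `SearchContext`, `beginSearch` and `stepSearch` are hypothetical, and the sketch does nothing about terrain deformations invalidating partial results.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical per-search context: everything needed to pause and resume.
struct SearchContext {
    int width = 0;
    int height = 0;
    int goal = 0;               // goal cell index (z * width + x)
    std::queue<int> open;       // frontier of cells still to expand
    std::vector<int> cameFrom;  // -1 = unvisited, otherwise predecessor index
    bool done = false;          // search finished (found or exhausted)
    bool found = false;         // goal was reached
};

SearchContext beginSearch(int width, int height, int start, int goal) {
    SearchContext ctx;
    ctx.width = width;
    ctx.height = height;
    ctx.goal = goal;
    ctx.cameFrom.assign(static_cast<std::size_t>(width) * height, -1);
    ctx.cameFrom[start] = start;  // the start cell is its own predecessor
    ctx.open.push(start);
    return ctx;
}

// Expands at most nodeBudget cells, then returns so the frame can continue.
// Returns true once the search has finished (goal found or frontier empty).
bool stepSearch(SearchContext& ctx, const std::vector<char>& blocked,
                int nodeBudget) {
    static const int dx[4] = {1, -1, 0, 0};
    static const int dz[4] = {0, 0, 1, -1};
    while (!ctx.done && nodeBudget-- > 0) {
        if (ctx.open.empty()) {
            ctx.done = true;  // nothing left to expand: goal unreachable
            break;
        }
        const int cur = ctx.open.front();
        ctx.open.pop();
        if (cur == ctx.goal) {
            ctx.done = true;
            ctx.found = true;
            break;
        }
        const int cx = cur % ctx.width;
        const int cz = cur / ctx.width;
        for (int i = 0; i < 4; ++i) {
            const int nx = cx + dx[i];
            const int nz = cz + dz[i];
            if (nx < 0 || nz < 0 || nx >= ctx.width || nz >= ctx.height) continue;
            const int next = nz * ctx.width + nx;
            if (blocked[next] || ctx.cameFrom[next] != -1) continue;
            ctx.cameFrom[next] = cur;
            ctx.open.push(next);
        }
    }
    return ctx.done;
}
```

A caller would invoke `stepSearch` once per gameframe with a small budget until it returns true. The context is exactly the per-search memory cost mentioned above, and a search for an unreachable goal still visits every reachable cell, only spread over more frames.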
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: just another oprofile run

Post by Beherith »

jK, what springsettings did you use for the profile run? Some things that I thought were more expensive didn't show up.
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

Code:

AllowAdditionalPlayers = 1
FSAALevel = 8
Fullscreen = 0
GroundDetail = 84
HardwareCursor = 1
LinkIncomingPeakBandwidth = 0
LinkIncomingSustainedBandwidth = 0
LinkOutgoingBandwidth = 0
ReflectiveWater = 4
ShadowMapSize = 4096
Shadows = 1
ShowFPS = 1
ShowPlayerInfo = 0
ShowSpeed = 1
WindowedEdgeMove = 0
Beherith
Posts: 5145
Joined: 26 Oct 2007, 16:21

Re: just another oprofile run

Post by Beherith »

Thanks man, I suspect BuildingGrounddecals are a bigger load than they seem. Though your config and the default
decalLevel = std::max(0, configHandler->GetInt("GroundDecals"));
make it seem decals were off.
jK
Spring Developer
Posts: 2299
Joined: 28 Jun 2007, 07:30

Re: just another oprofile run

Post by jK »

Ah no, they are on, but I disable them from time to time because I am currently writing a shader replacement.