Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 09:52
by Beherith
In this example, I performance-tested 500 corraid tanks from BAR (4 pieces per model).
My setup is an old i5 750 paired with a GTX 970. I am far from GPU-limited in either case.

Without customunitshaders 72 FPS

Image

With customunitshaders this drops to 41 FPS (normal mapping only, nothing in the DrawUnit call (just return), no setting of uniforms or anything, just the extra texture).

Image

Is there anything I am doing wrong to get this massive performance impact from CUS?

Is there a point to using CUS on features such as trees for vertex shading and normal mapping, when they easily number hundreds on screen?

Engine level normal maps are practically free, if an agreed-upon convention for assigning normal maps to models can be implemented. See viewtopic.php?f=12&t=34024


EDIT: So correctly writing down your problems is (partly) a solution to them. In the above test, the DrawUnit call had nothing in it, just return false end.
After removing the DrawUnit = DrawUnit assignment from the material definition itself, the FPS goes back up to 62 while retaining the normal mapping (as that is set on a per-material basis, not per unit).

But doing things that require uniforms to be set on a per-unit basis (i.e. nearly everything one would do with CUS: vertex animation, flashing lights, custom blending colors, etc.) is still very expensive.

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 10:34
by Kloot
Beherith wrote:nothing in Drawunit Call
As I have said before: it doesn't really matter what's *in* DrawUnit, the callin itself is the problem...

One solution would be to use caching, e.g.

Code:

local unitUniformData = {}

function gadget:DrawUnit(unitID)
  glUniform(unitUniformData[unitID].loc, unitUniformData[unitID].val)
  spUnitRenderingSetUnitLuaDraw(unitID, false)
end

function gadget:SomeFunction(unitID)
  -- assume this callin requires updating uniforms
  local unitUniforms = unitUniformData[unitID] or {}
  unitUniformData[unitID] = unitUniforms -- store new entries so DrawUnit can find them
  unitUniforms.loc = ...
  unitUniforms.val = ...
  unitUniforms.frame = unitUniforms.frame or 0

  if ((spGetGameFrame() - unitUniforms.frame) > N) then
    unitUniforms.frame = spGetGameFrame()
    spUnitRenderingSetUnitLuaDraw(unitID, true)
  end
end
with some sensible value of N.

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 10:51
by Beherith
Ok, so just to be clear: if I set a unit health uniform and then call spUnitRenderingSetUnitLuaDraw(unitID, false), will that uniform keep the value I last set for all subsequent frames in which that unit is drawn? E.g., could I just watch all the healths and change them as needed?
This would be great for setting the unitID as a random seed.

What about uniform values that change every frame, but are the same for every unit sharing that material, e.g. gameframe?

One can already do some neat things with unitID seed and gameframe :D

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 10:57
by gajop
A lot of times DrawUnit really just needs to do the update of the time (frame) uniform and use it with the unitID/featureID (which are constant for the unit and could probably be set in the shader beforehand).
It would be really beneficial if such a time uniform were available globally (provided by the engine custom material code) in all shaders, without having to update it via DrawUnit/DrawFeature calls for every object. I don't know enough about shaders to say whether global/shared uniforms are a thing.

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 11:04
by Kloot
Beherith wrote:if I set a unit health uniform, then call spUnitRenderingSetUnitLuaDraw(unitID, false), then that uniform value will be passed as the same as I last set it for all subsequent frames that unit is drawn in?
Yes, the GPU will cache the last value of each uniform declared in a shader.
Beherith wrote:I could just watch all the healths and change as needed?
That's the approach you should always take, uniforms should be considered semi-constants and only updated when absolutely necessary.
Beherith wrote:What about uniform values that change every frame, but are the same for every unit sharing that material, e.g. gameframe?
If the shader is shared, then you only need to set the new value for *one* of the units that uses it.
gajop wrote:It would be really beneficial if such a time uniform would be available globally (provided by the engine custom material code)
That code already exists (for parameters like id/speed/health/time/etc), but was never finished by trepan and I got lazy.
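
A minimal sketch of the "watch the values and only update on change" approach, written so it runs in a plain Lua interpreter. GetUnitHealth and SetUnitLuaDraw here are local stand-ins for Spring.GetUnitHealth and Spring.UnitRendering.SetUnitLuaDraw, and fakeHealth, lastHealth and WatchHealth are hypothetical names. It also assumes the health uniform lives in a shader that is not shared with units whose health differs, since the cached value belongs to the shader rather than to the unit.

```lua
-- Stand-ins for engine calls so the sketch runs outside the engine.
-- In a gadget these roles would be played by Spring.GetUnitHealth and
-- Spring.UnitRendering.SetUnitLuaDraw; the stubs only mimic them.
local fakeHealth = { [1] = 100, [2] = 100 }  -- hypothetical test data
local luaDrawEnabled = {}                     -- which units have DrawUnit enabled

local function GetUnitHealth(unitID) return fakeHealth[unitID] end
local function SetUnitLuaDraw(unitID, enabled) luaDrawEnabled[unitID] = enabled end

local lastHealth = {}  -- last health value flagged for upload, per unit

-- Call periodically for each tracked unit.
-- Returns true when the unit was flagged for a uniform refresh.
local function WatchHealth(unitID)
  local health = GetUnitHealth(unitID)
  if health == lastHealth[unitID] then
    return false                 -- unchanged: leave DrawUnit disabled
  end
  lastHealth[unitID] = health
  SetUnitLuaDraw(unitID, true)   -- DrawUnit would set the uniform once,
                                 -- then disable itself again
  return true
end
```

The first call for a unit always flags it, because nothing has been uploaded yet; later calls only flag it when the health actually differs.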

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 11:26
by gajop
Kloot wrote:That's the approach you should always take, uniforms should be considered semi-constants and only updated when absolutely necessary.
Indeed. It's something I learned recently when doing the sun lighting thing in the engine. Changed my perspective on shaders quite a bit.
Kloot wrote:If the shader is shared, then you only need to set the new value for *one* of the units that uses it.
Is it possible to share them efficiently if you have different unitID constants for each unit? Wouldn't you have to change the ID for each unit? Same for hp, and other instance data.
Kloot wrote:That code already exists (for parameters like id/speed/health/time/etc), but was never finished by trepan and I got lazy.
Cool. That's the next thing on my wishlist then. It's probably required for efficient & good looking time-based animations (doing it every Nth frame probably gets choppy soon after N>5).

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 11:29
by Beherith
Kloot wrote:
Beherith wrote:I could just watch all the healths and change as needed?
That's the approach you should always take, uniforms should be considered semi-constants and only updated when absolutely necessary.
Beherith wrote:What about uniform values that change every frame, but are the same for every unit sharing that material, e.g. gameframe?
If the shader is shared, then you only need to set the new value for *one* of the units that uses it.
So the above two use cases are mutually exclusive: because all units share the same material (shader), if I update health for one of them, it will change for all of them. Should each unit have a separate material?


Am I interpreting the above correctly?

One material for all units
1. frameLoc: cached, great
2. healthLoc: must be changed for each unit
3. unitIDLoc: must be changed for each unit
Separate material for each unitID
1. frameLoc: must be changed for each unit
2. healthLoc: cached, great
3. unitIDLoc: cached, great

Also, is there anything I can do about the remaining 20% FPS drop from 74 to 62, with all DrawUnit calls disabled?

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 12:22
by Kloot
Now you know why clever state management is so important in modern engines.
  • If you have a shared material, but N>0 per-object parameters (health, speed, ...), you need Draw{Unit,Feature} for all objects sharing it.
  • If you have a per-object material, but only common parameters (simframe, sundir, ...), you should use a shared material.
In both cases, you can and should still exploit the fact that simulation data will not change between *sim*frames and usually even less frequently.

Fortunately, it shouldn't take much to allow materials to register uniforms they want to have automatically updated by the engine (like health) so Draw{Unit,Feature} can be bypassed entirely.
Beherith wrote:Am I interpreting the above correctly?

One material for all units
1. frameLoc: cached, great
2. healthLoc: must be changed for each unit
3. unitIDLoc: must be changed for each unit
Separate material for each unitID
1. frameLoc: must be changed for each unit
2. healthLoc: cached, great
3. unitIDLoc: cached, great
Yup.
Beherith wrote:Should each unit have separate material?
Minimize dependence on Draw{Unit,Feature} AMAP, but also group objects where you can to avoid material sorting/switching costs. Finding the right tradeoff might be painful, but TNSTAAFL.
Beherith wrote: Also, is there anything I can do about the remaining 20% fps drop from 74 to 62?
Assuming you have optimized your shaders, probably not.
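
The tradeoff between the two layouts can be put into rough numbers. The sketch below only counts uniform updates per frame. It assumes common parameters (such as the sim frame) change every frame and per-object parameters (such as unitID) are constant, and it ignores both the cost of the Draw{Unit,Feature} callin itself and the material switching cost mentioned above. UpdatesPerFrame is a hypothetical helper, not an engine function.

```lua
-- numUnits:     objects drawn in the pass
-- numCommon:    parameters with the same value for every object (e.g. simframe)
-- numPerObject: parameters that differ between objects (e.g. unitID, health)
local function UpdatesPerFrame(numUnits, numCommon, numPerObject)
  return {
    -- One shared shader: common values are set once, but every per-object
    -- value must be re-set before each object is drawn.
    shared = numCommon + numUnits * numPerObject,
    -- One shader per object: constant per-object values stay cached, but
    -- each common value has to be pushed into every shader.
    perObject = numUnits * numCommon,
  }
end

local counts = UpdatesPerFrame(500, 1, 2)
print(counts.shared, counts.perObject)
```

For 500 units with one common and two per-object parameters this gives 1001 updates for the shared layout and 500 for the per-object layout, which is why neither layout wins in general and the balance depends on how many parameters of each kind a material needs.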

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 13:34
by Beherith
Thank you for the concise explanation, Kloot!

The engine-level id/speed/health/time uniforms would be quite the treat.

Also, if I wish to update, say the gameframe once per material, is putting a function in the material's predl field a sensible solution?

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 13:58
by Kloot
No; your function would be called only once (and its return value cached) when the list gets compiled.

Re: Even the most basic custom unit shaders are very expensive

Posted: 08 Mar 2016, 14:25
by gajop
Beherith wrote:Thank you for the concise explanation, Kloot!
+1
Beherith wrote:Also, if I wish to update, say the gameframe once per material, is putting a function in the material's predl field a sensible solution?
We can just extend it with some other callins. It already supports UnitCreated and UnitDestroyed along with DrawUnit, so I don't see a problem with adding GameFrame (and all other callins available for unsynced gadgets, for that matter).

WRT per-object materials, is there any limit to how many we can have? What is the memory/performance cost of different materials? Wouldn't rendering them cause shaders to switch, which might be slow?

Re: Even the most basic custom unit shaders are very expensive

Posted: 09 Mar 2016, 12:03
by Beherith
Gajop, I don't think there is a hard limit on the number of per-object materials you can have, but I'm certain that switching materials costs more than changing a uniform.

Also, about that 20% perf loss: I did some profiling on engine version 101.0-63 (it's what I had on hand). I gave 500 corraid on BAR and toggled CUS.

It seems that the cost comes from the individual setting of teamcolor uniforms in

Code:

<LuaObjectDrawer.cpp> m->SetUniformData(LuaObjectUniforms::UNIFORM_TCOLOR, std::move(IUnitDrawer...
C:/spring101_mine/spring-101.0.1-63-g5ebf047/sauce/spring/rts/Rendering/LuaObjectDrawer.cpp:189
Is this even fixable?

If anyone is interested, I will gladly post the full results of the profiler.

EDIT: Kloot is my hero! https://github.com/spring/spring/commit ... f6613cc5ad

Re: Even the most basic custom unit shaders are very expensive

Posted: 09 Mar 2016, 13:44
by Kloot
Can you retest with the https://springrts.com/dl/buildbot/defau ... -g9ea3b9b/ build?

Note: this changes "/give 500 corraid" from worst to best case (500 to 1 updates per pass), but in real games the ordering of objects will be less favorable, so you are unlikely to see the same gain.
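
Why draw order matters can be illustrated with a small counting sketch. The linked commit is not reproduced here; the code only assumes that an upload is skipped whenever the previous object in the pass used the same team, which is one way to get from 500 updates to 1. CountTeamColorUpdates and the two draw orders are hypothetical.

```lua
-- drawOrder: array of team IDs in the order the objects are drawn.
-- Returns the number of team-color uploads needed for the pass.
local function CountTeamColorUpdates(drawOrder)
  local updates = 0
  local lastTeam = nil
  for _, team in ipairs(drawOrder) do
    if team ~= lastTeam then
      updates = updates + 1  -- value differs from what was last set: upload
      lastTeam = team
    end
  end
  return updates
end

-- 500 units on a single team, as spawned by "/give 500 corraid".
local sameTeam = {}
for i = 1, 500 do sameTeam[i] = 0 end

-- 500 units alternating between two teams: every object forces an upload.
local alternating = {}
for i = 1, 500 do alternating[i] = i % 2 end

print(CountTeamColorUpdates(sameTeam))     -- 1
print(CountTeamColorUpdates(alternating))  -- 500
```

Real games fall somewhere between these two extremes, depending on how objects of different teams end up interleaved in the draw order.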

Re: Even the most basic custom unit shaders are very expensive

Posted: 09 Mar 2016, 20:54
by Beherith
Yes, that was the exact cause of the overhead. CUS and non-CUS are now identical in performance in this edge test case. Thank you very much!

Re: Even the most basic custom unit shaders are very expensive

Posted: 15 Mar 2016, 03:54
by Beherith
Kloot, thank you very much for all of your help with CUS, it wouldn't have been remotely possible without you.

I applied the last patch you attached to my mantis ticket https://springrts.com/mantis/view.php?id=5158, but didn't want to reopen the ticket again (do note if that should have been my course of action).

If there is any change in the value of vec4 vertex in the pixel shader, the fragment is fully shadowed. This is only visible after the game starts, and the FrameLoc starts counting.

It can also be reproduced by pausing the game, then /luarules reload. The trees have correct shadows, but when you unpause, they become fully self-shadowed.

Also, is there any specific way you think we should implement per-material frameLoc updating?

Before unpausing:
Image

After unpausing:

Image

Re: Even the most basic custom unit shaders are very expensive

Posted: 15 Mar 2016, 12:02
by Kloot
frameLoc can (and will soon) be an engine-side uniform.

Did you look into setting up a shadow material? Self-shadowing is easy to break when moving vertices.

Alternatively you could use the unperturbed vertex positions to generate shadowtex coordinates.

Re: Even the most basic custom unit shaders are very expensive

Posted: 15 Mar 2016, 13:30
by Beherith
I tried using the unperturbed vertex coords to generate the shadow coords, but that had the same result. Note that while the game is paused, the frameLoc is already nonzero; the breakage happens only when I unpause the game.

I have not yet looked into adding a shadow material.

Re: Even the most basic custom unit shaders are very expensive

Posted: 16 Mar 2016, 03:15
by Kloot
As it happens this was both a framework and an engine issue (obvious in hindsight).

Engine side is now fixed, BAR's framework just needs to have the attached patch applied.

HF testing.

edit #1: ignore the Spring.Echo diffs, wanted to kill infolog spam and forgot to revert them.

edit #2: plus a fix for the arm_tanks material

Code:

Index: ModelMaterials/2_arm_tanks.lua
===================================================================
--- ModelMaterials/2_arm_tanks.lua	(revision 5212)
+++ ModelMaterials/2_arm_tanks.lua	(working copy)
@@ -12,9 +12,15 @@
 
 local GADGET_DIR = "LuaRules/Configs/"
 
+local etcLoc = -2
+
 local function DrawUnit(unitID, material,drawMode)
 	-- Spring.Echo('Arm Tanks drawmode',drawMode) 
 	if (drawMode ==1)then -- we can skip setting the uniforms as they only affect fragment color, not fragment alpha or vertex positions, so they dont have an effect on shadows, and drawmode 2 is shadows, 1 is normal mode.
+		if (etcLoc == -2) then
+			etcLoc = gl.GetUniformLocation(material.standardShader, "etcLoc")
+		end
+
 		--Spring.Echo('drawing',UnitDefs[Spring.GetUnitDefID(unitID)].name,GetGameFrame())
 		local  health,maxhealth=GetUnitHealth(unitID)
 		health= 2*maximum(0, (-2*health)/(maxhealth)+1) --inverse of health, 0 if health is 100%-50%, goes to 1 by 0 health
@@ -21,7 +27,7 @@
 		local _ , _ , _ , speed = Spring.GetUnitVelocity(unitID)
 		if speed >0.01 then speed =1 end
 		local offset= (((GetGameFrame())%9) * (2.0/4096.0))*speed 
-		glUniform(material.etcLoc, 2* maximum(0,sine((unitID%10)+GetGameFrame()/((unitID%7)+6))), health,offset) --etcloc.z is the track offset pos.
+		glUniform(etcLoc, 2* maximum(0,sine((unitID%10)+GetGameFrame()/((unitID%7)+6))), health,offset) --etcloc.z is the track offset pos.
 
 	end
   --// engine should still draw it (we just set the uniforms for the shader)

Re: Even the most basic custom unit shaders are very expensive

Posted: 16 Mar 2016, 09:50
by hokomoko
It currently spews countless errors for arm tanks since etcLoc isn't set.

Re: Even the most basic custom unit shaders are very expensive

Posted: 16 Mar 2016, 11:27
by Beherith
Thank you Kloot, I've applied your patch! I even glimpsed some frameLoc coming soon :D Super excited!