Building with Profile Guided Optimization (GCC 4.4)

YokoZar
Posts: 883
Joined: 15 Jul 2007, 22:02

Building with Profile Guided Optimization (GCC 4.4)

Post by YokoZar »

Profile Guided Optimization (PGO) can give substantial speedups in some games, particularly CPU-bound ones like Spring. It's a newish feature as of GCC 4.3, and GCC 4.4 adds support for SMP and profiling directories.


Usage is fairly straightforward: add the -fprofile-generate flag when compiling, then run the program for a while in "real world" usage (a replay should be fine). During this run the program will be slow, but profiling data (.gcda files) will be written alongside the object files.

Once a profile is generated, recompile with the -fprofile-use flag. And then you're done - optimized Spring.


Has anyone tried this yet? I couldn't quite figure out how to easily hack in extra flags to Spring's build process yet, or for that matter which build system to use. I'd like to make it a part of the automated build process, especially for user-downloaded packages. In theory it's just a few extra lines in the package build scripts: compile with -fprofile-generate, store profile in a readable temp folder by adding -fprofile-dir, run a replay, then rebuild and publish.
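For what it's worth, the steps above could be scripted roughly like this. It is only a sketch: the source file, binary name, profile directory and replay command are made-up placeholders, not Spring's real build layout. The only things taken from GCC itself are the three flags (-fprofile-generate, -fprofile-use and -fprofile-dir).

```python
import shlex


def pgo_commands(sources, output, profile_dir, replay_cmd, cxx="g++"):
    """Return the three command lines of a PGO build, in order.

    Every argument is a placeholder chosen for illustration; nothing here
    comes from Spring's actual build scripts.
    """
    common = [cxx, "-O2", f"-fprofile-dir={profile_dir}"]
    # Pass 1: instrumented build that records counters while it runs.
    generate = common + ["-fprofile-generate", *sources, "-o", output]
    # Training run: e.g. play back a replay with the instrumented binary.
    train = list(replay_cmd)
    # Pass 2: rebuild, letting GCC read the recorded profile.
    use = common + ["-fprofile-use", *sources, "-o", output]
    return [generate, train, use]


if __name__ == "__main__":
    # Hypothetical file names, purely for illustration.
    for cmd in pgo_commands(["game.cpp"], "spring-pgo", "/tmp/spring-profile",
                            ["./spring-pgo", "replay.sdf"]):
        print(shlex.join(cmd))
```

A package build script would run the three commands in order and publish the binary from the last one. Both compile passes need the same sources and the same profile directory, otherwise the second pass cannot find the data.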
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by imbaczek »

it's been done, scons has support for it. results weren't impressive (1-3% improvement) iirc, but it hasn't been tried on gcc 4.4.
YokoZar
Posts: 883
Joined: 15 Jul 2007, 22:02

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by YokoZar »

imbaczek wrote:it's been done, scons has support for it. results weren't impressive (1-3% improvement) iirc, but it hasn't been tried on gcc 4.4.
This 1-3% improvement isn't much when the game is "normal", but when your cpu gets maxed and there's a flood of units on the screen I bet it can translate into a significant framerate boost (or at least a bit longer before the game is too much to handle).

Still, it's a substantial chunk (akin to SSE2->SSE3), and it's basically for free if we just alter the release process a bit.


That said, could someone tell me how to do this in scons? Spring's scons build is a bit strange to me.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by Tobi »

YokoZar wrote:and it's basically for free if we just alter the release process a bit.
Now try to get a headless build server to run through some replay files automatically in the release process ;-)

Not sure where bibim does his builds tho; if BuildServ is actually running on an always on desktop this may be possible.

Also hoijui's work on a full headless client might make this possible for the sim code even on a headless server, although I doubt -fprofile-use can use a profile generated from a totally different -fprofile-generate build. (e.g. generate profile with headless client, use in normal client.)

That may need proper separation of Sim code into a separately compiled library, which is identical between headless and normal clients.
Auswaschbar
Spring Developer
Posts: 1254
Joined: 24 Jun 2007, 08:34

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by Auswaschbar »

To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.

Also, what will you take as replay? BA, because everyone plays it? S44, because it is legal to install? SpeedMetal? DeltaSiege 8v8?

Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by Tobi »

Auswaschbar wrote: To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.
Truth.
Auswaschbar wrote: Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
I think it can be assumed optimizing the simulation code for speed using profile guided optimization (and not touching the rendering code at all) does not reduce the performance of rendering code.
AF
AI Developer
Posts: 20687
Joined: 14 Sep 2004, 11:32

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by AF »

Would it be safe to say that simulation is the bottleneck at the moment, and that optimizing simulation can only speed up rendering?
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by imbaczek »

different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
Auswaschbar
Spring Developer
Posts: 1254
Joined: 24 Jun 2007, 08:34

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by Auswaschbar »

imbaczek wrote:different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
That's similar to what I wanted to say about rendering.
YokoZar
Posts: 883
Joined: 15 Jul 2007, 22:02

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by YokoZar »

imbaczek wrote:different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
While this is certainly true, I'm not sure if it's going to matter much in terms of the optimization data generated. Most of the benefit from profile guided optimization can come from even a small amount of data - using any mod, for instance, the compiler would quickly learn that the "are all units dead" check for the end of the game is almost always false.
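To make that concrete, here is a toy sketch of the kind of branch counter a profiling run collects. It is not GCC's actual instrumentation, and the unit counts are made up; it only shows how lopsided such a check is over a whole game.

```python
def branch_profile(alive_units_per_frame):
    """Count how often a hypothetical 'all units dead' check is taken."""
    taken = 0
    not_taken = 0
    for alive in alive_units_per_frame:
        if alive == 0:  # the end-of-game check
            taken += 1
        else:
            not_taken += 1
    return taken, not_taken


if __name__ == "__main__":
    # Made-up replay: 9999 frames with units still alive, then one final frame.
    frames = [50] * 9999 + [0]
    taken, not_taken = branch_profile(frames)
    print(f"taken {taken}, not taken {not_taken}")
```

Cutting the replay down to its first few hundred frames gives the same picture (the check is essentially never taken), which is why I suspect a short profile already captures most of the benefit.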
Auswaschbar wrote: To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.
Yes, if the buildserv is slow this could add as long as half a day to the release process. That's probably not good for daily builds, but when we make a real release it could be worth it.
Auswaschbar wrote: Also, what will you take as replay? BA, because everyone plays it? S44, because it is legal to install? SpeedMetal? DeltaSiege 8v8?
I don't think this actually matters much.
Auswaschbar wrote: Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
If there's no profiling data for a particular piece of the code, then GCC will fall back to its old heuristic optimizations.
Manoa
Posts: 79
Joined: 19 May 2008, 18:51

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by Manoa »

I have successfully compiled and executed 0.79.1.2 in profiling mode, but because I was using GCC 4.2.4, it had some profiling bugs and raised errors where there should have been warnings during the feedback build.

Comparing compilers, I found 4.2.4 faster than 4.4.0. I am currently compiling 4.3.3 to see whether the PGO bugs are fixed there and whether it is still as fast as 4.2.4.

testing complete, results can be found here:
http://anonym.to?http://gcc.gnu.org/bug ... i?id=35671
It seems I am not the only one to have found this. The really bad news is that PGO is nowhere near as good as expected: in my tests it actually degraded performance. Obviously synthetic benchmarks are small pieces of code, but they are an indication of the quality of PGO within GCC, and so far there is not much to look at. Considering these results, I find the 1-3% improvement mentioned above surprising.

for a more comprehensive analysis of the various compiler versions: http://manoa.flnet.org/linux/compilers.html
Last edited by Manoa on 31 Jul 2009, 02:02, edited 3 times in total.
aegis
Posts: 2456
Joined: 11 Jul 2007, 17:47

Re: Building with Profile Guided Optimization (GCC 4.4)

Post by aegis »

I have four dual 3 GHz Xeon servers with 4 GB of RAM just sitting and not doing anything... if I had a way to boot them, I could donate them to the spring build process.