Profile-guided optimization (PGO) can yield substantial speedups in some games, particularly CPU-bound ones like Spring. It's a newish feature as of GCC 4.3; GCC 4.4 adds support for SMP and profiling directories.
Usage is fairly straightforward: add the -fprofile-generate flag when compiling, then run the program for a while in "real world" usage (a replay should be fine). During this time the program will run slowly, and the profiling data will be written alongside the object files.
Once a profile is generated, recompile with the -fprofile-use flag. And then you're done - optimized Spring.
Has anyone tried this yet? I couldn't quite figure out how to easily hack in extra flags to Spring's build process yet, or for that matter which build system to use. I'd like to make it a part of the automated build process, especially for user-downloaded packages. In theory it's just a few extra lines in the package build scripts: compile with -fprofile-generate, store profile in a readable temp folder by adding -fprofile-dir, run a replay, then rebuild and publish.
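For what it's worth, here is a rough sketch of those steps as a small Python helper that only assembles the command lines; it does not run anything. The flags -fprofile-generate, -fprofile-use and -fprofile-dir are real GCC options. Everything else is an assumption for illustration: the function name `pgo_build_steps`, the single `main.cpp` source, the `/tmp/spring-pgo` folder and the `./spring demo.sdf` replay invocation are hypothetical, and a real Spring build would go through its build system rather than calling g++ directly.

```python
import shlex


def pgo_build_steps(sources, output, profile_dir, replay_cmd, base_flags=("-O2",)):
    """Return the commands for a two-pass PGO build, in the order to run them.

    Hypothetical helper: it only builds argument lists and executes nothing.
    """
    # Both passes point at the same profile directory (GCC 4.4+ option).
    common = ["g++", *base_flags, f"-fprofile-dir={profile_dir}"]
    instrumented = common + ["-fprofile-generate", *sources, "-o", output]
    optimized = common + ["-fprofile-use", *sources, "-o", output]
    return [
        instrumented,      # pass 1: instrumented (slow) binary
        list(replay_cmd),  # training run: playing a replay writes the profile
        optimized,         # pass 2: rebuild using the collected profile
    ]


# Assumed file names and replay command, for illustration only.
steps = pgo_build_steps(
    sources=["main.cpp"],
    output="spring",
    profile_dir="/tmp/spring-pgo",
    replay_cmd=["./spring", "demo.sdf"],
)
for step in steps:
    print(shlex.join(step))
```

Returning the commands as lists keeps the sketch independent of any build system; a package script could pass each list to `subprocess.run` or translate it into the equivalent build-system options.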
Building with Profile Guided Optimization (GCC 4.4)
Re: Building with Profile Guided Optimization (GCC 4.4)
it's been done, scons has support for it. results weren't impressive (1-3% improvement) iirc, but it hasn't been tried on gcc 4.4.
Re: Building with Profile Guided Optimization (GCC 4.4)
imbaczek wrote: it's been done, scons has support for it. results weren't impressive (1-3% improvement) iirc, but it hasn't been tried on gcc 4.4.
This 1-3% improvement isn't much when the game is "normal", but when your cpu gets maxed and there's a flood of units on the screen I bet it can translate into a significant framerate boost (or at least a bit longer before the game is too much to handle).
Still, it's a substantial chunk (akin to SSE2->SSE3), and it's basically for free if we just alter the release process a bit.
That said, could someone tell me how to do this in scons? Spring's scons build is a bit strange to me.
Re: Building with Profile Guided Optimization (GCC 4.4)
YokoZar wrote: and it's basically for free if we just alter the release process a bit.
Now try to get a headless build server to run through some replay files automatically in the release process.
Not sure where bibim does his builds tho; if BuildServ is actually running on an always on desktop this may be possible.
Also hoijui's work on a full headless client might make this possible for the sim code even on a headless server, although I doubt -fprofile-use can use a profile generated from a totally different -fprofile-generate build. (e.g. generate profile with headless client, use in normal client.)
That may need proper separation of Sim code into a separately compiled library, which is identical between headless and normal clients.
Re: Building with Profile Guided Optimization (GCC 4.4)
To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.
Also, what will you take as replay? BA, because everyone plays it? S44, because it is legal to install? SpeedMetal? DeltaSiege 8v8?
Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
Re: Building with Profile Guided Optimization (GCC 4.4)
Auswaschbar wrote: To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.
Truth.
Auswaschbar wrote: Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
I think it can be assumed optimizing the simulation code for speed using profile guided optimization (and not touching the rendering code at all) does not reduce the performance of rendering code.
Re: Building with Profile Guided Optimization (GCC 4.4)
Would it be safe to say simulation is the bottleneck at the moment, and optimizing simulation can only speed up rendering?
Re: Building with Profile Guided Optimization (GCC 4.4)
different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
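A toy model of that effect, as a sketch only: assume the compiler makes whichever side of a branch the training profile saw more often the cheap path, and the other side pays a penalty. The function name `branch_cost` and the costs 1 and 3 are made up for illustration; they are not measurements of Spring or of GCC.

```python
def branch_cost(profile_taken_ratio, workload_taken_ratio, hit=1, miss=3):
    """Average cost of one branch under a toy PGO model.

    The side of the branch the training profile saw more often is laid out
    as the cheap path (cost `hit`); the other side costs `miss`.
    Both costs are invented numbers, not measurements.
    """
    favour_taken = profile_taken_ratio >= 0.5
    if favour_taken:
        cheap_fraction = workload_taken_ratio
    else:
        cheap_fraction = 1 - workload_taken_ratio
    return cheap_fraction * hit + (1 - cheap_fraction) * miss


# Hypothetical mods: mod A takes the branch 90% of the time, mod B only 20%.
print("mod A, profiled on A:", branch_cost(0.9, 0.9))
print("mod B, profiled on A:", branch_cost(0.9, 0.2))
print("mod B, profiled on B:", branch_cost(0.2, 0.2))
```

In this model, mod B running on a build trained with mod A pays more per branch than it would on a build trained with its own replay, which is the mismatch described above.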
Re: Building with Profile Guided Optimization (GCC 4.4)
imbaczek wrote: different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
That's similar to what I wanted to say about rendering.
Re: Building with Profile Guided Optimization (GCC 4.4)
imbaczek wrote: different mods exercise different code paths. PGO can make one mod faster and another one slower, because it wasn't profiled with it. this happens when you try PGO on a language interpreter (e.g. python), use a benchmark for generating a profile and run a different benchmark with the optimized build. it's not as easy as it sounds.
While this is certainly true, I'm not sure if it's going to matter much in terms of the optimization data generated. Most of the benefit from profile guided optimization can come from even a small amount of data - using any mod, for instance, the compiler would quickly learn that the "are all units dead" check for the end of the game is almost always false.
Auswaschbar wrote: To be representative, a replay must be looong. And since buildserv doesn't have unlimited CPU resources, it will take even loooooonger to build a release.
Yes, if the buildserv is slow this could add as long as half a day to the release process. That's probably not good for daily builds, but when we make a real release it could be worth it.
Auswaschbar wrote: Also, what will you take as replay? BA, because everyone plays it? S44, because it is legal to install? SpeedMetal? DeltaSiege 8v8?
I don't think this actually matters much.
Auswaschbar wrote: Btw, I think optimising spring for running without graphics might be a bad idea, because it could run slower with graphics enabled then.
If there's no profiling data for a particular piece of the code, then GCC will fall back to its old heuristic optimizations.
Re: Building with Profile Guided Optimization (GCC 4.4)
I have successfully compiled and executed 0.79.1.2 in profiling mode, but because I was using GCC 4.2.4, it had some profiling bugs and raised errors where there should have been warnings during the feedback build.
out of testing compilers, comparing 4.2.4 to 4.4.0, I found 4.2.4 faster, currently compiling 4.3.3 to see if they fixed the pgo bugs and maybe it's still as fast as 4.2.4.
testing complete, results can be found here:
http://anonym.to?http://gcc.gnu.org/bug ... i?id=35671
It seems I am not the only one who found out about this. The real bad news is that PGO is nowhere near as good as it is expected to be: in my tests it actually degraded performance. Obviously synthetic benchmarks are small pieces of code, but they are an indication of the quality of PGO within GCC, and so far it looks like there is not much to look at. Considering these results, I find the 1-3% improvement mentioned earlier surprising.
for a more comprehensive analysis of the various compiler versions: http://manoa.flnet.org/linux/compilers.html
Last edited by Manoa on 31 Jul 2009, 02:02, edited 3 times in total.
Re: Building with Profile Guided Optimization (GCC 4.4)
I have four dual 3ghz xeon servers with 4gb of ram just sitting and not doing anything... if I had a way to boot them, I could donate them to the spring build process.