Optimizing Math Functions - Page 2

Optimizing Math Functions

Discuss the source code and development of Spring Engine in general from a technical point of view. Patches go here too.

Moderator: Moderators

Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Optimizing Math Functions

Post by Tobi »

It was done but was unmaintained for a long time even before I joined the community. (ENTER_SYNCED, ENTER_UNSYNCED, ENTER_MIXED, PUSH_CODE_MODE and POP_CODE_MODE macros)

So basically one would have to start from scratch marking code as synced or unsynced (at times I tried to figure out a way to programmatically do that but haven't found an easy way yet.)
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Alright, well, it seems this conversation has gone way above my head :lol:

However, about the earlier comments:

I have however updated the code a bit:

Code: Select all

float bsqrt(float number) {
    const float f = 1.5F;
    float x = number * 0.5F;
    float y  = number;
    int i  = * ( int * ) &y;
    i  = 0x5f375a86 - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
	y  = y * ( f - ( x * y * y ) );
    return number * y;
}
imbaczek wrote:edit: paper states that 0x5f375a86 is a better initial guess, but it doesn't matter _that_ much.
Ah, that is correct. Changed above.
Kloot wrote:Why long instead of int as in Lomont's paper? Even
approximations should always have equal precision
on all hardware (even though in this case it doesn't
matter). ;)
Sorry, actually quake used a long, not sure why tho. fixed above.

On my tests, I believe streflop is 25x slower where Math.h's is about 10x slower.

Here are the percents of error I got [tested against math.h]:

Code: Select all

0 Iterations:
    Sqrt Total Time: 969
    Fast Sqrt Total: 94
    Average Error: 2.3142415708400920770770881063072010874748229980469 percent
    Highest Error: 3.4364936758788369175476873351726680994033813476562 percent at sqrt(977746)
1 Iteration:
    Sqrt Total Time: 968
    Fast Sqrt Total: 141
    Average Error: 0.09285881584973530722404433390693156979978084564209 percent
    Highest Error: 0.17513806048635793821688366733724251389503479003906 percent at sqrt(2701137)
2 Iterations:
    Sqrt Total Time: 968
    Fast Sqrt Total: 188
    Average Error: 0.00017350704573369476243355213895824817882385104894638 percent
    Highest Error: 0.00046997129594361061787413658130674321000697091221809 percent at sqrt(2677679)
3 Iterations:
    Sqrt Total Time: 953
    Fast Sqrt Total: 250
    Average Error: 1.9868740745263981125451856202257516770259826444089e-06 percent
    Highest Error: 1.1917364517739414403051274871092601870259386487305e-05 percent at sqrt(4196813)
I found it very interesting how well 0 iterations performs - very fast, and pretty darn accurate. Maybe it could be used for something like collisions which needs little accuracy. An average error of only 2 percent is a lot lower than I expected.

So I suppose this method is better for n<2 iterations. Also we could use the inverse sqrt too.
Kloot
Spring Developer
Posts: 1867
Joined: 08 Oct 2006, 16:58

Re: Optimizing Math Functions

Post by Kloot »

FWIW, I added this function to the float3 class (inlined so the
stackframe creation/destruction overhead doesn't defeat its
purpose). Since synthetic benchmarks almost never tell the
whole story, anyone wishing to investigate its potential proper
can do so by grep'ping all (significant) instances of Normalize()
by ANormalize() and then setting up some realistic (ie., likely to
arise during normal large-scale games) test scenario. Note that
you will want to ensure your tests are CPU-limited first, don't run
them with dynamic water and shadows enabled. ;)
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Hmm, nice.

You implemented InvSqrt, what about regular sqrt? Also, your initial guess magic number seems weird, why 0x5f3759d5? I would use 0x5f375a86.

Also the number of iterations needed must be considered. 0 iterations could sometimes be useful - it is very fast, and still relatively accurate.

Also, currently there are 2 iters, you could consider only 1 as it is 2x faster, still pretty accurate.
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

OK I see it was updated.

here is a simple patch which shows a fast sqrt using InvSqrt.
Though it doesn't actually implement it anywhere,someone can do that though.

Also, sqrt isn't the only thing that can have fast approximations, there are good ones for sine and cosine as well, and some others. Maybe that is going overboard though, just a heads up.

I will try to test these later...

EDIT: humm I'm not sure which types of file extensions are allowed, ill just put the patch inside code tags...

Code: Select all

Index: float3.h
===================================================================
--- float3.h	(revision 5559)
+++ float3.h	(working copy)
@@ -387,6 +387,16 @@
 		// x = x * (1.5f - xh * (x * x));
 		return x;
 	}
+	
+	/**
+	 * @brief fast approximation of sqrt(x)
+	 * @return approximation of sqrt(x)
+	 *
+	 * Uses InvSqrt and multiplies by x to get sqrt.
+	 */
+	inline float ASqrt(float x) {
+		return InvSqrt(x) * x;
+	}
 
 	/**
 	 * @brief normalizes the vector approximately
Last edited by Jonanin on 05 Mar 2008, 07:15, edited 2 times in total.
User avatar
Peet
Malcontent
Posts: 4384
Joined: 27 Feb 2006, 22:04

Re: Optimizing Math Functions

Post by Peet »

I don't think sin/cos approximations would work out, as they're inherent in things like rotation.
User avatar
ILMTitan
Spring Developer
Posts: 410
Joined: 13 Nov 2004, 08:35

Re: Optimizing Math Functions

Post by ILMTitan »

One thing you need to remember is that sin/cos on a computer are (nearly) always approximations due to the maximum size of a float. If we can increase speed significantly and keep the error below 1%, I think we should go for it.
BoredJoe
Posts: 139
Joined: 03 Mar 2006, 01:37

Re: Optimizing Math Functions

Post by BoredJoe »

Jonanin wrote:Hmm, nice.

You implemented InvSqrt, what about regular sqrt? Also, your initial guess magic number seems weird, why 0x5f3759d5? I would use 0x5f375a86.

Also the number of iterations needed must be considered. 0 iterations could sometimes be useful - it is very fast, and still relatively accurate.

Also, currently there are 2 iters, you could consider only 1 as it is 2x faster, still pretty accurate.
umm...did you change your number from the Quake 3 number to the one imbaczek suggested would be better?

I remember reading about this a long time ago, and essentially, even though there are numbers that are theoretically better initial guesses than the quake 3 guess, they actually perform worse than the Quake 3 number.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Optimizing Math Functions

Post by Tobi »

Both float3::InvSqrt and float3::ASqrt should really be static; it doesn' t make sense to require user of the function to create a float3 object on the stack just to be able to calculate a (inverse) square root of a single float.

Ideally the functions don't belong in float3 at all anyway, since they have nothing to do with float3.
Kloot
Spring Developer
Posts: 1867
Joined: 08 Oct 2006, 16:58

Re: Optimizing Math Functions

Post by Kloot »

Indeed, I only put InvSqrt() there because ANormalize() makes
use of it and there aren't many other places where reciprocal
square roots have to be calculated. Introducing a MathHelper
compilation unit would be the better option though.
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Well, I've tested by simply replacing float3::Normalize with float3::ANormalize, and it seems to work perfect. I don't know if this is necessarily a good test, because I don't know where the approximation would be used if not everywhere.

However I haven't tested for speed improvements. Time profiler in float3 isn't working, however lurker said he got it working once.

What's a good way to test speed then?

EDIT: Not working because TimeProfiler includes float3.h itself.
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: Optimizing Math Functions

Post by imbaczek »

compile with debug=yes profile=yes (if you're using gcc) and check what's using the most cycles. normally sqrt shows up there near the top.

you could also try a replay of a big fight... it won't be in sync, but the framerates should be possible to compare objectively.
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Well, I get about ~18 fps with 40 krog vs 40 krog on delta siege revx3 with fast math, and about ~15 avg fps with regular sqrt.

Can someone test this also and see what they get?

Note this does not include when they all blow up in a chain reaction, I get a brief 1 fps for both then.
User avatar
lurker
Posts: 3842
Joined: 08 Jan 2007, 06:13

Re: Optimizing Math Functions

Post by lurker »

(I told him to compile with profiling repeatedly, and the specific command. I think he's deaf to it.)

Jonanin, I'll find my changes to get timeprofiler working and pastebin them.
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Yes, I heard you, but the time profiling wasn't working because of the reasons I stated above. Compiling 'with time profiling' didn't help at all.
User avatar
lurker
Posts: 3842
Joined: 08 Jan 2007, 06:13

Re: Optimizing Math Functions

Post by lurker »

I believe this is the fourth time I've said this, so I'll use bold text.

Compiling with profiling is not using timeprofiler.h
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Apparently it isn't clear to you, so I'll use bold text.

I know that.
Compiling 'with time profiling' didn't help at all.
Also note that the flag that you gave me was different from what imbaczek said above, you told me to use "use_profiling=yes" or something similiar.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Re: Optimizing Math Functions

Post by Tobi »

Seems it's time for me to drop in the discussion as author of current buildsystem.

How To Profile Spring:

Code: Select all

scons configure profile=yes && scons spring
then test

Code: Select all

gprof spring.exe > profile.txt
To actually compare data the test & environment of course must be identical everytime.

This profiling should not be confused with profile_generate and profile_use flags, which, respectively, generate a profiling binary and a binary that uses the profiling data generated from running the first binary to make better guesses for branch prediction / loop unrolling etc. Hence, these flags are purely for optimization (and didn't particularly help; no measurable speed difference in my test), while the other is purely aid in manual optimizing.
Jonanin
Posts: 107
Joined: 13 Jan 2008, 21:34

Re: Optimizing Math Functions

Post by Jonanin »

Hmm, when I do gprof spring.exe > profile.txt I get:

Code: Select all

gmon.out: no such file or directory
Also, exactly how identical do the tests have to be?

Is it fine to just create units at the exact same location both times?
imbaczek
Posts: 3629
Joined: 22 Aug 2006, 16:19

Re: Optimizing Math Functions

Post by imbaczek »

best way is to play a replay.

you must run gprof after you've ran the game, gmon.out is a file that should get created when you exit spring.
Post Reply

Return to “Engine”