Synced floating point math

Discussion between Spring developers.
Auswaschbar
Spring Developer
Posts: 1254
Joined: 24 Jun 2007, 08:34

Post by Auswaschbar »

Because everyone wants to make Spring faster, I did some tests which led to very interesting results:


1. The math implementation of streflop (sin, sqrt ...) was about 25 times slower than the one from <cmath> in my test
2. The math implementation of streflop is probably used almost everywhere in the engine because of:

Code:

#include "lib/streflop/streflop.h"
using namespace streflop;
in float3.h
3. SyncedFloat3 doesn't use streflop (doesn't this mean it's less synced than the unsynced one? :shock: )

Given this big difference in speed, we should really consider using streflop only where necessary (and not in graphics, rendering, etc.).

PS: I used the following for measuring speed:

Code:

#include <iostream>
#include <SDL/SDL_timer.h>

#define STREFLOP_X87
#include "streflop.h"
#include <cmath>
using namespace streflop;

int main(int argc, char *argv[])
{
	// The results are accumulated into dummy and printed, so each loop
	// has an observable effect.
	float dummy = 0.0f;
	unsigned start = SDL_GetTicks(); // SDL_GetTicks() returns milliseconds

	// Pass 1: sin and sqrt from <cmath>, 40 million iterations.
	for (unsigned i = 0; i < 40000000; ++i)
	{
		float test = 0.0f, test2 = 1.0f;
		test = std::sin(test2);
		test2 = std::sqrt(test);
		float result = test * test2;
		dummy += result;
	}

	unsigned end = SDL_GetTicks();
	std::cout << "Time cmath: " << end - start << " Result: " << dummy << std::endl;

	// Pass 2: the same loop with the streflop versions.
	dummy = 0.0f;
	start = SDL_GetTicks();

	for (unsigned i = 0; i < 40000000; ++i)
	{
		float test = 0.0f, test2 = 1.0f;
		test = streflop::sin(test2);
		test2 = streflop::sqrt(test);
		float result = test * test2;
		dummy += result;
	}

	end = SDL_GetTicks();
	std::cout << "Time streflop: " << end - start << " Result: " << dummy << std::endl;
	
	return 0;
}
And the output is like this:

Code:

Time cmath: 261 Result: 1.67772e+07
Time streflop: 6677 Result: 1.67772e+07
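
One caveat about the loop above: it calls sin and sqrt on the same constant input every iteration, so an optimizing compiler may be able to precompute the <cmath> result, which could exaggerate the gap. The following is only a sketch of a timing helper that tries to avoid this by reading the input through a volatile and varying it per iteration. TimeMathCalls and CmathBody are hypothetical names made up for this illustration; they are not part of Spring or streflop, and std::chrono is swapped in for SDL_GetTicks.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical helper (not from Spring or streflop): runs fn `iterations`
// times on varying inputs and returns the elapsed wall time in milliseconds.
template <typename Fn>
double TimeMathCalls(Fn fn, unsigned iterations, float& sink)
{
	// Reading the seed through a volatile keeps the compiler from
	// treating the inputs as compile-time constants.
	volatile float seed = 1.0f;
	float acc = 0.0f;

	const auto start = std::chrono::steady_clock::now();
	for (unsigned i = 0; i < iterations; ++i) {
		// Inputs stay within [1.0, 2.023], where sin(x) is positive,
		// so the sqrt in the loop body never sees a negative value.
		const float x = seed + static_cast<float>(i & 1023u) * 0.001f;
		acc += fn(x);
	}
	const auto end = std::chrono::steady_clock::now();

	// Hand the accumulated value back so the loop is not dead code.
	sink = acc;
	return std::chrono::duration<double, std::milli>(end - start).count();
}

// Same work as the original loop body, using the <cmath> functions.
inline float CmathBody(float x)
{
	const float s = std::sin(x);
	return s * std::sqrt(s);
}
```

Calling TimeMathCalls(CmathBody, 40000000u, sink) would time the <cmath> pass. A second body written with streflop::sin and streflop::sqrt (assuming streflop.h is available) could be timed the same way for comparison.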
Kloot
Spring Developer
Posts: 1867
Joined: 08 Oct 2006, 16:58

Post by Kloot »

SyncedFloat3 is just a typedef for float3 unless SYNCDEBUG or SYNCCHECK are defined, so for regular builds there is no difference.

edit: but the latter is always defined? hmm
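
For readers who don't know the header, the arrangement described here might look roughly like the sketch below. This is an assumption-laden illustration, not the actual contents of SyncedFloat3.h: the float3 struct is a stand-in, the checked variant is an empty placeholder, and kSyncedFloat3IsAlias is a made-up name used only to show which branch was taken.

```cpp
#include <type_traits>

// Stand-in for Spring's float3; the real class has many more members.
struct float3 { float x, y, z; };

#if defined(SYNCDEBUG) || defined(SYNCCHECK)
// Placeholder for a separate checked type. The real class is not shown
// in this thread, so this empty wrapper is purely hypothetical.
struct SyncedFloat3 : float3 {};
#else
// With neither macro defined, SyncedFloat3 is only another name for float3.
typedef float3 SyncedFloat3;
#endif

// Hypothetical flag: true when SyncedFloat3 and float3 are the same type.
const bool kSyncedFloat3IsAlias = std::is_same<SyncedFloat3, float3>::value;
```

Compiled without either macro, kSyncedFloat3IsAlias is true; defining SYNCDEBUG or SYNCCHECK before this code selects the separate type instead.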
Last edited by Kloot on 09 Dec 2007, 16:23, edited 1 time in total.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Post by Tobi »

There are two tricky things when changing this however:

1) We don't have a good unit test for sync (wrt FPU), and it's probably quite hard to write a good one, given that in my huge sync test a desync between GCC 3.4 and GCC 4.0 only showed up after 7 minutes of game.

2) Simulation and other code aren't as well separated as one would wish. For example there may be helper functions that do floating point stuff, which are used in both synced code AND unsynced code (e.g. float3::length(), float3::normalize(), maybe stuff in CGameHelper)

Also keep in mind that the total benefit won't be that much. I recall from profiles that only streflop::sqrt was somewhere near the top (maybe at the top even), but cos and sin were much further down (maybe 50th place or so), suggesting you'd be wasting time micro-optimizing the 99% most unused code for anything but sqrt.

Implementing a new CQuadField using a faster algorithm, making nano particles unsynced, or using a vertex shader for unsynced projectile rendering (i.e. macro optimization) would probably give you an orders-of-magnitude larger performance increase.

That said, if you (or anyone else) want to change it anyway, I'd say an approach that clearly marks each sin/cos/sqrt as the streflop or std one would be best. So, forbid #include <math.h>, use only #include <cmath> and #include <streflop.h>, and change all sqrt etc. to either std::sqrt or streflop::sqrt, depending on whether it is run in a synced or unsynced context.
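
A rough sketch of what that explicit marking could look like, so that nothing relies on a using-directive to pick the implementation. Since streflop isn't available in a standalone snippet, streflop_stub below is a hypothetical stand-in that just forwards to <cmath>; in the engine it would be the real streflop namespace. Vec3, SyncedLength and UnsyncedLength are also made-up names for illustration only.

```cpp
#include <cmath>

// Hypothetical stand-in for the streflop namespace so this compiles on
// its own. It forwards to <cmath>; the real streflop::sqrt would be the
// deterministic implementation.
namespace streflop_stub {
	inline float sqrt(float x) { return std::sqrt(x); }
}

// Made-up vector type for this sketch, not Spring's float3.
struct Vec3 { float x, y, z; };

// Synced (simulation) context: the call is explicitly qualified with the
// deterministic library's namespace.
inline float SyncedLength(const Vec3& v)
{
	return streflop_stub::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Unsynced (rendering) context: the call is explicitly qualified with std.
inline float UnsyncedLength(const Vec3& v)
{
	return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}
```

With no using namespace in sight, every call site states which implementation it uses, which makes a synced function that accidentally calls the std version easier to spot in review or with a simple text search.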
Last edited by Tobi on 09 Dec 2007, 16:30, edited 1 time in total.
Tobi
Spring Developer
Posts: 4598
Joined: 01 Jun 2005, 11:36

Post by Tobi »

Kloot wrote: SyncedFloat3 is just a typedef for float3 unless SYNCDEBUG or SYNCCHECK are defined, so for regular builds there is no difference.

edit: but the latter is always defined? hmm
Yes.

But it doesn't really matter, since SyncedFloat3.h is currently never included without float3.h (remember that float3.h is already included in StdAfx.h).

It's still a bug though; SyncedFloat3.h should have the streflop lines too.