Hashing

AF · Post by AF » 10 Dec 2006, 03:09

How do UnitSync and Spring generate hashes of mods and maps? What type of hash are they? Base64-encoded MD5s? Are .sdd archives hashed at all? Is the hashing iterative for archives depending on other archives? Or does it cover just the initial archive referenced? How do you generate hashes for maps that are not in archives but in the main Spring directory?

Licho · Post by **Licho** » 10 Dec 2006, 04:29

MD5 wouldn't fit into the 4-byte integer that unitsync returns.

https://taspring.clan-sy.com/svn/spring ... canner.cpp

It's apparently using CRC; from a rough check of the code, it checksums all files in the archive.

Code:

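// Updates the running CRC in `crc` with the contents of `filename`.
// Returns false if the file cannot be opened.
bool CArchiveScanner::GetCRC(const string& filename, unsigned int& crc)
{
	FILE* fp = fopen(filename.c_str(), "rb");
	if (!fp)
		return false;

	// Invert on entry and exit: starting from 0 this gives the usual
	// 0xFFFFFFFF initial value, and it lets the CRC carry across calls.
	crc ^= 0xFFFFFFFF;
	unsigned char* buf = new unsigned char[100000];
	size_t bytes;
	do {
		// Read in 100000-byte chunks; a short read ends the loop.
		bytes = fread((void*)buf, 1, 100000, fp);
		for (size_t i = 0; i < bytes; ++i)
			crc = (crc>>8) ^ crcTable[ (crc^(buf[i])) & 0xFF ];
	} while (bytes == 100000);

	delete[] buf;
	fclose(fp);

	crc ^= 0xFFFFFFFF;
	return true;
}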
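
For anyone wanting to reproduce this outside the engine: crcTable isn't shown in the snippet above. Here is a minimal sketch, assuming it is the standard reflected CRC-32 table (polynomial 0xEDB88320), which has not been checked against the Spring source. MakeCrcTable and UpdateCRC are hypothetical names for illustration, not engine functions, and the update works on an in-memory buffer instead of a file.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Builds the standard reflected CRC-32 lookup table (polynomial 0xEDB88320).
// ASSUMPTION: the engine's crcTable has the same contents.
std::array<std::uint32_t, 256> MakeCrcTable()
{
	std::array<std::uint32_t, 256> table{};
	for (std::uint32_t n = 0; n < 256; ++n) {
		std::uint32_t c = n;
		for (int k = 0; k < 8; ++k)
			c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		table[n] = c;
	}
	return table;
}

// Same update loop as GetCRC, applied to a buffer in memory.
// Pass crc = 0 for the first chunk, then keep passing the same variable.
void UpdateCRC(const std::string& data, std::uint32_t& crc)
{
	static const std::array<std::uint32_t, 256> crcTable = MakeCrcTable();
	crc ^= 0xFFFFFFFFu;
	for (unsigned char b : data)
		crc = (crc >> 8) ^ crcTable[(crc ^ b) & 0xFFu];
	crc ^= 0xFFFFFFFFu;
}
```

Starting from crc = 0, UpdateCRC("123456789", crc) yields 0xCBF43926, the standard CRC-32 check value. Feeding the data in pieces gives the same result as feeding it all at once, which is what lets one running CRC span several files.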


Tobi · Post by **Tobi** » 10 Dec 2006, 12:48

CRC.

No.

Yes, they're hashed by CRC'ing the stream of bytes consisting of filename1|content1|filename2|content2|etc.

Yes, the CRCs of multiple archives are XOR'ed together, excluding files in base, because those are autogenerated and hence may have different file times in them.

Maps not in an archive just get a 0 checksum IIRC; it isn't really supported. You can just put them in an .sdd after all, so there's no reason at all to put maps directly in the Spring directory... The same applies to mods directly in the Spring directory.

AF · Post by AF » 10 Dec 2006, 22:49

So if for example you have the AA mod, and you choose Standard AA it will hash that variant and not bother with the base AA content archive?

Hmm, I'm confused: what's the difference between base archives and all the other archives? As I understand it, you have the root archive you're asking for a checksum for, and then everything else branching off is a base archive...

And by "filename1|content1|filename2|content2|" what exactly do you mean? Opening an archive and going through each file inside, checksumming its filename, XOR'ing it, checksumming its contents, XOR'ing it, and repeating for every file? Adding all the filenames and contents into a single string with '|' separators and checksumming that one string?

Or do you mean just checksumming a stream consisting of the archive's contents as returned by some archive library?

:s

Tobi · Post by **Tobi** » 10 Dec 2006, 23:35

With "base", I meant archives in the "base" directory of the Spring install, e.g. springcontent.sdz, otacontent.sdz etc. Nothing more, nothing less.

By "filename1|content1|filename2|content2|etc." I mean the CRC is calculated over the stream of bytes generated by concatenating, for each file in the .sdd archive, the filename and the file's entire contents. The '|' separators are just to make the post more readable.

EDIT:
example:

an .sdd dir containing two files:

maps/foo.txt:

Code:

hello

maps/bar.txt:

Code:

world!

Would cause archivescanner to generate a CRC of approximately the following string:

"maps/foo.txthellomaps/bar.txtworld!"

This CRC would be the archive's CRC.
Note that I'm not sure whether terminating NULs are included in the filenames; check the source for that. Oh, and also imagine there are newlines in the string.
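
The example above can be sketched in code roughly as follows. This is only an illustration under assumptions: the CRC is taken to be standard CRC-32 (computed bitwise here instead of via a table), filenames are fed without a terminating NUL, and files are visited in the order given. AppendCRC, FileEntry and ArchiveCRC are hypothetical names, not part of archivescanner.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Bitwise CRC-32 update (reflected polynomial 0xEDB88320).
// ASSUMPTION: this matches the CRC the engine uses.
void AppendCRC(const std::string& data, std::uint32_t& crc)
{
	crc ^= 0xFFFFFFFFu;
	for (unsigned char b : data) {
		crc ^= b;
		for (int k = 0; k < 8; ++k)
			crc = (crc & 1u) ? (0xEDB88320u ^ (crc >> 1)) : (crc >> 1);
	}
	crc ^= 0xFFFFFFFFu;
}

// (filename, contents) of one file in the .sdd directory.
using FileEntry = std::pair<std::string, std::string>;

// One running CRC over filename1, content1, filename2, content2, ...
// No separators and no NUL terminators are added (assumption, see above).
std::uint32_t ArchiveCRC(const std::vector<FileEntry>& files)
{
	std::uint32_t crc = 0;
	for (const FileEntry& f : files) {
		AppendCRC(f.first, crc);
		AppendCRC(f.second, crc);
	}
	return crc;
}
```

With the entries ("maps/foo.txt", "hello\n") and ("maps/bar.txt", "world!\n"), ArchiveCRC returns the same value as a single CRC over the string "maps/foo.txthello\nmaps/bar.txtworld!\n", because the running CRC simply continues from one piece to the next.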

AF · Post by AF » 11 Dec 2006, 02:07

Ah and what about an AA variant? Would it be the CRC of all its files or the CRC of all its files and those of the base AA mod it references?

Tobi · Post by **Tobi** » 11 Dec 2006, 10:49

Every file the mod depends on is XOR'ed into the final checksum, unless that file is in the "base" directory of the Spring install. So AABase.sdz is checksummed too if you query the checksum of AASS.sdz (or whatever the exact filenames were).

Note that there are two levels of checksumming. One checksums the archives themselves; this generates the checksum that's stored in ArchiveCacheV4.txt. The other combines (by XOR) multiple checksums based on the dependency chain of the mod/map.
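
The second level might look roughly like this. It is a sketch, not the engine's code: ArchiveInfo, its fields and CombinedChecksum are hypothetical, the cache is modelled as a plain map instead of the real ArchiveCacheV4.txt format, and two details are assumptions of this sketch: each archive is counted once, and dependencies listed by a base archive are still followed.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical cache entry for one archive.
struct ArchiveInfo {
	std::uint32_t checksum;            // first level: CRC of the archive itself
	std::vector<std::string> depends;  // archives this one depends on
	bool inBase;                       // true if it lives in the "base" directory
};

// Second level: XOR the cached checksums along the dependency chain of
// `root`, skipping archives in "base".
std::uint32_t CombinedChecksum(const std::map<std::string, ArchiveInfo>& cache,
                               const std::string& root)
{
	std::uint32_t sum = 0;
	std::set<std::string> seen;
	std::vector<std::string> todo{root};
	while (!todo.empty()) {
		std::string name = todo.back();
		todo.pop_back();
		if (!seen.insert(name).second)
			continue;  // already counted; XOR'ing twice would cancel it out
		auto it = cache.find(name);
		if (it == cache.end())
			continue;  // unknown archive: ignored in this sketch
		const ArchiveInfo& info = it->second;
		if (!info.inBase)
			sum ^= info.checksum;
		todo.insert(todo.end(), info.depends.begin(), info.depends.end());
	}
	return sum;
}
```

For example, if AASS.sdz (checksum 0x11111111) depends on AABase.sdz (0x22222222) and on springcontent.sdz (in base), the combined checksum is 0x11111111 ^ 0x22222222 = 0x33333333, and springcontent.sdz contributes nothing.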

AF · Post by AF » 12 Dec 2006, 05:33

In other words, it checksums each archive and then writes it out to the archive cache with its dependencies; then, when you want a checksum, you XOR the archive's checksum with those of its cached dependencies...
