So we figured out the structure of the file and found the meshdata inside. We’ve got the most important part of the job done, but we still don’t understand most of the file’s “chunk” types. So we keep digging…
Looking at the shorter chunks of the file, you’ll find they contain descriptions and metadata for the mesh. This becomes obvious looking at the ASCII pane of the hex editor, because there are plaintext names to be seen.
If you go through the work of finding ‘A3_KIT00_HOOD_A’ in the file for the Audi, and then plot the meshdata in the next mesh section, sure enough you see a hood. ‘A3_KIT00_HOOD_B’ is a lower-poly version of the same hood, and ‘A3_KIT00_HOOD_C’ is a step coarser still. Games often have the same mesh at varying levels of detail so they can render close objects in detail while reducing load on the GPU for objects in the distance.
The mesh part strings always appear in sections with the 0x00134011 header (11 40 13 00 in your hex editor, stupid endianness). Looking at the data in this section, you usually see a bunch of zeros, followed by two 16-bit ints. Honestly, I don’t know what these do yet.
Sixteen bytes into the data of the section (24 bytes after the section start) you always find a 4-byte number of some sort. It doesn’t translate to a reasonable value as a float or an int, so what is it?
A big hint comes when you use ctrl-f to see if it appears elsewhere in the file. There are two other occurrences, and both are in tables of odd 32-bit numbers like it.
You’ll find that every submesh within the mesh has a metadata section with this strange 4-byte code 16 bytes into the data, and each of these 4-byte ints appears, in order, in the tables at the beginning of the file. What is going on?
The answer is that these numbers are unique identifiers. Each part of the mesh has an identifier that can be referenced quickly (as opposed to searching for the name, which involves slow string comparisons).
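If you want to pull these identifiers out with code rather than by eye, something like the following would do it. This is only a sketch: the names `readU32LE`, `readPartId`, `kPartInfoChunkId` and `kIdOffset` are made up for illustration, and it assumes the section ID sits in the first four bytes of the section and that the whole file has already been read into a byte vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names; the ID and offset values come from the text above.
constexpr std::uint32_t kPartInfoChunkId = 0x00134011u;  // stored as 11 40 13 00
constexpr std::size_t kIdOffset = 24;  // 16 bytes into the data, 24 from section start

// Assembles a 32-bit value from four bytes stored least significant byte first.
std::uint32_t readU32LE(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    return static_cast<std::uint32_t>(buf[pos])
         | (static_cast<std::uint32_t>(buf[pos + 1]) << 8)
         | (static_cast<std::uint32_t>(buf[pos + 2]) << 16)
         | (static_cast<std::uint32_t>(buf[pos + 3]) << 24);
}

// Writes the part identifier to idOut and returns true if a 0x00134011
// section starts at sectionStart; returns false otherwise.
// Assumption: the section ID occupies the first four bytes of the section.
bool readPartId(const std::vector<std::uint8_t>& file,
                std::size_t sectionStart,
                std::uint32_t& idOut) {
    if (sectionStart + kIdOffset + 4 > file.size()) {
        return false;  // not enough bytes left for header + identifier
    }
    if (readU32LE(file, sectionStart) != kPartInfoChunkId) {
        return false;  // some other section type
    }
    idOut = readU32LE(file, sectionStart + kIdOffset);
    return true;
}
```

Reading byte by byte like this sidesteps the endianness of whatever machine you run it on.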
Unique identifiers like this could be randomly assigned (provided you check for duplicates), but there is a hint that they are predictable. In the list of identifiers, you’ll notice that many of them come in triplets, where each number in a triplet is separated by one. The triplets don’t show obvious relation to one another though, so they aren’t assigned consecutively. If you look at the part names throughout the file, you notice they also come in triplets, usually with an A, B, and C level of detail.
What’s going on is that the strings we saw before are being hashed. A cryptographic hash like SHA-1 wouldn’t show these patterns, though. This is a simple hash. After reading about and testing a few possibilities, I found it to be a times-33 hash. Here’s how the hash works:
- Add the ASCII value of the first character to an int variable.
- Multiply the int by 33.
- Add the second char to the int.
- Multiply by 33.
- And so on…until you reach the end of the string
There is one oddity in NFS’s implementation of this hash. The variable which accumulates the hash is initialized to -1 (0xFFFFFFFF). Also note that the trailing \0 is not hashed. Here it is in C++:
Realize that the int will wrap around for long strings. This is fine: it still produces a repeatable identifier, as long as you always use the same number of bits for your int.
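A minimal sketch of the hash as described, not necessarily the game's exact code. The function name `nfsHash` is made up for illustration. The sketch assumes a 32-bit unsigned accumulator and the usual times-33 ordering, where the accumulator is multiplied by 33 before each character is added (the first one included). Check its output against an identifier from your own file before trusting it.

```cpp
#include <cstdint>
#include <string>

// Times-33 hash seeded with 0xFFFFFFFF (-1), as described above.
// nfsHash is a hypothetical name. Assumption: multiply first, then add,
// for every character including the first.
std::uint32_t nfsHash(const std::string& name) {
    std::uint32_t hash = 0xFFFFFFFFu;
    // std::string iteration stops before the terminating \0,
    // so the terminator is never hashed.
    for (unsigned char c : name) {
        hash = hash * 33u + c;  // unsigned arithmetic wraps modulo 2^32
    }
    return hash;
}
```

With this ordering, names that differ only in their last letter hash to consecutive values, which matches the A/B/C triplets in the identifier tables.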
The program below hashes a string and prints out the result in little-endian hex. Pick a different mesh part name and hash it to see it at work.
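A rough sketch of the core of such a program. The names `nfsHash`, `toLittleEndianHex` and `printHash` are hypothetical, and the hash again assumes the multiply-then-add ordering with a 32-bit accumulator, so treat the output as something to verify against the file rather than as gospel.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Times-33 hash seeded with 0xFFFFFFFF (hypothetical name; assumes each
// step multiplies the accumulator by 33 and then adds the character).
std::uint32_t nfsHash(const std::string& name) {
    std::uint32_t hash = 0xFFFFFFFFu;
    for (unsigned char c : name) {
        hash = hash * 33u + c;
    }
    return hash;
}

// Formats a 32-bit value as four hex bytes, least significant byte first,
// which is the order the bytes show up in a hex editor.
std::string toLittleEndianHex(std::uint32_t value) {
    char out[12];  // "XX XX XX XX" plus the terminating \0
    std::snprintf(out, sizeof out, "%02X %02X %02X %02X",
                  static_cast<unsigned>(value & 0xFFu),
                  static_cast<unsigned>((value >> 8) & 0xFFu),
                  static_cast<unsigned>((value >> 16) & 0xFFu),
                  static_cast<unsigned>((value >> 24) & 0xFFu));
    return out;
}

// Prints a name followed by the bytes of its hash in file order.
void printHash(const std::string& name) {
    std::printf("%s -> %s\n", name.c_str(),
                toLittleEndianHex(nfsHash(name)).c_str());
}
```

Call `printHash("A3_KIT00_HOOD_A")` from `main`, then search for the printed bytes in your hex editor.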
Now any string in the game can be hashed. This bit of knowledge can be handy when trying to locate assets later.
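For locating assets, the ctrl-f step from earlier can be automated. The sketch below scans a file that has already been loaded into memory for every occurrence of a given 32-bit identifier in little-endian byte order. The name `findId` is made up, and matches at any byte offset are reported, so expect the occasional false positive.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offsets of every occurrence of id, stored least significant
// byte first, within file. findId is a hypothetical helper name.
std::vector<std::size_t> findId(const std::vector<std::uint8_t>& file,
                                std::uint32_t id) {
    const std::array<std::uint8_t, 4> needle = {
        static_cast<std::uint8_t>(id & 0xFFu),
        static_cast<std::uint8_t>((id >> 8) & 0xFFu),
        static_cast<std::uint8_t>((id >> 16) & 0xFFu),
        static_cast<std::uint8_t>((id >> 24) & 0xFFu)
    };
    std::vector<std::size_t> hits;
    auto it = file.begin();
    while ((it = std::search(it, file.end(), needle.begin(), needle.end()))
           != file.end()) {
        hits.push_back(static_cast<std::size_t>(it - file.begin()));
        ++it;  // resume one byte past the start of this match
    }
    return hits;
}
```

Feed it the hash of a part name and the hits should include the identifier tables near the start of the file and that part's metadata section.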