RS File System

sini · September 29, 2015

You may have wondered at some point how RS updates and manages its assets. It's actually pretty simple, and a lot of this information can be found on various RS private server websites. Reading data from the cache is rather important on those fronts, so it's common knowledge there. In bot software it's not as important, but in my opinion it's still good information to understand.

Virtual File System

All of the Runescape files are organized in a virtual file system. Each file is addressed by a top level volume and a second level file index. These coordinates can be written simply as (x, y), with x being the volume identifier and y being the file identifier. Both identifiers are numeric. The volume identifier has a maximum possible value of 255, the maximum unsigned value of a byte or 8 bit integer type. The file identifier has a maximum possible value of 65535, the maximum unsigned value of a short or 16 bit integer type. However, it's important to note that in modern versions of Runescape the file identifier can go up to 2,147,483,647, the maximum signed value of a 32 bit integer type. This value is encoded as a "smart" type, which uses the first bit to designate the length of the type. When the most significant bit (big endian) is set to 1, the value is 2 words long (32 bits, one of which is the flag), otherwise it's a single word long (16 bits).
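Here's a rough sketch of how that "smart" type could be read and written. It assumes a word is 16 bits, that everything is big endian, and that the flag bit is masked off so 31 bits are left for the value. The names read_smart and write_smart are made up for this example and don't come from the client.

```python
import struct
from typing import Tuple


def read_smart(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one 32/16 bit smart value. Returns (value, next position)."""
    if buf[pos] & 0x80:
        # Flag bit set: the value spans two 16 bit words. Mask the flag off.
        (raw,) = struct.unpack_from(">I", buf, pos)
        return raw & 0x7FFFFFFF, pos + 4
    # Flag bit clear: the value is a single 16 bit word.
    (raw,) = struct.unpack_from(">H", buf, pos)
    return raw, pos + 2


def write_smart(value: int) -> bytes:
    """Encode a value using the shortest form that fits."""
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if value < 0x8000:
        return struct.pack(">H", value)
    return struct.pack(">I", value | 0x80000000)
```

Anything below 32768 costs two bytes, and everything else costs four, which is why the older 16 bit identifiers still decode the same way.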

A file can be broken into entries when it holds more than one. Such a coordinate can be described by (x, y, z), with the z coordinate referring to the child identifier. Likewise, depending on the Runescape engine, the child identifier is either a 16 bit integer type or the 32/16 bit 'smart' type. The z coordinate is considered auxiliary information because of the nature of the file system, however it's enforced in the engine. If a file only has one entry, then referring to it by the coordinate (x, y) implies a z coordinate of 0, and the file is not encoded in the archive format.

So put together:

Coordinate := (x, y, z)
x - volume
y - file
z - child (default: 0)

All files in the cache are packed using a container format. A container is a header consisting of the compression type, the packed size, and an optional unpacked size for compressed container types, followed by the container data, whose length in bytes is equal to the packed size. Containers support no compression, headerless BZIP2 compression, and GZIP compression. When I say headerless BZIP2 compression, I mean that the BZIP2 header is stripped. No idea why. Probably to save bandwidth. Containers can be encrypted using XTEA; only the unpacked size (if applicable) and the data are encrypted.

int8 : compression
int32 : packedSize
int32 : unpackedSize (for BZIP2 compression and GZIP compression)
byte[packedSize]: data
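A minimal sketch of packing and unpacking that layout follows. Several things here are assumptions and not stated above: the numeric compression codes (0 for none, 1 for BZIP2, 2 for GZIP), big endian sizes, and that the stripped BZIP2 header is the 4 bytes "BZh1" (block size level 1). XTEA is left out entirely, so this only handles unencrypted containers.

```python
import bz2
import gzip
import struct

# Assumed compression codes; the layout above does not list the values.
NONE, BZIP2, GZIP = 0, 1, 2
# Assumed stripped header: "BZh" magic plus block size level 1.
BZIP2_HEADER = b"BZh1"


def pack_container(data: bytes, compression: int = NONE) -> bytes:
    """Build an unencrypted container around data."""
    if compression == NONE:
        return struct.pack(">BI", NONE, len(data)) + data
    if compression == BZIP2:
        # Compress at level 1, then drop the 4 byte header.
        packed = bz2.compress(data, 1)[len(BZIP2_HEADER):]
    elif compression == GZIP:
        packed = gzip.compress(data)
    else:
        raise ValueError("unknown compression type")
    return struct.pack(">BII", compression, len(packed), len(data)) + packed


def unpack_container(buf: bytes) -> bytes:
    """Read the header and return the unpacked data."""
    compression, packed_size = struct.unpack_from(">BI", buf, 0)
    if compression == NONE:
        return buf[5:5 + packed_size]
    (unpacked_size,) = struct.unpack_from(">I", buf, 5)
    packed = buf[9:9 + packed_size]
    if compression == BZIP2:
        # Put the stripped header back before decompressing.
        data = bz2.decompress(BZIP2_HEADER + packed)
    elif compression == GZIP:
        data = gzip.decompress(packed)
    else:
        raise ValueError("unknown compression type")
    if len(data) != unpacked_size:
        raise ValueError("unpacked size mismatch")
    return data
```

The unpacked size check is cheap and catches a wrong XTEA key or a truncated read early, which is the main reason to keep it.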

An archive is a data format which contains a collection of entries for a file. The footer of the data format contains information about how many passes it takes to read an entry completely, and the lengths of each entry. Before reading an entry you must know the number of entries in the archive, because that count is needed to find the offset in the file where the entry lengths start. This information can be found in the volume meta information tables, which contain information about each file in the volume, file entry information, volume attributes, and the volume revision.
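To make the footer idea concrete, here's a sketch of splitting an archive into its entries. The exact footer layout is an assumption taken from how private server code commonly reads it, not something spelled out above: the last byte is the number of passes, and before it sit passes * entries big endian 32 bit values, each one a delta from the previous entry's length within the same pass. The name split_archive is hypothetical.

```python
import struct
from typing import List


def split_archive(buf: bytes, entry_count: int) -> List[bytes]:
    """Split unpacked archive data into entry_count entries."""
    passes = buf[-1]  # assumed: last byte is the pass count
    # The length table sits right before that last byte, which is why
    # the entry count has to be known up front.
    table_pos = len(buf) - 1 - passes * entry_count * 4
    entries = [bytearray() for _ in range(entry_count)]
    data_pos = 0
    for _ in range(passes):
        length = 0
        for i in range(entry_count):
            # Assumed: lengths are stored as signed deltas.
            (delta,) = struct.unpack_from(">i", buf, table_pos)
            table_pos += 4
            length += delta
            entries[i] += buf[data_pos:data_pos + length]
            data_pos += length
    return [bytes(e) for e in entries]
```

With a single pass each entry is one contiguous slice. With more passes, every entry is stitched together from one slice per pass.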

A volume is a collection of similarly formatted files and could be imagined as a top level directory. A volume has a consistently reserved identifier across all revisions and volume identifier 255 is reserved for volume meta information tables. Example:

config/
tables/
anims/
skins/
models/

JS5 Protocol

The JS5 protocol, whose acronym we don't know the meaning of, is the transfer protocol used to update the local game cache on a user's computer. It's also used by the servers to transfer server files between different world instances. This is known because the cache contains references to files that you may request but will get null entries for. More information about that can be found on Rune-Server. The JS5 protocol is similar to, but not exactly like, HTTP chunked encoding. The client has a request queue of up to twenty priority and twenty normal requests which can be made to the service. The service then serves files back to the client, serving the priority requests first, before the normal or passively requested files.

The service writes 520 byte chunks back to the client. The chunk size calculation takes the container format into consideration: container header bytes count toward the number of bytes in a chunk. This makes for a very confusing alignment situation when writing compressed versus non-compressed entries. The client, at its peak, can read up to 2 MB/second from the server. However, the real rate is likely lower because of the overhead and inefficiency of the protocol. The client will take an entire read cycle just to read a response header, or even the single byte that checks whether the file currently being read is still being served or was dropped.

To connect to the JS5 service, you first have to connect to any RS world instance and then write byte 15 along with the revision as a 32 bit integer type. The server will respond with a status code, zero meaning that the connection succeeded, and the other statuses being for various errors. This is consistent across all Runescape services and isn't too important to discuss for the purpose of this thread.
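The handshake bytes are small enough to show in full. This sketch assumes the revision is written big endian, and the function names are made up for the example.

```python
import struct

JS5_SERVICE = 15  # service byte mentioned above


def js5_handshake(revision: int) -> bytes:
    """Service byte followed by the revision as a 32 bit integer."""
    return struct.pack(">BI", JS5_SERVICE, revision)


def handshake_ok(status: int) -> bool:
    """Zero means the connection succeeded, anything else is an error."""
    return status == 0
```

Over a socket this would be one write of js5_handshake(revision) followed by reading a single status byte.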

The structure of a response is:

Response (First Chunk):

int8 : volume
int16 : file
int8 : settings
byte[] : data

The first chunk has the container's compression byte stripped out, or I guess you could say it's folded into the settings byte, which carries the compression type and a flag for whether the response is for a priority request. The order in which responses are served does not matter to the client, but priority requests are reserved for files needed to either load the client or play the game, so the server does take that into consideration. At the start of every chunk after the first, the value 255 (or -1 as a signed byte) is written as a check to make sure that the client and server are synchronized.

Every chunk after that, until all the bytes have been written for the archive:

int8 : check (always 255 or -1)
byte[519]: data

Why does Jagex do it this way? I have no clue, but it's very irritating to implement from an emulation perspective because caching data for faster writes is near impossible.
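A sketch of the chunking described above, in both directions. It uses the 520 byte chunk size from this post and counts the 4 header bytes toward the first chunk. The settings byte is treated as an opaque value, and the payload length is passed in by hand (a real client would have to work it out from the container header inside the payload). The function names are hypothetical.

```python
import struct
from typing import Tuple

CHUNK = 520    # chunk size as described above
MARKER = 0xFF  # check byte written at the start of later chunks


def frame_response(volume: int, file: int, settings: int,
                   payload: bytes, chunk: int = CHUNK) -> bytes:
    """Write the response header, then the payload split into chunks."""
    out = bytearray(struct.pack(">BHB", volume, file, settings))
    pos = chunk - len(out)  # header bytes count toward the first chunk
    out += payload[:pos]
    while pos < len(payload):
        out.append(MARKER)
        out += payload[pos:pos + chunk - 1]
        pos += chunk - 1
    return bytes(out)


def unframe_response(stream: bytes, payload_len: int,
                     chunk: int = CHUNK) -> Tuple[int, int, int, bytes]:
    """Reverse of frame_response; raises if a check byte is wrong."""
    volume, file, settings = struct.unpack_from(">BHB", stream, 0)
    pos = 4
    take = min(chunk - 4, payload_len)
    payload = bytearray(stream[pos:pos + take])
    pos += take
    while len(payload) < payload_len:
        if stream[pos] != MARKER:
            raise ValueError("client and server are out of sync")
        pos += 1
        take = min(chunk - 1, payload_len - len(payload))
        payload += stream[pos:pos + take]
        pos += take
    return volume, file, settings, bytes(payload)
```

The awkward part shows up in frame_response: the check bytes land at offsets that depend on the header, so a stored file can't be sent as-is and has to be re-chunked on the way out.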

Volume Meta Tables

Volume 255 is reserved for volume meta tables, as stated earlier. The files in this volume use a custom format and describe a volume: whether its file entries have a string hash identifier, whether they have a Whirlpool verification hash, and which files and entries it contains. The format varies from engine to engine, with the first byte of the table being a protocol number which tells the parser how to decode the table. The first noted protocol was 5, in build 414. We do not know if this correlates to JS5, but I'm willing to wager that isn't a coincidence.

The string hash Jagex uses for naming entries is Djb2.
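For reference, this is what the textbook djb2 looks like. The seed 5381, the multiplier 33, the 32 bit wrap, and the lack of any case folding are the classic form and are assumptions here, since nothing above confirms the exact constants the engine uses. If the hashes don't match a real table, those constants are the first thing to check.

```python
def djb2(name: str) -> int:
    """Classic djb2 string hash, wrapped to 32 bits."""
    h = 5381  # classic seed, assumed
    for byte in name.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF  # classic multiplier, assumed
    return h
```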

Cache Format

The cache, or the file system on the user's machine, is exactly what it sounds like. It's designed so that files can expand and simply occupy new blocks of data in the blob file. It's not expected for files to shrink in size. The cache is comprised of index files and a blob file. The index files contain references to blocks in the blob file. The blob file contains blocks, which contain chunks of files and identifying information for read and write validation. The blocks in the blob are all 520 bytes in length, with an 8 byte header and a 512 byte file chunk. The first block, or block 0 in the blob file, is empty and reserved for representing the end of an entry. The references in an index file are 6 bytes in size and contain the length of an entry and a pointer to the first block that a file occupies in the blob file. Each block in the blob file points to the next block. Most of the time these are laid out linearly, except in cases where a file is overwritten and new blocks are appended to the end of the blob file, so they no longer sit next to the blocks previously written for that file entry.
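Following a file through the index and blob could look roughly like this. The block and reference sizes come from the description above, but the field layout inside them is an assumption: the 6 byte reference is taken as a 3 byte length plus a 3 byte first block number, and the 8 byte block header as file id (2 bytes), chunk sequence number (2), next block (3), and volume id (1), all big endian. The name read_file is made up for the example.

```python
BLOCK_SIZE = 520   # 8 byte header + 512 byte file chunk
HEADER_SIZE = 8
CHUNK_SIZE = 512
REF_SIZE = 6       # one index reference


def _u(buf: bytes, pos: int, size: int) -> int:
    """Read an unsigned big endian integer of size bytes."""
    return int.from_bytes(buf[pos:pos + size], "big")


def read_file(index: bytes, blob: bytes, volume: int, file: int) -> bytes:
    """Follow the block chain for one file and return its bytes."""
    ref = file * REF_SIZE
    if ref + REF_SIZE > len(index):
        raise KeyError(file)
    length = _u(index, ref, 3)      # assumed: 3 byte length
    block = _u(index, ref + 3, 3)   # assumed: 3 byte first block
    out = bytearray()
    sequence = 0
    while len(out) < length:
        if block == 0:
            # Block 0 is reserved to mark the end of an entry.
            raise ValueError("block chain ended before the file did")
        base = block * BLOCK_SIZE
        # Assumed header layout, used here for read validation.
        header = (_u(blob, base, 2), _u(blob, base + 2, 2), blob[base + 7])
        if header != (file, sequence, volume):
            raise ValueError("block does not belong to this file")
        take = min(CHUNK_SIZE, length - len(out))
        start = base + HEADER_SIZE
        out += blob[start:start + take]
        block = _u(blob, base + 4, 3)  # pointer to the next block
        sequence += 1
    return bytes(out)
```

Because every block names its own successor, the reader never cares whether the blocks sit next to each other, which is what lets overwritten files spill onto the end of the blob.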

My explaining is pretty bad but I hope you learned a few things. Here is the code for reading archive, meta tables, and files from the cache:

https://gist.github.com/Hadyn/4331081ec82b576e12f8
