author Patric Stout <truebrain@openttd.org> 2021-06-15 14:54:10 +0200
committer Patric Stout <github@truebrain.nl> 2021-07-02 22:21:58 +0200
commit 9643a1b80ac9d5c2e3b2781e7536de874ee3f2bb
tree fa272b47eee34db6a46f8fc77c836ad29d516f15
parent 1ed240590799d0a2b724b226a512906908057e79
Doc: explain the binary structure of our (new) savegames
-rw-r--r-- docs/savegame_format.md 175
1 files changed, 175 insertions, 0 deletions
diff --git a/docs/savegame_format.md b/docs/savegame_format.md
new file mode 100644
index 000000000..e4e2e4000
--- /dev/null
+++ b/docs/savegame_format.md
@@ -0,0 +1,175 @@
+# OpenTTD's Savegame Format
+
+Last updated: 2021-06-15
+
+## Outer container
+
+Savegames for OpenTTD start with an outer container, which holds the compressed data for the rest of the savegame.
+
+`[0..3]` - The first four bytes indicate what compression is used.
+In ASCII, these values are possible:
+
+- `OTTD` - Compressed with LZO (deprecated, only really old savegames would use this).
+- `OTTN` - No compression.
+- `OTTZ` - Compressed with zlib.
+- `OTTX` - Compressed with LZMA.
+
+`[4..5]` - The next two bytes indicate which savegame version is used.
+
+`[6..7]` - The next two bytes can be ignored, and were only used in really old savegames.
+
+`[8..N]` - Next follows a binary blob which is compressed with the indicated compression algorithm.
+
+The rest of this document talks about this decompressed blob of data.
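As a rough illustration of the layout above, the following Python sketch splits a savegame into its compression type, version, and still-compressed payload. The function name `parse_outer_header` and the descriptive names in `COMPRESSION` are made up for this example, and it assumes the two version bytes are stored in Big Endian like the rest of the savegame.

```python
import struct

# Compression tags from the list above, mapped to illustrative names.
COMPRESSION = {
    b"OTTD": "lzo",
    b"OTTN": "none",
    b"OTTZ": "zlib",
    b"OTTX": "lzma",
}


def parse_outer_header(data: bytes):
    """Return (compression, version, payload) for a raw savegame."""
    if len(data) < 8:
        raise ValueError("savegame too short for the 8-byte outer header")
    tag = data[0:4]
    if tag not in COMPRESSION:
        raise ValueError(f"unknown compression tag: {tag!r}")
    # Bytes 4..5 hold the savegame version (assumed Big Endian).
    # Bytes 6..7 are ignored.
    (version,) = struct.unpack(">H", data[4:6])
    # Everything from byte 8 onwards is the (possibly compressed) blob.
    return COMPRESSION[tag], version, data[8:]
```

The payload is returned untouched; decompressing it with the matching algorithm is left to the caller.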
+
+## Data types
+
+The savegame is written in Big Endian, so when we talk about a 16-bit unsigned integer (`uint16`), we mean it is stored in Big Endian.
+
+The following types are valid:
+
+- `1` - `int8` / `SLE_FILE_I8` - 8-bit signed integer
+- `2` - `uint8` / `SLE_FILE_U8` - 8-bit unsigned integer
+- `3` - `int16` / `SLE_FILE_I16` - 16-bit signed integer
+- `4` - `uint16` / `SLE_FILE_U16` - 16-bit unsigned integer
+- `5` - `int32` / `SLE_FILE_I32` - 32-bit signed integer
+- `6` - `uint32` / `SLE_FILE_U32` - 32-bit unsigned integer
+- `7` - `int64` / `SLE_FILE_I64` - 64-bit signed integer
+- `8` - `uint64` / `SLE_FILE_U64` - 64-bit unsigned integer
+- `9` - `StringID` / `SLE_FILE_STRINGID` - a StringID inside OpenTTD's string table
+- `10` - `str` / `SLE_FILE_STRING` - a string (prefixed with a length-field)
+- `11` - `struct` / `SLE_FILE_STRUCT` - a struct
+
+### Gamma value
+
+There is also a field-type called `gamma`.
+This is most often used for length-fields, and uses as few bytes as possible to store an integer.
+For values <= 127, it uses a single byte.
+For values > 127, it uses two bytes and sets the highest bit to high.
+For values > 16383, it uses three bytes and sets the two highest bits to high.
+And this continues till the value fits.
+In a more visual approach:
+```
+ 0xxxxxxx
+ 10xxxxxx xxxxxxxx
+ 110xxxxx xxxxxxxx xxxxxxxx
+ 1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx
+ 11110--- xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
+```
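The bit patterns above can be decoded with a short routine. This is a minimal Python sketch, not the game's own implementation; the name `read_gamma` and the `(value, next_position)` return shape are choices made for this example. It treats the `---` bits of the five-byte form as carrying no data, and rejects a prefix of five 1-bits because the diagram does not cover it.

```python
def read_gamma(buf: bytes, pos: int = 0):
    """Decode one gamma value at buf[pos]; return (value, next_pos)."""
    first = buf[pos]
    # Count the leading 1-bits (at most four); each adds one extra byte.
    extra = 0
    while extra < 4 and first & (0x80 >> extra):
        extra += 1
    if extra == 4:
        if first & 0x08:
            raise ValueError("gamma prefix not covered by the format")
        value = 0  # the '---' bits carry no data
    else:
        # Bits after the terminating 0 are the most significant value bits.
        value = first & (0x7F >> extra)
    end = pos + 1 + extra
    if end > len(buf):
        raise ValueError("truncated gamma value")
    # Remaining bytes follow in Big Endian order.
    for byte in buf[pos + 1:end]:
        value = (value << 8) | byte
    return value, end
```

Returning the next position alongside the value makes it easy to chain reads while walking through a chunk.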
+
+## Chunks
+
+Savegames for OpenTTD store their data in chunks.
+Each chunk contains data for a certain part of the game, for example "Companies", "Vehicles", etc.
+
+`[0..3]` - Each chunk starts with four bytes to indicate the tag.
+If the tag is `\x00\x00\x00\x00` it means the end of the savegame is reached.
+An example of a valid tag is `PLYR` when looking at it via ASCII, which contains the information of all the companies.
+
+`[4..4]` - Next follows a byte where the lower 4 bits contain the type.
+The possible valid types are:
+
+- `0` - `CH_RIFF` - This chunk is a binary blob.
+- `1` - `CH_ARRAY` - This chunk is a list of items with an implicit index.
+- `2` - `CH_SPARSE_ARRAY` - This chunk is a list of items with an explicit index.
+- `3` - `CH_TABLE` - This chunk is a self-describing list of items with an implicit index.
+- `4` - `CH_SPARSE_TABLE` - This chunk is a self-describing list of items with an explicit index.
+
+The format differs (slightly) per type.
+
+### CH_RIFF
+
+(since savegame version 295, this chunk type is only used for MAP-chunks, containing bit-information about each tile on the map)
+
+A `CH_RIFF` starts with a `uint24` which, together with the upper bits of the type, defines the length of the chunk.
+In pseudo-code:
+
+```
+type = read uint8
+if (type & 0x0F) == 0
+  length = read uint24
+  length |= ((type >> 4) << 24)
+```
+
+The next `length` bytes are part of the chunk.
+What those bytes mean depends on the tag of the chunk; further details per chunk can be found in the source-code.
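To make the length calculation concrete, here is a small Python sketch under the assumptions of this section: the lower 4 bits of the type byte select the chunk type, and the upper 4 bits extend the `uint24` to a 28-bit length. The function name `read_riff_length` is hypothetical.

```python
def read_riff_length(buf: bytes, pos: int):
    """Read the type byte and uint24 of a CH_RIFF chunk.

    `pos` points at the type byte (directly after the 4-byte tag).
    Returns (length, next_pos), where next_pos is the first data byte.
    """
    if pos + 4 > len(buf):
        raise ValueError("truncated CH_RIFF header")
    type_byte = buf[pos]
    if type_byte & 0x0F != 0:
        raise ValueError("not a CH_RIFF chunk")
    # Lower 24 bits of the length, stored in Big Endian.
    length = int.from_bytes(buf[pos + 1:pos + 4], "big")
    # The upper 4 bits of the type byte provide bits 24..27.
    length |= (type_byte >> 4) << 24
    return length, pos + 4
```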
+
+### CH_ARRAY / CH_SPARSE_ARRAY
+
+(this chunk type is deprecated since savegame version 295 and is no longer in use)
+
+`[0..G1]` - A `CH_ARRAY` / `CH_SPARSE_ARRAY` starts with a `gamma`, indicating the size of the next item plus one.
+If this value is zero, it indicates the end of the list.
+Otherwise, subtract one to get the full length of the next item.
+In pseudo-code:
+
+```
+loop
+  size = read gamma - 1
+  if size == -1
+    break loop
+  read <size> bytes
+```
+
+`[]` - For `CH_ARRAY` there is an implicit index.
+The loop starts at zero, and every iteration adds one to the index.
+For entries in the game that were not allocated, the `size` will be zero.
+
+`[G1+1..G2]` - For `CH_SPARSE_ARRAY` there is an explicit index.
+The `gamma` following the size indicates the index.
+
+The content of the item is a binary blob, and similar to `CH_RIFF`, it depends on the tag of the chunk what it means.
+Please check the source-code for further details.
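The loop for the implicit-index case can be sketched in Python as follows. This covers `CH_ARRAY` only; the explicit index of `CH_SPARSE_ARRAY` is deliberately left out. The names `read_gamma` and `read_array_items` are illustrative, and the compact gamma decoder is included only to keep the example self-contained.

```python
def read_gamma(buf: bytes, pos: int):
    """Compact gamma decoder; returns (value, next_pos)."""
    first = buf[pos]
    extra = 0
    while extra < 4 and first & (0x80 >> extra):
        extra += 1
    value = 0 if extra == 4 else first & (0x7F >> extra)
    for byte in buf[pos + 1:pos + 1 + extra]:
        value = (value << 8) | byte
    return value, pos + 1 + extra


def read_array_items(buf: bytes, pos: int):
    """Read all items of a CH_ARRAY; return ([(index, blob), ...], next_pos)."""
    items = []
    index = 0  # implicit index: starts at zero, one step per iteration
    while True:
        raw, pos = read_gamma(buf, pos)
        if raw == 0:
            break  # a zero marks the end of the list
        size = raw - 1
        if pos + size > len(buf):
            raise ValueError("truncated array item")
        if size > 0:
            items.append((index, buf[pos:pos + size]))
        # size == 0 means this index was not allocated; skip it.
        pos += size
        index += 1
    return items, pos
```

Each returned blob still needs to be interpreted according to the chunk's tag.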
+
+### CH_TABLE / CH_SPARSE_TABLE
+
+(this chunk type only exists since savegame version 295)
+
+Both `CH_TABLE` and `CH_SPARSE_TABLE` are very similar to `CH_ARRAY` / `CH_SPARSE_ARRAY` respectively.
+The only change is that the chunk starts with a header.
+This header describes the chunk in detail; with the header you know the meaning of each byte in the binary blob that follows.
+
+`[0..G]` - The header starts with a `gamma` to indicate the size of all the headers in this chunk plus one.
+If this size value is zero, it means there is no header, which should never be the case.
+
+Next follows a list of `(type, key)` pairs:
+
+- `[0..0]` - Type of the field.
+- `[1..G]` - `gamma` to indicate length of key.
+- `[G+1..N]` - Key (in UTF-8) of the field.
+
+If at any point `type` is zero, the list stops (and no `key` follows).
+
+The `type`'s lower 4 bits indicate the data-type (see chapter above).
+The `type`'s 5th bit (so `0x10`) indicates that the field is a list; in every record, such a field starts with a `gamma` to indicate how many times the `type` is repeated.
+
+If the `type` indicates either a `struct` or `str`, the `0x10` flag is also always set.
+
+As the savegame format allows (list of) structs in structs, if any `struct` type is found, this header will be followed by a header of that struct.
+This nesting of structs is stored depth-first, so given this table:
+
+```
+type | key
+-----------------
+uint8 | counter
+struct | substruct1
+struct | substruct2
+```
+
+With `substruct1` being like:
+
+```
+type | key
+-----------------
+uint8 | counter
+struct | substruct3
+```
+
+The headers will be, in order: `table`, `substruct1`, `substruct3`, `substruct2`, each ending with a field whose `type` is zero.
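The depth-first ordering maps naturally onto a recursive reader. The Python sketch below is one possible way to parse the field lists, assuming `pos` points just past the header's size `gamma`. The names `read_gamma` and `read_fields` and the dictionary layout are inventions for this example, and the compact gamma decoder is repeated only so the block stands alone.

```python
STRUCT = 11  # data-type number of `struct` from the list of types


def read_gamma(buf: bytes, pos: int):
    """Compact gamma decoder; returns (value, next_pos)."""
    first = buf[pos]
    extra = 0
    while extra < 4 and first & (0x80 >> extra):
        extra += 1
    value = 0 if extra == 4 else first & (0x7F >> extra)
    for byte in buf[pos + 1:pos + 1 + extra]:
        value = (value << 8) | byte
    return value, pos + 1 + extra


def read_fields(buf: bytes, pos: int):
    """Read one header and, depth-first, the headers of its structs."""
    fields = []
    while True:
        type_byte = buf[pos]
        pos += 1
        if type_byte == 0:
            break  # a zero type ends this header; no key follows
        key_len, pos = read_gamma(buf, pos)
        key = buf[pos:pos + key_len].decode("utf-8")
        pos += key_len
        fields.append({
            "key": key,
            "type": type_byte & 0x0F,           # lower 4 bits: data-type
            "is_list": bool(type_byte & 0x10),  # 5th bit: list flag
        })
    # Headers of nested structs follow in field order, depth-first.
    for field in fields:
        if field["type"] == STRUCT:
            field["fields"], pos = read_fields(buf, pos)
    return fields, pos
```

Recursing before moving on to the next sibling is what yields the `table`, `substruct1`, `substruct3`, `substruct2` order described above.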
+
+After reading all the fields of all the headers, there is a list of records.
+To read this, see `CH_ARRAY` / `CH_SPARSE_ARRAY` for details.
+
+As each `type` has a well-defined length, you can read the records even without knowing anything about the chunk-tag yourself.
+
+Remember that if the `type` has the `0x10` flag set, the field in the record first has a `gamma` to indicate how many times that `type` is repeated.