Kaitai struct - change default endianness based on a condition in the file


I'm dealing with files from two versions of a video game: one for the PC, one for the PS3. It's possible to tell which version of the game a given file comes from by checking the first 4 bytes of the header: if struct.unpack_from("<f", data) gives a particular number, the file is from the PC; if it doesn't, then struct.unpack_from(">f", data) should give that number. From there, the rest of the data is read accordingly.
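The check described above can be sketched in Python as follows. This is only an illustration: `EXPECTED_SAMPLE_RATE` and `detect_byte_order` are hypothetical names, and 44100.0 is a stand-in value, since the actual number the game uses is not stated here.

```python
import struct

# Hypothetical marker value; the real game-specific number is not given.
EXPECTED_SAMPLE_RATE = 44100.0


def detect_byte_order(data: bytes, expected: float = EXPECTED_SAMPLE_RATE) -> str:
    """Return '<' for little-endian (PC) or '>' for big-endian (PS3)."""
    # struct.unpack_from returns a tuple, so unpack its single element.
    (as_le,) = struct.unpack_from("<f", data, 0)
    if as_le == expected:
        return "<"
    (as_be,) = struct.unpack_from(">f", data, 0)
    if as_be == expected:
        return ">"
    raise ValueError("first 4 bytes match the expected value in neither byte order")
```

The returned prefix can then be prepended to any later `struct` format string, so the rest of the file is read with the detected byte order.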

I'm trying to write a parser for these files using Kaitai Struct, but it seems my options are either to write two separate KSY files for the LE and BE versions of the files, or to define two separate types, something like:

seq:
  - id: sample_rate
    type: u4le
  # field ids must be unique within a seq, so each variant gets its own id
  - id: header_le
    type: header_le
    if: sample_rate == 1234
  - id: header_be
    type: header_be
    if: sample_rate == 4321

types:
  header_le:
    seq:
      - id: sample_count
        type: u4le
      - id: channel_count
        type: u4le
  header_be:
    seq:
      - id: sample_count
        type: u4be
      ...

Either option works in the end, but I was hoping for something a bit less repetitive. Does Kaitai struct support this?

Petr Pucil On BEST ANSWER

Affiliate disclaimer: I'm a Kaitai Struct maintainer (see my GitHub profile).

> Does Kaitai struct support this?

Yes, see https://doc.kaitai.io/user_guide.html#calc-endian. The top-level seq typically contains only the field that indicates the endianness. The rest of the format (everything affected by the selected endianness) needs to be moved to a subtype, where you use meta/endian/{switch-on,cases}.

seq:
  - id: sample_rate
    type: u4le
  - id: header
    type: header_type

types:
  header_type:
    meta:
      endian:
        switch-on: _root.sample_rate
        cases:
          '0x0102_0304': le
          '0x0403_0201': be
    seq:
      - id: sample_count
        type: u4 # this will be parsed as 'le' or 'be' as decided in `meta/endian`
      - id: channel_count
        type: u4
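To make the behaviour concrete, here is a hand-written Python sketch of roughly what the spec above describes. It is not the code the Kaitai Struct compiler generates; `ENDIAN_CASES` and `parse_header` are hypothetical names used only for illustration.

```python
import struct

# Indicator values as seen when the first 4 bytes are read little-endian,
# mirroring the `cases` keys in the KSY above.
ENDIAN_CASES = {
    0x01020304: "<",  # le
    0x04030201: ">",  # be
}


def parse_header(data: bytes) -> dict:
    # The indicator itself is always read as u4le, as in the top-level seq.
    (sample_rate,) = struct.unpack_from("<I", data, 0)
    try:
        prefix = ENDIAN_CASES[sample_rate]
    except KeyError:
        raise ValueError(f"unknown endianness indicator: {sample_rate:#010x}")
    # Both u4 fields of header_type use whichever byte order was selected.
    sample_count, channel_count = struct.unpack_from(prefix + "II", data, 4)
    return {
        "byte_order": prefix,
        "sample_count": sample_count,
        "channel_count": channel_count,
    }
```

A file written entirely in big-endian stores the indicator bytes as 01 02 03 04, which read as u4le gives 0x04030201, so it hits the second case.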

Note that any user-defined type that should inherit the endianness decided in /types/header_type/meta/endian must be defined somewhere under /types/header_type/types/.... This is shown in the User Guide example (note the ifd type):

types:
  tiff_body:
    meta:
      endian:
        switch-on: _root.indicator
        cases:
          '[0x49, 0x49]': le
          '[0x4d, 0x4d]': be
    seq:
      - id: version
        type: u2
      # ...
    types:
      ifd:
        # inherits endianness of `tiff_body`

If you define them at the top level instead (at the same level as header_type), they will not inherit the endianness from header_type, and you'll probably get an error similar to: unable to use type 'u4' without default endianness.

For more examples, check out the .ksy specs in the format gallery that use calculated endianness: image/exif.ksy, executable/elf.ksy or database/gettext_mo.ksy.