Scodec - Reading in a fixed-length String

I'm writing a parser for an existing file format that uses fixed-length, NUL-padded strings.

For example, I've got two case classes for binary structures within the file that I need to parse. The first includes a 4-character string that can be one of two values, and the second includes an 8-character string (values shorter than 8 characters are NUL-padded).

case class WadHeader( magic : String, items : Int, dirOffset : Int)
case class LumpIndex( offset : Int, size : Int, lumpName : String)

I've tried to write a simple codec to parse the first:

  implicit val headerCodec : Codec[WadHeader] = {
    ("magic" | bytes(4)) ::
      ("items" | uint32) ::
      ("dirOffset" | uint32)
  }.as[WadHeader]

However, I'm finding that this can't be transformed into a WadHeader, presumably because bytes(4) produces a ByteVector rather than the String that the case class expects for magic. I'd like to be able to read a fixed-size run of bytes and decode it into a String.

Unfortunately, scouring the documentation only turns up the 'greedy' string and the size-prefixed string options.

Best answer

Ok - so I've figured out a solution that works ok. There's probably a simpler/cleaner way to do it, but this works pretty well.

Firstly, I define a new fixedString codec for reading in strings when I know the length in advance:

  import scodec.{Attempt, Codec, DecodeResult, SizeBound}
  import scodec.Attempt.Successful
  import scodec.bits.BitVector
  import scodec.codecs._

  def fixedString(size: Int): Codec[String] = new Codec[String] {
    // Reads exactly `size` bytes as ASCII; on encode, shorter values are zero-padded.
    private val codec = fixedSizeBytes(size, ascii)
    def sizeBound: SizeBound = SizeBound.exact(size * 8L)
    def encode(b: String): Attempt[BitVector] = codec.encode(b)
    def decode(b: BitVector): Attempt[DecodeResult[String]] = {
      codec.decode(b) match {
        case Successful(DecodeResult(value, remainder)) =>
          // Keep only the characters before the first NUL pad byte.
          val decoded = value.takeWhile(_ != '\u0000')
          Attempt.successful(DecodeResult(decoded, remainder))
        case fail: Attempt.Failure => fail
      }
    }
    override def toString = s"fixedString($size)"
  }
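For readers looking for the "simpler/cleaner way", one possibility is to map over the fixed-size codec instead of implementing Codec by hand. This is only a sketch: fixedAscii and FixedAsciiSketch are hypothetical names, not part of scodec, and the sample bytes (the name "MAP01" padded to 8 bytes) are made up for illustration.

```scala
import scodec.Codec
import scodec.bits._
import scodec.codecs._

object FixedAsciiSketch {
  // Hypothetical helper name; not part of scodec itself.
  def fixedAscii(size: Int): Codec[String] =
    fixedSizeBytes(size.toLong, ascii).xmap[String](
      _.takeWhile(_ != '\u0000'), // decode: drop the first NUL and everything after it
      identity                    // encode: fixedSizeBytes pads short values with zero bytes
    )

  // "MAP01" followed by three NUL bytes, i.e. one 8-byte field.
  val padded: BitVector = hex"4d41503031000000".bits

  val decoded: String = fixedAscii(8).decode(padded).require.value
  val encoded: BitVector = fixedAscii(8).encode("MAP01").require
}
```

Here decoded should come back as "MAP01" and encoded should equal the padded input, since fixedSizeBytes fills the unused part of the field with zero bits when encoding. A string longer than the field would make encode fail rather than truncate.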

That takes care of the string. The second problem was just a silly mistake on my part: uint32 decodes to a Long, not an Int, so I had to update my case class definition accordingly:

case class WadHeader( magic : String, items : Long, dirOffset : Long)

object WadHeader {
  implicit val codec : Codec[WadHeader] = {
    ("magic" | fixedString(4)) ::
      ("items" | uint32) ::
      ("dirOffset" | uint32)
  }.as[WadHeader]
}
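The reason uint32 maps to Long is the value range: an unsigned 32-bit integer can be as large as 4294967295, which does not fit in a signed Int. The sketch below uses only the standard library to show the same bit pattern read both ways; the object and value names are illustrative.

```scala
object UnsignedRangeSketch {
  // Bit pattern with all 32 bits set.
  val raw: Int = 0xFFFFFFFF

  // Read as a signed 32-bit value, the pattern is -1.
  val asSigned: Int = raw

  // Read as an unsigned 32-bit value, it needs a wider type.
  val asUnsigned: Long = Integer.toUnsignedLong(raw)

  // The largest value an Int can hold, for comparison.
  val intMax: Int = Int.MaxValue
}
```

If the counts and offsets are known to stay within the signed range and keeping Int fields matters, scodec's int32 codec (a Codec[Int]) may be an option. Also, uint32 and int32 are big-endian; if the file stores its integers little-endian (an assumption about the format, which the question does not state), the uint32L and int32L variants would be the ones to use.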

EDIT: 5/7 - Figured out that I can wrap fixedSizeBytes(size, ascii) instead of bytes; it does most of what I want, and I have updated the solution accordingly. Depending on the requirements, fixedSizeBytes(size, cstring) could be a really good solution too. For my use case, however, it fails for strings that use the full field length, as there is no room for the terminating NUL.
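The cstring limitation can be seen with two sample fields. This is a sketch with made-up data: "MAP01" padded with NULs, and a hypothetical 8-character name "E1M1DEMO" that fills the field completely.

```scala
import scodec.bits._
import scodec.codecs._

object CStringFieldSketch {
  // An 8-byte field whose content must contain a terminating NUL.
  val field = fixedSizeBytes(8L, cstring)

  // "MAP01" + three NUL bytes: a terminator is present.
  val short = field.decode(hex"4d41503031000000".bits)

  // "E1M1DEMO": all 8 bytes are used, so there is no terminator.
  val full = field.decode(hex"45314d3144454d4f".bits)
}
```

The first decode should succeed with "MAP01", while the second should fail because cstring never finds a NUL byte. Trimming at the first NUL after an ascii decode, as in the answer, handles both cases.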