RS-Code on schifra library - how set up polynommial?

899 Views Asked by At

I am currently trying to get the schifra library running for making some tests to implement it later in my code.

I am currently looking at the schifra_reed_solomon_example02.cpp and try to understand how I have to set the values to suite my needs.

/* Finite Field Parameters */
   const std::size_t field_descriptor                 =   8; // GF(2^8) ok
   const std::size_t generator_polynommial_index      = 120; // what is this?
   const std::size_t generator_polynommial_root_count =  32; // polynomial up to x^32

   /* Reed Solomon Code Parameters */
   const std::size_t code_length = 255;  // amount of symbols in codeword
   const std::size_t fec_length  = 32;  // minimal distance d ?
   const std::size_t data_length = code_length - fec_length; // amount of symbols my message has

So what I try to have is an RS-Code for n, k , d = (128, 16, 113)

And I would proceed the following:

/* Finite Field Parameters */
   const std::size_t field_descriptor                 =   8; // I want GF(2^8)
   const std::size_t generator_polynommial_index      = 120; // still not knowing
   const std::size_t generator_polynommial_root_count =  16; // because polynomial up to 16

   /* Reed Solomon Code Parameters */
   const std::size_t code_length = 128;  // 128 byte codewords
   const std::size_t fec_length  = 113;  // minimal distance, 113 because d = n - k +1
   const std::size_t data_length = 16; 

I then receive at encoding a mesage an error.

schifra::galois::field_polynomial generator_polynomial(field);

   schifra::sequential_root_generator_polynomial_creator(field,
                                                         generator_polynommial_index,
                                                         generator_polynommial_root_count,
                                                         generator_polynomial);

   /* Instantiate Encoder and Decoder (Codec) */
   schifra::reed_solomon::encoder<code_length,fec_length> encoder(field,generator_polynomial);
   schifra::reed_solomon::decoder<code_length,fec_length> decoder(field,generator_polynommial_index);

   std::string message = "A professional i"; // its 16 bytes
   std::cout << "Original Message:   [" << message << "]" << std::endl;
  message = message + std::string(data_length - message.length(),static_cast<unsigned char>(0x00)); // this one is also done in example

   std::cout << "Original Message:   [" << message << "]" << std::endl;
   std::cout << "Message length: " << message.length() << std::endl; // still 16 bytes

   /* Instantiate RS Block For Codec */
   schifra::reed_solomon::block<code_length,fec_length> block;

   /* Transform message into Reed-Solomon encoded codeword */
   if (!encoder.encode(message,block))
   {
      std::cout << "Error - Critical encoding failure!" << std::endl;
      return 1;
   }

The Error - Critical encoding failure! is then given.

I think what I do wrong is setting up the polynommial correct - maybe someone can help me?

2

There are 2 best solutions below

0
On

I am a beginner who tried to use the Schifra library. I didn't know anything about Reed Solomon codes and after weeks of digging into source code, I found a combination that always works. I still don't know any of the math that the RS codes use.

Some jargon that might be useful:

  • Symbol: Smallest quantum of information that the RS code will recognize(generally set to 8 bits(1 byte))
  • FEC: It is the redundancy or the error correction "symbols" which are appended at the end of your data
  • Data: The data you want to generate RS codes for.
  • Galois field: A polynomial or a mapping that the Reed Solomon code uses to do error correction (I don't know any math that goes behind RS codes)

The combination that works and making some sense out of the initializing variables without delving into the math:

  • Field descriptor= Size of a symbol (generally set to 8 bits in length)
  • Generator polynomial index= No clue what this is but setting it to 0 always works and gives a small performance boost compared to other values.
  • Generator Polynomial Root Count= Needs to be equal to FEC length
  • Code length= This value HAS to be (2^(symbol size or Field Descriptor)-1). It denotes the number of symbols your final encoded string (Data symbols+ Error Corr Symbols). Note that the total size of the encoded string in bits equals Code length* symbol size.
  • FEC length: Number of error correction symbols.
  • Data length: Number of data symbols. (As initiated in examples as Code_length - FEC_length) this should not be touched.

It should be noted that for a particular symbol size the code length is fixed and one can vary what percent of that code length will be data and what will be error correction symbols. And as common sense suggests, more error correction symbols leads to more error correction capability.

On running the encoder, you pass a block object along with the data as arguments.

On running the method encoder.encode the "block" is populated with the encoded data(Data + ECC).

After corruption, decoding is done. The number of errors, erasures which can be corrected by the decoder is explained in this link

The decoder takes the block and the erasure list(if applicable) as arguments and returns the original message/data if the number of erasure and errors were within the range of the ECC(look at the link).

0
On

Reed-Solomon encoding is not meant to work with
your parameters, independent of the program code.

You can choose following things:

  • A B, that is how many bit one data unit has (eg. B=8 for normal byte data).
  • A T with 0 <= T < (2^b)/3: Larger Ts mean better error correction, but less encoding speed.
  • Some irreducible polynomial and base power, not important here.

You can NOT choose:

  • GF/polynomial base is 2^B, nothing else.
  • Each plaintext block (without RS checksum data) has to be
    (2^B - 2*T - 1) data units (here bytes), nothing else.
  • For each plaintext block, RS will additionally calculate 2*T data units checksum data.
  • Thus, a block with plaintext and checksum data together has 2^B -1 data units.
    This means, with bytes as data units, such a block has exactly 255 byte, nothing else.
  • When decoding again, up to T wrong data units can be corrected
    (the errors may be in plaintext and/or checksum part, doesn't matter)
  • Up to 2*T wrong data units can be recognized, but necessarily corrected.
  • With more than 2*T errors, all bets are off.

In the example code (partially):

const std::size_t field_descriptor = 8;
const std::size_t code_length = 255; 
const std::size_t fec_length = 32;
const std::size_t data_length = code_length - fec_length;  
  • field_descriptor is B.
  • code_length is a full block length (plaintext + checksum).
  • fec_length is a checksum length.
  • data_length is a plaintext block length.

So for each 233 plaintext byte, RS calculates 32 checksum byte
(=255 total byte), to be able to correct up to 16 wrong byte later.

In your code, you choose GF(2^8) and byte data, so B=8, thus code_length 128 is wrong. It has to be 255. The checksum part length 113 is too much and not even, and the plaintext length isn't the difference of the previous values.
...

Other than that, of 6 RS codes I know in the internet, Shifra is by far the worst (very slow despite being praised as "highly optimzed", quirky code, problematic license, ...). Choose anything, but not Schifra.

I'll check if I can share something I wrote ... Including learning about the inner workings of the algorithm, writing everything (encoder+decoder) took not even a day, and it was 3 times faster than Schifra on the test machine back then (no asm, just C++). The downside is that is can only work with byte-based data, ie. B=8, without being able to choose other bases. But probably you don't need that anyways.