I have a file that was formerly an EBCDIC-encoded file, which was converted to ASCII using dd. However, some lines contain COMP-3 packed fields which I would like to read.
For example, the string representation of one of the lines I would like to decode is:
'15\x00\x00\x00\x04@\x00\x00\x00\x00\x0c\x00\x00\x00\x00\x0c777093020141204NNNNNNNNYNNNN\n'
The field I would like to read is specified by PIC S9(09) COMP-3 POS. 3, that is, the field that starts at the third byte and holds nine digits when decoded (and is therefore five bytes long when encoded, according to the COMP-3 spec).
I understand the COMP-3 spec and I also know that for this particular line the integer value of this field should be 315, but I can't figure out what to do in order to actually decode the field. I'm also not sure whether the fact that the file was converted to ASCII with dd is a problem here or not.
Has anyone worked on a similar issue before, or is there something obvious I'm missing? Thank you!
If the reverse character-encoding conversion can be performed, the value may be recoverable. Because there is good reason to doubt that it can be in general, the best thing to do is as Bill Woodger suggested: get a new copy of the data in a text format, or get a new copy of the original data without corrupting the inherently binary portions of it with a character translation. In this specific case, I am confident the value is determinable, but as 0d377 (+377) rather than 0d315 (+315).
Hopefully sense can be made of the following:
ASCII string (given, \x-encoded):
ASCII (hex):
EBCDIC:
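These three views can be regenerated with a short Python sketch. It assumes that Python's cp037 codec (CP00037) is a fair stand-in for the reverse of the translation dd applied, which is an assumption about the code page rather than something the file itself confirms. Only the first 17 bytes are dumped, since those hold the packed fields.

```python
# The record as given in the question (ASCII, after the dd conversion).
line = ('15\x00\x00\x00\x04@\x00\x00\x00\x00\x0c\x00\x00\x00\x00\x0c'
        '777093020141204NNNNNNNNYNNNN\n')

head = line[:17]  # two characters followed by three 5-byte fields
ascii_bytes = head.encode('ascii')

# Assumption: the original code page was CP00037 (Python codec 'cp037').
ebcdic_bytes = head.encode('cp037')

# Scale line: one position per hex digit, counted from 1.
scale = ''.join(str(i % 10) for i in range(1, 2 * len(head) + 1))

print('scale :', scale)
print('ASCII :', ascii_bytes.hex().upper())
print('EBCDIC:', ebcdic_bytes.hex().upper())
```

With this layout, hex digits five to fourteen of the EBCDIC line are the five bytes of the field at POS. 3.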
The bytes of data in the PIC S9(09) COMP-3 POS. 3 field that are the packed Binary Coded Decimal (BCD), five bytes at positions five to fourteen [in the scale lines shown; ten hex digits 000000377C], represent the positive decimal integer value 377. I have little doubt that was the original value. By chance, the conversion from EBCDIC to ASCII, for that particular string, was not corrupted by an inability to round-trip the character conversion. The next two values in the record are presumably defined the same way, and those too are unaffected by data loss in a conversion both to and from EBCDIC; i.e. the control character with code point x0C is the same in both EBCDIC and ASCII, and both fields have the decimal value of positive zero.
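A minimal Python sketch of that decoding follows. The helper name unpack_comp3 is made up for illustration, the cp037 codec is assumed to match the original code page, and the second and third fields are assumed (as above, only presumably) to be five-byte COMP-3 fields immediately following the first.

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode a COMP-3 (packed decimal) field: two digits per byte,
    with the sign in the low nibble of the last byte."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if sign < 0xA or any(digit > 9 for digit in nibbles):
        raise ValueError(f'not a valid packed decimal: {raw.hex()}')
    value = int(''.join(str(digit) for digit in nibbles))
    # 0xD is the preferred negative sign, 0xB the non-preferred one.
    return -value if sign in (0xB, 0xD) else value


line = ('15\x00\x00\x00\x04@\x00\x00\x00\x00\x0c\x00\x00\x00\x00\x0c'
        '777093020141204NNNNNNNNYNNNN\n')

# Undo the character translation for bytes 3 to 17 only (assumed CP00037).
ebcdic = line[2:17].encode('cp037')

fields = [unpack_comp3(ebcdic[i:i + 5]) for i in range(0, 15, 5)]
print(fields)  # [377, 0, 0]
```

The ValueError branch matters in practice: on other records the round trip may not be lossless, and an invalid digit or sign nibble is the only hint that the translation destroyed the field.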
While there may have been other possible code pages from which to try the round trip, CP00037 provided a strong contender [with x7C carrying a valid sign nibble] and a valid conversion. The value of 315 seems quite improbable: the reserved EBCDIC control character x31 would have had to translate into ASCII x04 instead of either x91 or xBA, and the most likely EBCDIC x5C would inexplicably have had to convert to ASCII x40 instead of x2A [or, as a negative value, x5D would inexplicably have had to convert to ASCII x40 instead of x29; non-preferred sign possibilities were not contemplated], neither of which makes any sense.
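The same argument can be checked in the forward direction. The sketch below, again assuming Python's cp037 codec approximates the translation (the table dd actually used may differ for some control characters), converts the packed bytes that +377, +315 and -315 would have had and compares each result with the five characters seen in the file.

```python
# The five characters found at POS. 3 in the converted (ASCII) record.
observed = '\x00\x00\x00\x04@'

# Candidate original values and the packed bytes each would have had.
candidates = {
    '+377': '000000377C',
    '+315': '000000315C',
    '-315': '000000315D',
}

matches = {}
for label, packed in candidates.items():
    # Assumption: EBCDIC to ASCII translation modelled by the cp037 codec.
    text = bytes.fromhex(packed).decode('cp037')
    matches[label] = (text == observed)
    print(label, repr(text), matches[label])
```

Only the +377 candidate reproduces the observed characters; the sign bytes x5C and x5D come out as '*' and ')' rather than '@'.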