I have a file that was formerly an EBCDIC-encoded file, which was converted to ASCII using dd. However, some lines contain COMP-3 packed fields which I would like to read.
For example, the string representation of one of the lines I would like to decode is:
'15\x00\x00\x00\x04@\x00\x00\x00\x00\x0c\x00\x00\x00\x00\x0c777093020141204NNNNNNNNYNNNN\n'
The field I would like to read is specified by PIC S9(09) COMP-3 POS. 3, that is, the field that starts with the third byte and is nine digits long when decoded (and therefore five bytes long when encoded, according to the COMP-3 spec: nine digit nibbles plus one sign nibble).
I understand the COMP-3 spec and I also know that for this particular line the integer value of this field should be 315, but I can't figure out what to do in order to actually decode the field. I'm also not sure if the fact that the file was converted with dd to ASCII is a problem here or not.
Has anyone worked on a similar issue before, or is there something obvious I'm missing? Thank you!
Yes, it is a problem: the file contains non-character data and has been converted from EBCDIC to ASCII at the file or record level. Which tool was used to do the conversion doesn't matter.
By far the easiest thing for you is to request that the data be given to you in character-only form. Where the data contains signed fields, the sign should be separate, and where there are implied decimal places these should be actual, or indicated by a scaling value (whichever is more convenient to you).
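For illustration, here is a minimal sketch of how little work a character-only field needs on the receiving side. The function name `parse_display_number` and its `scale` parameter are hypothetical, and it assumes the sign arrives as a separate leading or trailing `+`/`-` character.

```python
from decimal import Decimal


def parse_display_number(text, scale=0):
    """Parse a character-only numeric field that carries a separate sign.

    `scale` is the number of implied decimal places; leave it at 0 when
    the decimal point is actually present in the text.
    """
    text = text.strip()
    # Accept a trailing sign as well as a leading one.
    if text and text[-1] in "+-":
        text = text[-1] + text[:-1]
    # scaleb(-scale) shifts the decimal point left by `scale` places.
    return Decimal(text).scaleb(-scale)
```

With this, `parse_display_number("+000000315")` gives 315, and `parse_display_number("0001234-", scale=2)` gives -12.34.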
Then you need to convert nothing. I can never understand how people think they can just give you EBCDIC data containing "whatever" and expect you to sort it out.
If you click on the EBCDIC tag you will find some other solutions you may be able to apply if, for some idiotic reason, the character data cannot be made available from the EBCDIC source. Since they've given you crap already, they may be able to come up with some moronic reason. If so, document it (politely) to your boss.
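If you do end up working from the original, untranslated EBCDIC file, the packed fields have to be decoded from the raw bytes, and only the character fields translated. A rough Python 3 sketch of the packed-decimal part follows; the name `unpack_comp3` is made up for illustration, and it assumes the usual sign nibbles (C or F for positive, D for negative).

```python
def unpack_comp3(raw):
    """Decode a COMP-3 (packed decimal) field from untranslated bytes."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)    # high-order nibble
        nibbles.append(byte & 0x0F)  # low-order nibble
    sign = nibbles.pop()  # the last nibble of the field holds the sign
    if sign < 0x0A or any(d > 9 for d in nibbles):
        raise ValueError("not a valid packed-decimal field: %r" % (raw,))
    value = int("".join(str(d) for d in nibbles))
    # 0xD (and the less common 0xB) mark a negative value.
    return -value if sign in (0x0B, 0x0D) else value
```

For the field in the question (POS. 3, five bytes), that would be `unpack_comp3(record[2:7])` on the original record, where `record` is a `bytes` object read in binary mode. A positive 315 is stored as `00 00 00 31 5C`. The already-translated bytes `\x00\x00\x00\x04@` from the question are rejected with a `ValueError`, because they no longer end in a sign nibble.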
If you get character data, then you can use dd or whatever to convert it (if you still get funny-looking stuff, check the code pages).
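A quick way to check code pages is to decode the same bytes with more than one EBCDIC codec and compare. The sketch below uses Python's built-in `cp037` and `cp500` codecs purely as examples; which code page your source system actually uses is something you would have to find out, and the helper name `decode_with` is hypothetical.

```python
def decode_with(raw, codecs=("cp037", "cp500")):
    """Return {codec name: decoded text} for each candidate EBCDIC code page."""
    return {name: raw.decode(name) for name in codecs}


# EBCDIC digits "315" followed by byte 0x5A, which the two code pages
# map to different characters.
sample = bytes([0xF3, 0xF1, 0xF5, 0x5A])
for name, text in decode_with(sample).items():
    print(name, repr(text))
```

Letters and digits come out the same under both codecs; it is punctuation such as brackets and exclamation marks that tends to differ, so those are the characters to look at.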
The reason things get pickled if you convert non-character data is exemplified by this: take a one-digit COMP-3 field holding the value +5, and a one-character field holding an asterisk. Both of those, in EBCDIC, have the hexadecimal value 5C. Both will be converted to an ASCII asterisk (hexadecimal 2A), and the COMP-3 value of five has then been lost. Note that a COMP-3 field can, outside of the low-order sign nibble, hold any pair of decimal digits in each of its bytes, so you get into a pickle whenever a byte happens to hit a control character. The same applies to "binary" fields, only worse, because any byte value can occur there and an accidental hit is more likely.
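The 5C collision can be reproduced in a few lines of Python. This is only a sketch: it uses Python's `cp037` codec as a stand-in for the EBCDIC-to-ASCII translation, which is not necessarily the exact table dd applies to every byte, although both turn 5C into an asterisk.

```python
# A one-digit COMP-3 field holding +5: digit nibble 5, sign nibble C.
packed_five = bytes([0x5C])

# A one-character field holding an asterisk, encoded as EBCDIC.
ebcdic_star = "*".encode("cp037")

# The two fields are byte-for-byte identical before translation...
same_bytes = packed_five == ebcdic_star

# ...so a blind character translation turns both into an ASCII asterisk.
translated = packed_five.decode("cp037")
ascii_byte = translated.encode("ascii")  # b'*', i.e. hexadecimal 2A
```

After translation the low-order nibble of the byte is A rather than C and the digit nibble is 2 rather than 5, so nothing in the byte tells you it used to be a packed +5.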