How do I decompress data compressed with the CSRCMPSC (Compression/Expansion) macro or the CMPSC mainframe instruction?


I'm working on a mainframe migration, and the input data file is compressed using the CMPSC instruction. I've been reading about how to compress and decompress in "CSRCMPSC (Compression/Expansion) Macro" and "Compressing and Expanding Data", but neither goes into any detail. What I'm looking for is code (in any language) or an algorithm to decompress the file that I can run on Linux (my target language is Java). I see references to a manual, ESA/390 Data Compression (SA22-7208), but I can't find it anywhere online. Any help would be appreciated!

Answer by Mark Adler

I found SA22-7208 here, where it is in "BKMGR" format, which seems to be an IBM-proprietary format for books. This is a Windows reader for that format. I was able to open the book and read it.

This may also help, at least with what a decompressor would need. It comes from PKWare's ZIP format appnote:

5.17 IBM z/OS CMPSC Compression - Method 16
-------------------------------------------

Method 16 utilizes the IBM hardware compression facility available
on most IBM mainframes.  Hardware compression can significantly 
increase the speed of data compression.  This method uses a variant 
of the LZ78 algorithm.  CMPSC hardware compression is performed
using the COMPRESSION CALL instruction.  

ZIP archives can be created using this method only on mainframes
supporting the CP instruction.  Extraction MAY occur on any
platform supporting this compression algorithm.  Use of this 
algorithm requires creation of a compression dictionary and
an expansion dictionary.  The expansion dictionary MUST be
placed into the ZIP archive for use on the system where
extraction will occur.

Additional information on this compression algorithm and dictionaries
can be found in the IBM provided document titled IBM ESA/390 Data 
Compression (SA22-7208-01). Storage requirements for using CMPSC 
compression are as follows.

The format for the compressed data stream placed into the ZIP
archive following the Local Header is:

    [dictionary header]
    [expansion dictionary]
    [CMPSC compressed data] 

If encryption is used to encrypt a file compressed with CMPSC, these 
sections MUST be encrypted as a single entity.

The format of the dictionary header is:

          Value            Size          Description
          -----            ----          -----------
          Version          1 byte        1
          Flags/Symsize    1 byte        Processing flags and
                                         symbol size
          DictionaryLen    4 bytes       Length of the 
                                         expansion dictionary

Explanation of processing flags and symbol size:

The high 4 bits are used to store the processing flags.  The low
4 bits represent the size of a symbol, in bits (values range
from 9-13).  Flag values are defined below.

    0x80 - expansion dictionary
    0x40 - expansion dictionary is compressed using Deflate
    0x20 - Reserved
    0x10 - Reserved
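
As a starting point in Java, the 6-byte dictionary header described above could be parsed roughly as follows. This is only a sketch under stated assumptions, not a decompressor: the class name `CmpscZipHeader` and its methods are hypothetical, the byte order of `DictionaryLen` is passed in by the caller because the quoted text does not state it (ZIP fields are normally little-endian), and `unpackSymbols` assumes the index symbols are packed contiguously, most significant bit first. Turning each index symbol back into bytes still requires walking the expansion dictionary as described in SA22-7208.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for the ZIP method 16 dictionary header (sketch only). */
final class CmpscZipHeader {
    static final int HEADER_LENGTH = 6;                // 1 + 1 + 4 bytes
    static final int FLAG_EXPANSION_DICTIONARY = 0x80; // "expansion dictionary"
    static final int FLAG_DICTIONARY_DEFLATED = 0x40;  // dictionary compressed with Deflate

    final int version;           // expected to be 1
    final int flags;             // high 4 bits of the second byte
    final int symbolSizeBits;    // low 4 bits of the second byte, 9..13
    final long dictionaryLength; // length of the expansion dictionary in bytes

    private CmpscZipHeader(int version, int flags, int symbolSizeBits, long dictionaryLength) {
        this.version = version;
        this.flags = flags;
        this.symbolSizeBits = symbolSizeBits;
        this.dictionaryLength = dictionaryLength;
    }

    /** Parses the header at {@code offset}; the byte order is an assumption made by the caller. */
    static CmpscZipHeader parse(byte[] data, int offset, ByteOrder order) {
        if (offset < 0 || data.length - offset < HEADER_LENGTH) {
            throw new IllegalArgumentException("need " + HEADER_LENGTH + " header bytes");
        }
        int version = data[offset] & 0xFF;
        int flagsAndSize = data[offset + 1] & 0xFF;
        int flags = flagsAndSize & 0xF0;
        int symbolSizeBits = flagsAndSize & 0x0F;
        if (symbolSizeBits < 9 || symbolSizeBits > 13) {
            throw new IllegalArgumentException("symbol size out of range: " + symbolSizeBits);
        }
        // Read the 4-byte length as an unsigned value.
        long dictionaryLength =
                ByteBuffer.wrap(data, offset + 2, 4).order(order).getInt() & 0xFFFFFFFFL;
        return new CmpscZipHeader(version, flags, symbolSizeBits, dictionaryLength);
    }

    boolean hasExpansionDictionary() {
        return (flags & FLAG_EXPANSION_DICTIONARY) != 0;
    }

    boolean isDictionaryDeflated() {
        return (flags & FLAG_DICTIONARY_DEFLATED) != 0;
    }

    /**
     * Splits compressed bytes into fixed-width index symbols.
     * Assumes MSB-first, contiguous packing; trailing bits that do not
     * fill a whole symbol are ignored.
     */
    static int[] unpackSymbols(byte[] data, int offset, int length, int symbolSizeBits) {
        int count = (int) (((long) length * 8) / symbolSizeBits);
        int[] symbols = new int[count];
        long bitPos = (long) offset * 8;
        for (int i = 0; i < count; i++) {
            int value = 0;
            for (int b = 0; b < symbolSizeBits; b++) {
                int byteIndex = (int) (bitPos >>> 3);
                int shift = 7 - (int) (bitPos & 7);
                value = (value << 1) | ((data[byteIndex] >> shift) & 1);
                bitPos++;
            }
            symbols[i] = value;
        }
        return symbols;
    }
}
```

With the stream layout quoted above, a caller would parse the header at the start of the file data, read `dictionaryLength` bytes of expansion dictionary (inflating them first if `isDictionaryDeflated()` is true), and then pass the remaining bytes to `unpackSymbols` with the header's `symbolSizeBits`.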