Create zip from inline binary attachment to multi-part message


As part of the eBay API bulk upload methods, we receive a multi-part response from eBay (supposedly) containing the raw data of a zip file that contains an XML file. We're having issues converting this from its raw binary form into the zip file. This is an example of the eBay response, with the zip/XML document at the bottom of the multi-part message.

This is some quick (and dirty) PHP we've been using to test the response:

$fpath = "http://developer.ebay.com/DevZone/file-transfer/CallRef/Samples/downloadFile_basic_out_xml.txt";
$responseXml = file_get_contents($fpath);
$endofxmlstring = "</downloadFileResponse>";
$pos = strpos($responseXml, $endofxmlstring) + 1; //plus one to catch the final return
$zipbuffer = substr($responseXml, $pos + strlen($endofxmlstring));
unset($responseXml);

$startofzipstring = "Content-ID:";
$pos = strpos($zipbuffer, $startofzipstring);
$zipbuffer = substr($zipbuffer, $pos);

$startofzipstring = "PK";
$pos = strpos($zipbuffer, $startofzipstring);
$zipbuffer = substr($zipbuffer, $pos);

$handler = fopen("response.zip", 'wb') or die("Failed. Cannot Open file to Write!");
fwrite($handler,$zipbuffer);
fclose($handler);

The zip file is created, but it is corrupt. The content passed to the zip file in $zipbuffer appears to be the correct data (inasmuch as it is identical to the data at the bottom of the response content), so I'm not sure what's going on.

The eBay docs describe what gets returned:

The output sample shows the raw format of the download file response to illustrate how the data file is attached in the multi-part message. The root part (or body) contains the call response with the standard output fields, such as ack, timestamp, and version. The final part contains the compressed file attachment in base64binary format. The file attachment stream is referenced by content ID (i.e., cid) in the Data field of the body. When the ack value is "Success," the binary data of the file attachment must be saved as a zip file. The SoldReport XML file must, in turn, be extracted from the zip file.

It mentions the returned content is "base64binary", but what actually is this? It's certainly not a base64 string that I've worked with before.


There is 1 best solution below


It mentions the returned content is "base64binary", but what actually is this? It's certainly not a base64 string that I've worked with before.

That statement refers to data inside the XML. Keep in mind, though, that the XML is inside the ZIP, and the ZIP is the last part of the multipart response (the HTTP message).

Okay, that might sound a bit like nitpicking, so here is a good way to remember it: base64binary is most often used in XML contexts, because XML cannot contain arbitrary binary data (NUL bytes, for example, are not allowed, and binary data can contain them, along with some other unsupported characters). So if you spot base64binary and XML is around the corner, it's not wrong to assume the two belong together.
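To make the difference concrete, here is a minimal sketch of what base64 does to the first bytes of a ZIP file. The `<Data>` element name and the six sample bytes are only illustrative assumptions, not taken from an actual eBay response:

```php
<?php
// First bytes of a ZIP local file header ("PK" 0x03 0x04) plus two NUL bytes.
// NUL bytes like these cannot be placed in an XML document as-is.
$binary = "PK\x03\x04\x00\x00";

// base64 turns the bytes into plain ASCII that is safe inside XML.
$encoded = base64_encode($binary);
echo $encoded, "\n"; // UEsDBAAA

// Illustrative only: how such a value would sit inside an XML element.
$xml = '<Data>' . $encoded . '</Data>';

// Decoding (strict mode) gives back the original bytes.
$decoded = base64_decode($encoded, true);
var_dump($decoded === $binary); // bool(true)
```

A base64-encoded ZIP therefore starts with readable text such as `UEsDB`, which looks nothing like the raw `PK...` bytes in the multipart attachment.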

And for the HTTP example given, you're totally right: there is no base64 in there:

...
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
                           ######
Content-ID: <urn:uuid:D8D75F18A8343F8FC61226972901992>

PKÙÔG²x7œÿwšÌÐÛ?žû›ÚE0uRßÔçÒ©]SŒçÔU mSkèSkèS«·SÏ[M=o•Z¿N­_§þ:Kýu–úë,õÌ]
ê[ÈS'%¦¾Ù'uTcjGêÁÏÔ$IjKjKjKê¸ÎÔóV©ôÔzê?¯Ôdij²4uF\6݈ôÌ]jIjÂ<µ‹#õÕB©¯J=
ö˜:¨0».C-åiÙèl¢Ijå(õÜ_jÆ>5cŸ:(/µ—&õØ]jÉ µd?ú^›Ô9?©‡þRý¥NJLí©Kí©Kí©K-¦–K‡cÃÒáØ0W¹

The transfer encoding is clearly binary here.

You should use an HTTP client here that is able to de-chunk the chunked response and that also deals well with multipart responses.
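If no such client is at hand, splitting on the MIME boundary is more robust than searching for "PK". The following is only a rough sketch: `splitMultipart()` is a hypothetical helper (not part of PHP or any library), and it assumes the body has already been de-chunked, uses CRLF line endings, and has no folded header lines. The `XYZ` boundary and the demo message are made up for illustration.

```php
<?php
/**
 * Split a raw multipart body into parts (hypothetical helper, sketch only).
 * Returns a list of ['headers' => [...], 'body' => string].
 */
function splitMultipart(string $raw, string $boundary): array
{
    $parts = [];
    // Every part is preceded by "--boundary"; the message ends with "--boundary--".
    $segments = explode('--' . $boundary, $raw);
    array_shift($segments); // drop the preamble before the first delimiter

    foreach ($segments as $segment) {
        if (substr($segment, 0, 2) === '--') {
            break; // closing delimiter reached
        }
        // Headers and body are separated by the first empty line.
        $split = strpos($segment, "\r\n\r\n");
        if ($split === false) {
            continue; // malformed part, skipped in this sketch
        }
        $headers = [];
        foreach (explode("\r\n", trim(substr($segment, 0, $split))) as $line) {
            if (strpos($line, ':') === false) {
                continue;
            }
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
        $body = substr($segment, $split + 4);
        // The CRLF right before the next delimiter belongs to the delimiter.
        if (substr($body, -2) === "\r\n") {
            $body = substr($body, 0, -2);
        }
        $parts[] = ['headers' => $headers, 'body' => $body];
    }
    return $parts;
}

// Made-up two-part message; the second part holds binary data with an embedded CRLF.
$demo = "--XYZ\r\nContent-Type: text/xml\r\n\r\n<a/>\r\n"
      . "--XYZ\r\nContent-Type: application/octet-stream\r\nContent-ID: <zip>\r\n\r\n"
      . "PK\x03\x04\x00\r\n\x01\r\n--XYZ--\r\n";

$demoParts = splitMultipart($demo, 'XYZ');
echo bin2hex($demoParts[1]['body']), "\n"; // 504b0304000d0a01
```

With a real response, the body of the part whose Content-Type is application/octet-stream is what would be written to the zip file, for example with `file_put_contents('response.zip', $parts[1]['body'])`.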

The

$startofzipstring = "PK";
$pos = strpos($zipbuffer, $startofzipstring);
$zipbuffer = substr($zipbuffer, $pos);

will likely fail if the last part is chunked.
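The reason is that chunked transfer encoding interleaves chunk-size lines with the payload, so the bytes after "PK" are no longer a contiguous ZIP. As a rough illustration of what de-chunking involves, here is a minimal sketch. `decodeChunked()` is a hypothetical helper written for this answer, it ignores chunk extensions and trailers, and a proper HTTP client should be preferred:

```php
<?php
/**
 * Decode an HTTP body that uses chunked transfer encoding (sketch only).
 * Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a size of 0 ends the body.
 */
function decodeChunked(string $data): string
{
    $out = '';
    $offset = 0;
    $length = strlen($data);

    while ($offset < $length) {
        $lineEnd = strpos($data, "\r\n", $offset);
        if ($lineEnd === false) {
            throw new RuntimeException('Malformed chunk header');
        }
        // The size line may carry extensions after ';', which are ignored here.
        $sizeLine = substr($data, $offset, $lineEnd - $offset);
        $hex = trim(explode(';', $sizeLine, 2)[0]);
        if (preg_match('/^[0-9a-fA-F]+$/', $hex) !== 1) {
            throw new RuntimeException('Invalid chunk size');
        }
        $size = hexdec($hex);
        if ($size == 0) {
            break; // last chunk; trailers (if any) are ignored
        }
        // Read exactly $size bytes, so CRLFs inside binary data are preserved.
        $out .= substr($data, $lineEnd + 2, $size);
        // Skip the chunk data and the CRLF that terminates it.
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $out;
}

echo decodeChunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"), "\n"; // Wikipedia
```

Reading by the declared size rather than scanning for line breaks matters here, because the ZIP data itself can contain CR and LF bytes.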


The sample data provided by eBay is somewhat broken, so this was not easy to test, but if you install the HTTP extension for PHP, dealing with multipart documents is fairly simple. The following might not be 100% RFC-conformant, but I think it is pretty OK for so little code, and it is stricter than the other examples I could find on Stack Overflow with a quick search:

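// Sample response published by eBay; a local copy of it is read below.
$url = 'http://developer.ebay.com/DevZone/file-transfer/CallRef/Samples/downloadFile_basic_out_xml.txt';
$raw = file_get_contents('downloadFile_basic_out_xml.txt');

// MultipartHttpMessage is the helper class from the gist linked at the end.
$message = MultipartHttpMessage::fromString($raw);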

echo 'Boundary: ', $message->getBoundary(), "\n";

foreach ($message->getParts() as $index => $part) {
    printf("Part #%d:\n", $index);
    foreach ($part->getHeaders() as $name => $value) {
        printf("  %s: %s (%s)\n", $name, $value[NULL], $value);
    }
}

Output:

Boundary: MIMEBoundaryurn_uuid_9ADF5C1A6F530C078712269728985463257
Part #0:
  Content-Type: application/xop+xml (application/xop+xml; charset=utf-8; type="text/xml")
  Content-Transfer-Encoding: binary (binary)
  Content-Id: <0.urn:uuid:9ADF5C1A6F530C078712269728985463258> (<0.urn:uuid:9ADF5C1A6F530C078712269728985463258>)
Part #1:
  Content-Type: application/octet-stream (application/octet-stream)
  Content-Transfer-Encoding: binary (binary)
  Content-Id: <urn:uuid:D8D75F18A8343F8FC61226972901992> (<urn:uuid:D8D75F18A8343F8FC61226972901992>)
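For reference, the boundary shown above normally comes from the `boundary` parameter of the response's top-level Content-Type header. A minimal sketch of pulling it out with a regular expression follows. `extractBoundary()` is a hypothetical helper, and the header value in the example is an assumption modelled on the output above, not copied from the eBay sample:

```php
<?php
/**
 * Extract the boundary parameter from a Content-Type header value (sketch only).
 * Handles both quoted and unquoted parameter values; returns null if absent.
 */
function extractBoundary(string $contentType): ?string
{
    if (preg_match('/boundary=(?:"([^"]+)"|([^;\s]+))/i', $contentType, $m) === 1) {
        // Group 1 holds a quoted value, group 2 an unquoted one.
        return $m[1] !== '' ? $m[1] : $m[2];
    }
    return null;
}

// Assumed header value for illustration.
$header = 'multipart/related; '
        . 'boundary=MIMEBoundaryurn_uuid_9ADF5C1A6F530C078712269728985463257; '
        . 'type="application/xop+xml"';

echo extractBoundary($header), "\n";
```

Each part in the body is then introduced by a line consisting of two hyphens followed by this boundary string.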

Code: https://gist.github.com/hakre/f13e1d633301bf5f221c