Create Zipfile in Python in memory on Web Server


I'm building an HTML WYSIWYG editor and am currently working on a 'Download' feature, where the user can press a Download button to download a zip file of their theme. I am using a Python CGI script to achieve this. Currently, my script makes a zip file and prompts the user to download it, but when I try to decompress the zip file, it only creates another zip file with the extension '.cpgz'. I believe my script did not create the zip properly.

I am using the 'zipfile' module to create a zip file object in memory instead of on disk, the 'StringIO' module to create a file-like object in memory, and the 'cgi' module to accept POST data from an Ajax request.

My problem is in my for-loop. The 'zf' zip file isn't adding the files and sub-directories from the 'layoutDir' parameter I passed into os.walk(). The script will prompt the browser to download the zip file, but I am unable to decompress it.

#!/usr/bin/python

import sys
import os
import zipfile
import StringIO
import cgitb

cgitb.enable()

layoutDir = 'http://localhost:8888/funWYSIWYG/public/views/layouts/Marketing'

tmpZip = StringIO.StringIO() 
zf = zipfile.ZipFile(tmpZip, 'w', zipfile.ZIP_DEFLATED)

for root, dirs, files in os.walk(layoutDir):
    for name in files:
        absfn = os.path.join(root, name)
        relfn = absfn[len(layoutDir) + len(os.sep):]
        zf.write(absfn, relfn)

zf.close()

sys.stdout.write("Content-Type: application/octet-stream\r\n")
sys.stdout.write("Content-Disposition: attachment; filename=\"funWYSIWYG-Marketing.zip\"\r\n\r\n")

sys.stdout.write(tmpZip.getvalue())

# Close opened file
tmpZip.close()

UPDATE 1: I got rid of some of the irrelevant stuff that was in my code. I also corrected the typo between 'absfn' and 'adsfn'. The code above now represents exactly what I have in my local code editor. I am still having the same problem: I am unable to decompress the zip file that is made.

UPDATE 2: Here is what the 'Marketing' directory looks like on my computer.

|---- Marketing
      |---- css
      |    |---- default.css
      | 
      |---- img
      |
      |---- index.html
1 Answer

If this is your actual code, your problem is a simple typo:

zf.write(adsfn, relfn)

You don't have a variable named adsfn; you have one named absfn. So this will raise a NameError and the script will fail before producing any output.

If I fix that, then run this code with layoutDir set to a reasonable relative path with some kind of hierarchy in it, and store the resulting in-memory zipfile to disk like this:

with open('foo.zip', 'wb') as f:
    f.write(tmpZip.getvalue())

… then I end up with a zip file with all of the files stored in it properly, meaning the zipping code itself works fine.
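Here's a runnable sketch of that check, written for Python 3 (where io.BytesIO replaces StringIO.StringIO for binary data); the Marketing tree and the foo.zip filename are just illustrative stand-ins:

```python
import io
import os
import zipfile

# Build a small stand-in for the 'Marketing' directory tree.
os.makedirs('Marketing/css', exist_ok=True)
with open('Marketing/css/default.css', 'w') as f:
    f.write('body { margin: 0; }')
with open('Marketing/index.html', 'w') as f:
    f.write('<html></html>')

layoutDir = 'Marketing'  # a real filesystem path, not a URL

# io.BytesIO is the Python 3 equivalent of StringIO.StringIO for binary data.
tmpZip = io.BytesIO()
with zipfile.ZipFile(tmpZip, 'w', zipfile.ZIP_DEFLATED) as zf:
    for root, dirs, files in os.walk(layoutDir):
        for name in files:
            absfn = os.path.join(root, name)
            relfn = os.path.relpath(absfn, layoutDir)
            zf.write(absfn, relfn)

# Store the in-memory archive to disk, then verify it decompresses cleanly.
with open('foo.zip', 'wb') as f:
    f.write(tmpZip.getvalue())

with zipfile.ZipFile('foo.zip') as check:
    assert check.testzip() is None  # no corrupt members
    print(sorted(check.namelist()))  # → ['css/default.css', 'index.html']
```

ZipFile.testzip() returning None confirms every member's CRC checks out, which is exactly the property a '.cpgz'-producing broken archive lacks.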

So, if that typo isn't your problem, then your problem must be caused by some other difference between the code you posted and the actual code we can't see.


It seems like your actual problem is that in your real code, you're trying to use a URL, like http://localhost:8888/path-to/Marketing, as the layoutDir. URLs and paths aren't the same thing. If you try to use that URL as a path, it is effectively the same as the relative path ./http:/localhost:8888/path-to/Marketing. You almost certainly don't have a directory named http: in the current working directory, so os.walk will just yield nothing, meaning you'll end up creating an empty zip file.
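You can see both halves of this directly; os.walk silently yields nothing for a nonexistent path, so the loop body never runs and the archive is written with zero members (a minimal Python 3 demonstration, using the URL from the question):

```python
import io
import os
import zipfile

# The URL is treated as a (nonexistent) relative filesystem path.
layoutDir = 'http://localhost:8888/funWYSIWYG/public/views/layouts/Marketing'

walked = list(os.walk(layoutDir))
print(walked)  # → []  (no such directory, and os.walk raises no error)

# The loop over os.walk therefore adds nothing, producing an empty archive.
tmpZip = io.BytesIO()
with zipfile.ZipFile(tmpZip, 'w', zipfile.ZIP_DEFLATED) as zf:
    for root, dirs, files in os.walk(layoutDir):
        for name in files:
            zf.write(os.path.join(root, name))

with zipfile.ZipFile(io.BytesIO(tmpZip.getvalue())) as check:
    print(check.namelist())  # → []
```

An empty archive like this is structurally valid zip data, but some unarchivers choke on it, which matches the '.cpgz' symptom.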

If the files you're trying to add are actually available at some (relative or absolute) path, use that path in place of the URL, and your problem will go away.

If they're only available via HTTP, then what you're trying to do is impossible as written; there is no way to walk the "subdirectories" of an HTTP URL, because the concept isn't even meaningful. In many cases, of course, web servers map parts of the URL path to some filesystem path and provide some way to navigate that filesystem indirectly (e.g., by auto-generating an index.html page full of links), but to take advantage of that you need to know exactly how the server in question exposes that information, then write scraping code to take advantage of it. And even once you have all the links, you can't pass a URL to ZipFile.write, only a local file path, which means you have to download each URL yourself (e.g., read it into memory and then writestr the result).
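That download-then-writestr approach can be sketched as follows (Python 3). The URLs and arcnames here are hypothetical, and the fetcher is passed in as a callable so the demo can run with a stub instead of a live server; in real use you'd pass something wrapping urllib.request.urlopen:

```python
import io
import zipfile
from urllib.request import urlopen

def zip_from_urls(url_to_arcname, fetch):
    """Fetch each URL's content and store it in an in-memory zip via writestr.

    url_to_arcname maps each URL to the name the file should have inside the
    archive; fetch is any callable taking a URL and returning bytes.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for url, arcname in url_to_arcname.items():
            zf.writestr(arcname, fetch(url))
    return buf.getvalue()

def http_fetch(url):
    # Real fetcher, for when the files are only reachable over HTTP.
    with urlopen(url) as resp:
        return resp.read()

# Demo with a stub fetcher so the example runs without a web server.
fake_pages = {
    'http://localhost:8888/theme/css/default.css': b'body { margin: 0; }',
    'http://localhost:8888/theme/index.html': b'<html></html>',
}
data = zip_from_urls(
    {url: url.rsplit('/', 1)[1] for url in fake_pages},
    fake_pages.__getitem__,
)
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    print(sorted(zf.namelist()))  # → ['default.css', 'index.html']
```

Note that this still leaves you the scraping problem of discovering the URL list in the first place; writestr only solves the "can't pass a URL to ZipFile.write" half.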