From the docs (http://docs.python.org/2/library/zipfile), it looks like it's possible to selectively extract and open files with the zipfile module from the standard library, using:
ZipFile.extract(member[, path[, pwd]])
Extract a member from the archive to the current working directory; member must be its full name or a ZipInfo object. Its file information is extracted as accurately as possible. path specifies a different directory to extract to. member can be a filename or a ZipInfo object. pwd is the password used for encrypted files.
I have a zip file, foobar.zip, laid out like this:

    foobar.zip\
        \foo
            \a.txt
            \b.txt
        \bar
            \b.txt
            \c.txt

I've tried to extract files from a single sub-directory of the .zip file, but sometimes it prints nothing:

    import os
    import zipfile

    with zipfile.ZipFile('foobar.zip', 'r') as inzipfile:
        for infile in inzipfile.namelist():
            if 'foo' in os.path.split(infile)[0]:
                print inzipfile.open(infile, 'r').read()

I've also tried giving a list of the files I might want to extract, but that sometimes prints nothing too:

    import os
    import zipfile

    wanted = ['a.txt', 'b.txt']
    with zipfile.ZipFile('foobar.zip', 'r') as inzipfile:
        for infile in inzipfile.namelist():
            if os.path.split(infile)[1] in wanted:
                print inzipfile.open(infile, 'r').read()

Edited:
There's nothing wrong with the code or how I'm reading the files. I think there's something wrong with my zip file that causes a schroedinbug, where sometimes my sub-directory files don't open and inzipfile.open(infile,'r').read() returns None. Now it extracts, opens and prints the content of the file.
Any idea how to check, from within the Python code, that all files in a .zip file can be opened with the selective extract/open method above?
How else can I perform selective extract/open on zip files? Is there a more Pythonic method?
There is something wrong with your code. It's opening and reading the folder names, which are also in inzipfile.namelist(). You can see this by simply printing that list: its output includes entries for the folders as well as for the files.
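As a minimal sketch of that, the snippet below builds an in-memory archive that mimics the layout of foobar.zip and prints its name list. The file contents and the explicit folder entries are my assumption, since I don't have your actual file. It uses print() so that it also runs on Python 3.

```python
import io
import zipfile

# Hypothetical stand-in for foobar.zip, built in memory.
# Many zip tools store folders as their own entries; we add them explicitly here.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('foo/', '')               # folder entry: name ends with '/'
    zf.writestr('foo/a.txt', 'a in foo')
    zf.writestr('foo/b.txt', 'b in foo')
    zf.writestr('bar/', '')               # folder entry
    zf.writestr('bar/b.txt', 'b in bar')
    zf.writestr('bar/c.txt', 'c in bar')
buf.seek(0)

with zipfile.ZipFile(buf, 'r') as inzipfile:
    names = inzipfile.namelist()
    print(names)
    # ['foo/', 'foo/a.txt', 'foo/b.txt', 'bar/', 'bar/b.txt', 'bar/c.txt']
```

Reading one of those folder entries gives back an empty string, which is probably why the loop appears to print nothing for some entries.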
Another way to see it is with inzipfile.printdir(), which prints a table listing every entry in the archive. Notice that in both cases the names of all folder entries end with a / character. You can use that as a simple way to detect them and skip them, and the same check works in both of your loops.
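Here is a sketch of both loops with that check added. The archive is again a hypothetical in-memory stand-in for foobar.zip, and the results are collected in two dictionaries (from_foo and from_wanted, names I made up) instead of being printed, so they are easier to inspect. I also compare the folder name with == rather than in, which is a small change from your version, so that a folder such as "foobar" would not match.

```python
import io
import os
import zipfile

# Hypothetical in-memory stand-in for foobar.zip (contents are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    for name, data in [('foo/', ''), ('foo/a.txt', 'A'), ('foo/b.txt', 'B'),
                       ('bar/', ''), ('bar/b.txt', 'B2'), ('bar/c.txt', 'C')]:
        zf.writestr(name, data)
buf.seek(0)

wanted = ['a.txt', 'b.txt']
from_foo = {}      # members that live directly in the foo folder
from_wanted = {}   # members whose base name is in `wanted`

with zipfile.ZipFile(buf, 'r') as inzipfile:
    for infile in inzipfile.namelist():
        if infile.endswith('/'):
            continue  # folder entry: there is no file content to read
        folder, filename = os.path.split(infile)
        if folder == 'foo':
            with inzipfile.open(infile, 'r') as member:
                from_foo[infile] = member.read()
        if filename in wanted:
            with inzipfile.open(infile, 'r') as member:
                from_wanted[infile] = member.read()

print(sorted(from_foo))
print(sorted(from_wanted))
```

Note that on Python 3 read() returns bytes, so decode the data if you need text.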
The only way I can think of to check whether all the file members of an archive can be opened is to actually try opening each one.
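One way to sketch that is a small helper that tries to open and fully read every non-folder member and records the ones that fail. The function name unreadable_members and the exact set of exceptions caught are my own choices rather than anything prescribed by the zipfile module. This version is written for Python 3, where the exception is spelled zipfile.BadZipFile.

```python
import io
import zipfile
import zlib

def unreadable_members(zip_source):
    """Return a list of (name, error) pairs for file members that fail to read."""
    failures = []
    with zipfile.ZipFile(zip_source, 'r') as inzipfile:
        for name in inzipfile.namelist():
            if name.endswith('/'):
                continue  # skip folder entries
            try:
                with inzipfile.open(name, 'r') as member:
                    # Read to the end in chunks; the CRC is verified at end of data.
                    while member.read(64 * 1024):
                        pass
            except (zipfile.BadZipFile, RuntimeError, OSError,
                    EOFError, zlib.error) as exc:
                failures.append((name, exc))
    return failures

# Usage with a small, hypothetical in-memory archive.
sample = io.BytesIO()
with zipfile.ZipFile(sample, 'w') as zf:
    zf.writestr('foo/', '')
    zf.writestr('foo/a.txt', 'hello world')

problems = unreadable_members(sample)
print(problems if problems else 'all file members could be read')
```

The standard library also provides ZipFile.testzip(), which reads every file in the archive, checks the CRCs and headers, and returns the name of the first bad file or None. That may be enough if you only need a yes/no answer.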