Problem with the Python zipfile library when sharing a file between Linux and Windows


The zipfile module is very convenient for managing .zip files with Python.

However, if the .zip file was created on a Linux or macOS system, the path separator is of course '/', and working with that file on a Windows system can be a problem because the separator there is '\'. For example, to determine the root directory stored in the .zip file, one might write something like:

from zipfile import ZipFile, is_zipfile
import os

if is_zipfile(filename):

    with ZipFile(filename, 'r') as zip_ref:
        packages_name = [member.split(os.sep)[0] for member in zip_ref.namelist()
                         if (len(member.split(os.sep)) == 2 and not
                                                       member.split(os.sep)[-1])]

But in this case we always get packages_name = [] on Windows, because os.sep is '\' there, whereas the paths in the archive look like 'foo1/foo2' since the compression was done on a Linux system.
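The failure can be reproduced on any platform by passing the separator explicitly instead of relying on os.sep. This is only a minimal sketch: the helper name is_top_level_dir and its sep parameter are illustrative and not part of zipfile, and the member names are the example ones from the paragraph above.

```python
def is_top_level_dir(member, sep):
    """Apply the same test as the list comprehension above, with an explicit separator."""
    parts = member.split(sep)
    return len(parts) == 2 and not parts[-1]


# Example member names, '/'-separated as in an archive created on Linux.
names = ['foo1/', 'foo1/foo2']

# Simulates Windows, where os.sep == '\\': no name contains a backslash,
# so every split yields a single element and nothing matches.
windows_result = [n for n in names if is_top_level_dir(n, '\\')]

# Splitting on '/' finds the top-level directory entry.
posix_result = [n for n in names if is_top_level_dir(n, '/')]

print(windows_result)  # []
print(posix_result)    # ['foo1/']
```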

To handle all cases (compression on Linux and use on Windows, or the opposite), I am thinking of using:

from zipfile import ZipFile, is_zipfile
import os

if is_zipfile(filename):

    with ZipFile(filename, 'r') as zip_ref:

        if all([True if '/' in el else
                False for el in zip_ref.namelist()]):
            packages_name = [member.split('/')[0] for member in zip_ref.namelist()
                             if (len(member.split('/')) == 2 and not
                                                       member.split('/')[-1])]

        else:
            packages_name = [member.split('\\')[0] for member in zip_ref.namelist()
                             if (len(member.split('\\')) == 2 and not
                                                           member.split('\\')[-1])]

What do you think of this? Is there a more direct or more Pythonic way to do the job?

Best answer

Thanks to @snakecharmerb's answer and the link he suggested, I now understand. Thank you, @snakecharmerb, for showing me the way. As described in that link, zipfile internally uses only '/', independently of the OS. Since I like to see things concretely, I ran this little test:

  • On Windows, using the usual graphical tools of the OS (not the command line), I created a file testZipWindows.zip containing this tree structure:

    • testZipWindows
      • foo1.txt
      • InFolder
        • foo2.txt
  • I did the same on Linux (again without the command line) for the testZipFedora.zip archive:

    • testZipFedora
      • foo1.txt
      • InFolder
        • foo2.txt

This is the result:

$ python3
Python 3.7.9 (default, Aug 19 2020, 17:05:11) 
[GCC 9.3.1 20200408 (Red Hat 9.3.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from zipfile import ZipFile
>>> with ZipFile('/home/servoz/Desktop/test/testZipWindows.zip', 'r') as WinZip:
...  WinZip.namelist()
... 
['testZipWindows/', 'testZipWindows/foo1.txt', 'testZipWindows/InFolder/', 'testZipWindows/InFolder/foo2.txt']
>>> with ZipFile('/home/servoz/Desktop/test/testZipFedora.zip', 'r') as fedZip:
...  fedZip.namelist()
... 
['testZipFedora/', 'testZipFedora/foo1.txt', 'testZipFedora/InFolder/', 'testZipFedora/InFolder/foo2.txt']

So it all becomes clear! We must indeed use os.sep (or os.path.sep) to handle file system paths properly across platforms, but when dealing with member names in the zipfile library it is absolutely necessary to use '/' as the separator and not os.sep. That was my mistake!
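One way to make the "always '/'" rule explicit in code is to parse member names with pathlib.PurePosixPath, which treats '/' as the separator on every OS. This is a sketch of an alternative to str.split, not something taken from the linked answer; the helper name first_component is hypothetical, and the names are copied from the namelist() output above.

```python
from pathlib import PurePosixPath


def first_component(member):
    """Return the first path component of a ZIP member name."""
    # PurePosixPath always uses '/' as the separator, whatever the host OS.
    return PurePosixPath(member).parts[0]


names = ['testZipWindows/', 'testZipWindows/foo1.txt',
         'testZipWindows/InFolder/', 'testZipWindows/InFolder/foo2.txt']

roots = {first_component(n) for n in names}
print(roots)  # {'testZipWindows'}
```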

So the cross-platform code for the example in my question is simply:

from zipfile import ZipFile, is_zipfile

# filename is the path to the .zip archive to inspect.
if is_zipfile(filename):

    with ZipFile(filename, 'r') as zip_ref:
        # Member names inside the archive use '/', whatever the OS.
        packages_name = [member.split('/')[0] for member in zip_ref.namelist()
                         if (len(member.split('/')) == 2 and not
                             member.split('/')[-1])]

And not all the useless things I had imagined...
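As for the "more Pythonic" part of the question, one possible variation is to collect the first component of every member name that contains a '/'. Note that this behaves slightly differently from the code above: it also finds root directories in archives that store no explicit directory entries such as 'pkg/', which some tools omit. The function names top_level_dirs and build_demo_archive and the demo file names are made up for this sketch.

```python
import io
from zipfile import ZipFile


def top_level_dirs(zip_source):
    """Return the sorted top-level directory names found in a ZIP archive.

    zip_source may be a path or a binary file-like object.
    """
    with ZipFile(zip_source, 'r') as zip_ref:
        # A member that lives inside a directory contains at least one '/'.
        return sorted({name.split('/', 1)[0]
                       for name in zip_ref.namelist() if '/' in name})


def build_demo_archive():
    """Build a small archive in memory, without explicit directory entries."""
    buffer = io.BytesIO()
    with ZipFile(buffer, 'w') as zip_ref:
        zip_ref.writestr('pkg/foo1.txt', 'one')
        zip_ref.writestr('pkg/InFolder/foo2.txt', 'two')
        zip_ref.writestr('README.txt', 'top-level file, not a directory')
    buffer.seek(0)
    return buffer


print(top_level_dirs(build_demo_archive()))  # ['pkg']
```

The same call should work with a path on disk, for example top_level_dirs(filename).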