Python subprocess.check_output convert to Windows

I wrote some code that works fine on a Linux machine but does not run on Windows.

import subprocess
import pandas as pd
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

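def zgrep_data(f, string='', index='TIMESTAMP'):
    if string == '':
        # An empty pattern matches every line, header included.
        # universal_newlines=True returns text instead of bytes on Python 3.
        out = subprocess.check_output(['zgrep', string, f], universal_newlines=True)
        grep_data = StringIO(out)
        data = pd.read_csv(grep_data, sep=',', header=0)

    else:
        # Recover the column names by grepping for the header line.
        col_out = subprocess.check_output(['zgrep', index, f], universal_newlines=True)
        col_data = StringIO(col_out)
        columns = list(pd.read_csv(col_data, sep=','))

        # Grep the matching rows and apply the recovered column names.
        out = subprocess.check_output(['zgrep', string, f], universal_newlines=True)
        grep_data = StringIO(out)
        data = pd.read_csv(grep_data, sep=',', names=columns, header=None)

    # Move the index column to the front.
    return data.set_index(index).reset_index()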

I'm getting an error: FileNotFoundError: [WinError 2] The system cannot find the file specified

When I check the path with os.path.exists(file_path), it returns True. Any advice on how to modify this code so that it works on both Python 2 and 3, and on both Windows and Linux, would be appreciated.

There are 1 best solutions below

This message means only one thing: the executable could not be found.

It has nothing to do with your data file, since the process never even starts.

Why? While zgrep is standard on Linux, it is a third-party port on Windows, so you have to install it first from here
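If you want to keep calling an external tool, one option is to check for it up front so the failure message names the real problem. The following is only a sketch: run_tool is a hypothetical helper name, and shutil.which needs Python 3.3 or later (on Python 2, distutils.spawn.find_executable plays a similar role).

```python
import shutil
import subprocess


def run_tool(args):
    """Run an external command, failing early if the executable is missing.

    `run_tool` is a hypothetical helper, not part of any library.
    """
    # shutil.which returns None when the program is not on PATH.
    exe = shutil.which(args[0])
    if exe is None:
        raise RuntimeError(
            "%r was not found on PATH; install it or use a pure-Python fallback"
            % args[0]
        )
    # universal_newlines=True makes check_output return text instead of bytes.
    return subprocess.check_output([exe] + list(args[1:]), universal_newlines=True)
```

With this in place, run_tool(['zgrep', string, f]) on a Windows machine without zgrep raises a RuntimeError that mentions zgrep, rather than the misleading "cannot find the file specified".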

Note that if you only want to grep a string in CSV files, zgrep is overkill. It is much better to use a native Python approach: read the lines (or rows, using the csv module) and match the pattern yourself. You can even open .gz files natively with the gzip module. Then the code really will be portable.
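A minimal sketch of that native approach, under a few assumptions: grep_csv is a hypothetical name, the header is taken to be the first line of the file, the file is UTF-8, and the match is a plain substring test rather than the regular expression zgrep would use. Bytes are decoded by hand so the same code should run on Python 2 and 3.

```python
import gzip
import io

import pandas as pd


def grep_csv(path, pattern="", index="TIMESTAMP", encoding="utf-8"):
    """Load the rows of a (possibly gzipped) CSV whose raw line contains `pattern`.

    Hypothetical helper: assumes the first line is the header and does a
    substring match, not a regex match.
    """
    # gzip.open handles .gz files; plain open handles everything else.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as fh:
        # The first line is the header (raises StopIteration on an empty file).
        header = next(fh).decode(encoding)
        # Keep only the data lines that contain the pattern.
        matched = [
            line
            for line in (raw.decode(encoding) for raw in fh)
            if pattern in line
        ]
    data = pd.read_csv(io.StringIO(header + "".join(matched)), sep=",")
    # Same final step as the question's code: move the index column first.
    return data.set_index(index).reset_index()
```

Calling grep_csv(f, string) would then stand in for zgrep_data(f, string) without needing any external program. An empty pattern keeps every row, because the empty string is a substring of every line.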