I have a data frame consisting of time series:
Date Index | Time Series 1 | Time Series 2 | ... and so on
I have used pyRserve to run a forecasting function in R. I now want to implement parallel processing using Celery, and have written the following worker code:
    import pyRserve
    from celery import Celery, group

    app = Celery('tasks')  # broker configuration omitted

    @app.task
    def pipeR(k):  # k: column header of the time series to forecast
        # Open the connection to R
        conn = pyRserve.connect(host='localhost', port=6311)
        # Assign the Python variable into the R environment
        conn.r.i = k
        # Define the forecasting function in the R session
        conn.voidEval('''
            forecst <- function(a)
            {
                ...  # forecasts the time series in column a of the data frame
            }
        ''')
        # Call the function in R
        result = conn.eval('forecst(i)')
        conn.close()
        return result

    group(pipeR.s(k) for k in [...list of column headers...])()
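For reference, continuing from the code above, the group dispatch can also collect the forecasts once all workers finish; a minimal sketch, where `cols` stands in for the elided list of column headers:

    # Dispatch one forecasting task per column and wait for all results.
    job = group(pipeR.s(k) for k in cols)   # cols: the column headers
    result = job.apply_async()              # send every task to the broker
    forecasts = result.get()                # block until all workers finish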
To implement parallel processing, can all the worker processes share a single Rserve port (as in the code above, port 6311), or should each worker process use a different port?
I'm currently getting the following error in R:

    Error in socketConnection("localhost", port=port, server=TRUE, blocking=TRUE, : cannot open the connection
The problem was resolved when I opened a different port for each worker process.
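For anyone hitting the same error, here is a minimal sketch of that workaround, assuming one Rserve instance has already been started per port in the pool (e.g. with `R CMD Rserve --RS-port 6312`); the port list, the round-robin mapping, and the `cols` name are illustrative choices, not part of the original setup:

    import pyRserve
    from celery import Celery, group

    app = Celery('tasks')  # broker configuration omitted

    # Hypothetical pool: one Rserve instance listening on each of these ports.
    PORTS = [6311, 6312, 6313, 6314]

    @app.task
    def pipeR(k, idx):
        # Round-robin the tasks across the Rserve instances, so concurrent
        # workers mostly connect to different ports.
        conn = pyRserve.connect(host='localhost', port=PORTS[idx % len(PORTS)])
        try:
            conn.r.i = k
            # Assumes forecst() has already been defined in each R session
            return conn.eval('forecst(i)')
        finally:
            conn.close()  # always release the Rserve connection

    group(pipeR.s(k, idx) for idx, k in enumerate(cols))()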