Can I use my existing native Python code (non-PySpark code) in Spark to take advantage of its fast, distributed processing? I do not want to edit my existing Python code to turn it into PySpark code; I just want to run it as-is on Spark (standalone). Is that possible using spark-submit or some other way, so that I can benefit from Spark while running my non-Spark Python code? I would really appreciate anyone's help or steps to solve this problem.
TIA.
P.S.: I am trying to do spark-submit on a Linux server (with Spark installed) but have been unable to achieve this.
For example, abc.py is a Python script containing non-PySpark, native Python code. I cannot make changes to the code, but I want to run this file on Spark to use its distributed compute. Can I do that using spark-submit or any other way? Note: I cannot make any changes to the Python file, and it contains no PySpark code.
Without using RDDs or, rather, DataFrames (the preferred API these days) in your main code, no parallelization will occur. The same goes for a pandas DataFrame: it lives entirely in the driver's memory.
That is to say, there is no point in running such code on Spark. You can of course run it on Databricks and thereby minimize the number of platforms, but the code itself will not be distributed.
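To illustrate the distinction, here is a minimal sketch (the script name and workload are hypothetical, not the asker's actual abc.py). Even when launched with `spark-submit`, plain Python statements execute only on the driver; only work expressed through Spark's APIs, such as an RDD transformation, is shipped to executors:

```python
# plain_vs_spark.py -- illustrative sketch, assuming a standard PySpark install
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("demo").getOrCreate()

# Plain Python: even under `spark-submit plain_vs_spark.py`, this list
# comprehension runs entirely in the single driver process.
squares_local = [x * x for x in range(1_000_000)]

# PySpark equivalent: only by expressing the computation as an RDD (or
# DataFrame) operation does Spark distribute it across executors.
squares_rdd = spark.sparkContext.parallelize(range(1_000_000)).map(lambda x: x * x)
print(squares_rdd.take(5))

spark.stop()
```

So `spark-submit abc.py` will run without error, but it gains you nothing: the unmodified script is just ordinary Python executing on one machine.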