How to parallelize calculations of celestial bodies motion?

282 Views Asked by lazySeal At 01 April 2020 at 15:18

I have a piece of code that calculates the positions of some satellites and planets using Skyfield. For clarity, I use a Pandas DataFrame as a container for the positions and the corresponding times. I want to run the calculation in parallel, but I always get the same error: TypeError: can't pickle Satrec objects. I tested several parallelizers: Dask, pandarallel, swifter and Pool.map().

Example of the code to be parallelized:

        def get_sun_position(self, row):
            t = self.ts.utc(row["Date"]) # from skyfield
            pos = self.earth.at(t).observe(self.sun).apparent().position.m # from skyfield, error is here
            return pos

        def get_sat_position(self, row):
            t = self.ts.utc(row["Date"]) # from skyfield
            pos = self.sat.at(t).position.m # from skyfield, error is here
            return pos

        def get_positions(self):
            self.df["sat_pos"] = self.df.swifter.apply(self.get_sat_position, axis=1) # all the parallelization goes here
            self.df["sun_pos"] = self.df.swifter.apply(self.get_sun_position, axis=1) # and here

# the same implementation but using dask
#         self.df["sat_pos"] = dd.from_pandas(self.df, npartitions=4*cpu_count())\
#             .map_partitions(lambda df : df.apply(lambda row : self.get_sat_position(row),axis=1))\
#                 .compute(scheduler='processes')
#         self.df["sun_pos"] = dd.from_pandas(self.df, npartitions=4*cpu_count())\
#             .map_partitions(lambda df : df.apply(lambda row : self.get_sun_position(row),axis=1))\
#                 .compute(scheduler='processes')

To make Dask avoid pickle, I tried setting the serialization manually with serializers=['dask', 'pickle'], but it didn't help.

As I understand it, Skyfield uses sgp4, which contains the Satrec class.

I am wondering whether there is some way to parallelize this .apply(). Or should I not try to use Skyfield functions for parallel processing at all?


There is 1 best solution below

Brandon Rhodes On 02 April 2020 at 13:14

Alas, all of the mechanisms you are using to make the computation parallel work by creating another process and sending copies of all the objects involved in the computation over to it. The Satrec object is written in C++, not Python, to make it faster, and C++ objects have no native way to "serialize" themselves into bytes for transmission to another process. (Python objects have that ability built in.)
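One common way around this kind of limitation is to send only plain, picklable data to the other process and rebuild the unpicklable object there. The sketch below is only an illustration of that pattern using the standard library: FakeSatrec and build_satellite are hypothetical stand-ins, not sgp4 or Skyfield names, and it assumes the satellite can be rebuilt from plain strings such as its TLE lines (the question does not show how self.sat is created). It calls pickle directly, which is the same step Pool.map() performs when it ships arguments to a worker.

```python
import pickle


class FakeSatrec:
    """Hypothetical stand-in for a C++-backed object with no pickle support."""

    def __init__(self, line1, line2):
        self.line1 = line1
        self.line2 = line2

    def __reduce_ex__(self, protocol):
        # Mimic the error reported in the question.
        raise TypeError("can't pickle FakeSatrec objects")


def build_satellite(tle_lines):
    """Rebuild the unpicklable object from plain strings (the worker-side step)."""
    line1, line2 = tle_lines
    return FakeSatrec(line1, line2)


# Placeholder strings, not a real TLE.
tle_lines = ("1 00000U placeholder line one", "2 00000 placeholder line two")
sat = build_satellite(tle_lines)

# Sending the object itself fails, as it does inside a process pool.
error_message = ""
try:
    pickle.dumps(sat)
    pickled_ok = True
except TypeError as exc:
    pickled_ok = False
    error_message = str(exc)

# Sending the plain strings works; the receiving side rebuilds the object locally.
payload = pickle.dumps(tle_lines)
rebuilt = build_satellite(pickle.loads(payload))
print(pickled_ok, rebuilt.line1 == sat.line1)
```

In a real pipeline, the rebuild step would run once per worker or per partition rather than once per row, so that its cost does not cancel out the gain from running in parallel.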

Have you profiled your code to see what the most expensive steps are? My guess is that most of your expense is in the Sun computation, because Skyfield computes the Earth's orientation to very high accuracy so that the Sun's position in the sky is precise enough even for radio astronomers.
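A minimal profiling sketch with the standard library's cProfile is shown here. The two functions are hypothetical stand-ins with made-up workloads, named after the methods in the question only so the report is easy to read; in practice you would profile the real get_positions() call instead.

```python
import cProfile
import io
import pstats


def get_sat_position(n):
    # Hypothetical cheap step standing in for the satellite computation.
    return sum(i * i for i in range(n))


def get_sun_position(n):
    # Hypothetical expensive step standing in for the Sun computation.
    return sum(i ** 0.5 for i in range(20 * n))


def get_positions():
    # Call both steps repeatedly, as .apply() would do for each row.
    for _ in range(50):
        get_sat_position(1000)
        get_sun_position(1000)


# Run the workload under the profiler.
profiler = cProfile.Profile()
profiler.runcall(get_positions)

# Collect the report in a string, sorted by cumulative time per function.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The "cumtime" column of the report shows how much total time each function accounts for, which tells you whether the satellite step or the Sun step is worth optimizing first.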

But if you yourself don't need that high an accuracy, you could switch to lower-precision sky coordinates for the Sun. Before using t in get_sun_position(), try doing this to it:

        from skyfield.nutationlib import iau2000b

        t._nutation_angles = iau2000b(t.tt)

That will use a lower-precision estimate of the Earth's nutation and will hopefully run faster. Print out the values before and after this change to see how big the difference is, and compare that to how much inaccuracy your application can tolerate.
