Is there a better way than Pandas apply?

I'm trying to find the distance from each point to the nearest shoreline.

I have two datasets:

  1. Latitude and longitude for each point
  2. Shoreline geometries (LineStrings)

Example: sample_Data (point data; 위도 = latitude, 경도 = longitude) =

        위도         경도
0   36.648365   127.486831
1   36.648365   127.486831
2   37.569615   126.819528
3   37.569615   126.819528
....

gdf =

0        LINESTRING (127.45000 34.45696, 127.44999 34.4...
1        LINESTRING (127.49172 34.87526, 127.49173 34.8...
2        LINESTRING (129.06340 37.61434, 129.06326 37.6...
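...

from pyproj import Geod
from shapely.geometry import LineString, Point
from shapely.ops import nearest_points
import swifter  # registers the .swifter accessor on DataFrames

geod = Geod(ellps="WGS84")  # assumed ellipsoid; the original geod definition was not shown

def min_distance(x, y):
    search_point = Point(x, y)
    a = gdf.swifter.progress_bar(enable=True).apply(lambda row: geod.geometry_length(LineString(nearest_points(row['geometry'], search_point))), axis=1)
    return a.min()

# 거리 = distance
sample_Data['거리'] = sample_Data.apply(lambda row: min_distance(row['경도'], row['위도']), axis=1, result_type='expand')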

This code is much slower than I expected: it takes about 6 hours to run. Is there a better way? Would cross joining the two data frames make it faster?

There is 1 solution below

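You can use shapely for the distance, like this:

import swifter  # registers the .swifter accessor on DataFrames
from pyproj import Geod
from shapely.geometry import Point, LineString
from shapely.ops import nearest_points

geod = Geod(ellps="WGS84")  # assumed ellipsoid; geod is not defined in the question

def min_distance(row, gdf):
    # row is one point from sample_Data; gdf holds the shoreline LineStrings
    search_point = Point(row['경도'], row['위도'])
    a = gdf.swifter.progress_bar(enable=True).apply(
        lambda x: geod.geometry_length(LineString(nearest_points(x['geometry'], search_point))), axis=1
    )
    return a.min()

# apply over the points and pass the shorelines as the extra argument
sample_Data['거리'] = sample_Data.apply(min_distance, axis=1, args=(gdf,))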
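
Most of the run time probably comes from running a row-wise apply over every shoreline for every point. One possible way to cut that down, as a rough sketch only: merge all shorelines into a single geometry once, look up the nearest location a single time per point, and compute each distinct coordinate pair only once (the sample rows repeat coordinates). The helper name `add_shore_distance` and its parameters are made up for illustration, and the WGS84 ellipsoid is an assumption because `geod` is not defined in the question. Like the original code, it finds the nearest location in planar degrees and only then measures the geodesic length, so the result is an approximation.

```python
import pandas as pd
from pyproj import Geod
from shapely.geometry import Point
from shapely.ops import nearest_points, unary_union

# Assumption: coordinates are WGS84 longitude/latitude in degrees.
geod = Geod(ellps="WGS84")


def add_shore_distance(points, shorelines, lon_col="경도", lat_col="위도", out_col="거리"):
    """Hypothetical helper: return `points` plus a column with metres to the nearest shoreline."""
    # Merge every shoreline into one geometry so each point needs a single lookup.
    coast = unary_union(list(shorelines))

    def distance_m(lon, lat):
        here = Point(lon, lat)
        # Nearest location on the coast, searched in planar degrees (as in the question).
        _, there = nearest_points(here, coast)
        # Geodesic distance between the two locations on the ellipsoid.
        _, _, metres = geod.inv(here.x, here.y, there.x, there.y)
        return metres

    # Compute once per distinct coordinate pair, then map the result back onto every row.
    unique = points[[lon_col, lat_col]].drop_duplicates().copy()
    unique[out_col] = [
        distance_m(lon, lat) for lon, lat in zip(unique[lon_col], unique[lat_col])
    ]
    # A left merge keeps the row order of `points` but returns a fresh integer index.
    return points.merge(unique, on=[lon_col, lat_col], how="left")
```

With the frames from the question this would be called as `sample_Data = add_shore_distance(sample_Data, gdf.geometry)`. Whether it is fast enough depends on how many distinct points and shoreline vertices there are, so it is worth timing on a small subset first.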