I'm defining the following function to help calculate the midpoint of three or more points on Earth using the Vincenty formulae. The refs variable is defined globally, and I've imported the pandas, geodesic (from geopy.distance) and geographiclib libraries:
import pandas as pd
from geopy.distance import geodesic
from geographiclib.geodesic import Geodesic

def midpoint_vincenty(df):
    global refs
    global outputData
    inputDict = {}
    # Build a list of (lat, long) tuples for each column of location names
    for col in df.columns:
        inputList = []
        for i, value in df[col].items():
            if pd.notna(value):
                inputLoc = df.at[i, col]
                inputLat = refs.at[inputLoc, 'lat']
                inputLong = refs.at[inputLoc, 'long']
                inputCoord = (inputLat, inputLong)
                inputList.append(inputCoord)
        inputDict[col] = inputList
    for i in inputDict:
        pointsList = inputDict[i]
        # Midpoint of the first two points
        initDistance = geodesic(pointsList[0], pointsList[1]).km
        initBearing = Geodesic.WGS84.Inverse(pointsList[0][0], pointsList[0][1], pointsList[1][0], pointsList[1][1])['azi1']
        midpoint = geodesic(pointsList[0], pointsList[1]).destination(pointsList[0], initBearing, initDistance/2)
        # Move the running midpoint halfway towards each remaining point
        for point in pointsList[2:]:
            iterDistance = geodesic(midpoint, point).km
            iterBearing = Geodesic.WGS84.Inverse(midpoint[0], midpoint[1], point[0], point[1])['azi1']
            midpoint = geodesic(midpoint, point).destination(midpoint, iterBearing, distance=iterDistance/2)
        inputDict[i] = midpoint
    outputData = pd.DataFrame(inputDict)
    outputData = outputData.T[[0, 1]]
    return outputData
However, when I attempt to pass a data frame containing location names with a given latitude and longitude as the df argument, it throws the following error message:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The traceback shows the problem coming from the line where iterDistance is defined, within the nested for loop at the bottom. This is weird, though, because the initDistance line works absolutely fine using the exact same syntax. midpoint is a geopy.point.Point object and point is a tuple, so I don't think either of those is causing the issue, since the error says that the object causing it is a Series.
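One way a Series could end up inside an otherwise ordinary tuple, sketched under an unconfirmed assumption: if a location name appears more than once in the index of refs, a label lookup returns every matching row rather than a single float. The refs_demo frame and its coordinates below are made up for illustration, and .loc is used rather than .at because whether .at behaves identically with repeated labels may depend on the pandas version.

```python
import pandas as pd

# Hypothetical stand-in for refs: 'Paris' appears twice in the index.
refs_demo = pd.DataFrame(
    {"lat": [51.5074, 48.8566, 48.8570], "long": [-0.1278, 2.3522, 2.3500]},
    index=["London", "Paris", "Paris"],
)

unique_lookup = refs_demo.loc["London", "lat"]  # a single scalar value
dup_lookup = refs_demo.loc["Paris", "lat"]      # a Series with two rows

print(type(unique_lookup).__name__)
print(type(dup_lookup).__name__)

# Labels that occur more than once in the index
dup_labels = refs_demo.index[refs_demo.index.duplicated()].unique().tolist()
print(dup_labels)
```

If dup_labels comes back empty for the real refs, this explanation does not apply and the Series must be coming from somewhere else.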
I've tried the following:
- Parsing the point value as a geopy.point.Point object
- Calculating iterDistance using the geographiclib library method that I used for iterBearing. This returns 'TypeError: cannot convert the series to class float'
- Passing a smaller test data frame with fewer locations. This returned outputData with 0 errors, but I don't know which specific column in my initial df input is causing the issues, as adding print(midpoint) after it's defined for the first time returns the first 33% of my values and I can't see anything wrong in the column after that (column no. 85 out of 237)
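To narrow down which columns are involved without printing every midpoint, a check along these lines might help. It is only a sketch: it assumes the layout described above (df holds location names, refs is indexed by location name and has lat and long columns), find_suspect_columns is a hypothetical helper name, and the toy data is invented.

```python
import pandas as pd

def find_suspect_columns(df, refs):
    """Return {column: [location names]} for names that are missing
    from refs or that match more than one row in refs."""
    suspects = {}
    for col in df.columns:
        bad = []
        for name in df[col].dropna():
            if name not in refs.index:
                bad.append(name)  # no coordinates available at all
            elif isinstance(refs.loc[name, "lat"], pd.Series):
                bad.append(name)  # several rows share this label
        if bad:
            suspects[col] = bad
    return suspects

# Invented example data: 'Paris' is duplicated in the reference table.
refs_demo = pd.DataFrame(
    {"lat": [51.5074, 48.8566, 48.8570, 52.52],
     "long": [-0.1278, 2.3522, 2.3500, 13.405]},
    index=["London", "Paris", "Paris", "Berlin"],
)
df_demo = pd.DataFrame({
    "route_a": ["London", "Berlin", None],
    "route_b": ["London", "Berlin", "Paris"],
})

print(find_suspect_columns(df_demo, refs_demo))  # {'route_b': ['Paris']}
```

Running the same check on the full df and refs would list any columns whose lookups do not resolve to exactly one row.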
Any thoughts on this would be much appreciated.