I'm defining the following function to calculate the midpoint of three or more points on Earth using the Vincenty formulae. The refs variable is defined globally, and I've imported pandas (as pd), geodesic from geopy.distance, and Geodesic from geographiclib.geodesic:
def midpoint_vincenty(df):
    global refs
    global outputData
    inputDict = {}
    for col in df.columns:
        inputList = []
        for i, value in df[col].items():
            if pd.notna(value):
                inputLoc = df.at[i, col]
                inputLat = refs.at[inputLoc, 'lat']
                inputLong = refs.at[inputLoc, 'long']
                inputCoord = (inputLat, inputLong)
                inputList.append(inputCoord)
        inputDict[col] = inputList
    for i in inputDict:
        pointsList = inputDict[i]
        initDistance = geodesic(pointsList[0], pointsList[1]).km
        initBearing = Geodesic.WGS84.Inverse(pointsList[0][0], pointsList[0][1], pointsList[1][0], pointsList[1][1])['azi1']
        midpoint = geodesic(pointsList[0], pointsList[1]).destination(pointsList[0], initBearing, initDistance/2)
        for point in pointsList[2:]:
            iterDistance = geodesic(midpoint, point).km
            iterBearing = Geodesic.WGS84.Inverse(midpoint[0], midpoint[1], point[0], point[1])['azi1']
            midpoint = geodesic(midpoint, point).destination(midpoint, iterBearing, distance=iterDistance/2)
        inputDict[i] = midpoint
    outputData = pd.DataFrame(inputDict)
    outputData = outputData.T[[0, 1]]
    return outputData
However, when I pass a data frame of location names (each with a latitude and longitude in refs) as the df argument, it raises the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The traceback points to the line where iterDistance is defined, inside the nested for loop near the bottom. This is odd, because the initDistance line works fine with exactly the same syntax. midpoint is a geopy.point.Point object and point is a tuple, so I don't see how either of them could be the cause, given that the error says the offending object is a Series.
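One way to test that reasoning would be to inspect the coordinate tuples before any distance is computed. The sketch below uses a hypothetical helper, find_non_scalar_coords (not part of geopy or pandas), that takes a dict shaped like inputDict after the first loop and reports every coordinate whose latitude or longitude is not a plain number. It rests on an unconfirmed assumption: that the Series comes from a lookup such as refs.at[...] returning more than one row, which pandas can do when the index of refs contains duplicate labels.

```python
from numbers import Real


def find_non_scalar_coords(points_by_column):
    """Return (column, position, coord) for each coordinate that is not a pair of real numbers.

    points_by_column is assumed to look like inputDict after the first loop:
    {column_name: [(lat, long), ...]}.
    """
    problems = []
    for column, points in points_by_column.items():
        for position, coord in enumerate(points):
            lat, long = coord
            # A pandas Series (or a list, or a string) is not a numbers.Real instance,
            # whereas Python floats/ints and NumPy float64 values are.
            if not (isinstance(lat, Real) and isinstance(long, Real)):
                problems.append((column, position, coord))
    return problems


# Illustrative data only: a list stands in for the Series a duplicated lookup might return.
example = {
    "route_a": [(51.5, -0.12), (48.85, 2.35)],
    "route_b": [(40.71, -74.0), ([34.05, 34.10], -118.24)],
}
print(find_non_scalar_coords(example))
```

Calling this on inputDict just before the second loop would list the column name and the position of any suspicious point, which could then be traced back to a location name in df.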
I've tried the following:
- Parsing the point value as a geopy.point.Point object
- Calculating iterDistance using the geographiclib library method that I used for iterBearing. This returns 'TypeError: cannot convert the series to class float'
- Passing a smaller test data frame with fewer locations. This returned outputData with 0 errors, but I don't know which specific column in my initial df input is causing the issues, as adding print(midpoint) after it's defined for the first time returns the first 33% of my values and I can't see anything wrong in the column after that (column no. 85 out of 237)
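To find the responsible column without reading through printed output, each column could be processed on its own and any exception recorded against its name. This is a minimal sketch rather than a fix: failing_columns and demo_compute are hypothetical names, and it assumes the midpoint logic has been pulled out into a function that takes one column's list of points.

```python
def failing_columns(points_by_column, compute):
    """Call compute(points) per column; return {column: error text} for those that raise."""
    failures = {}
    for column, points in points_by_column.items():
        try:
            compute(points)
        except (ValueError, TypeError) as exc:
            # Keep the column name and the exception text for later inspection.
            failures[column] = f"{type(exc).__name__}: {exc}"
    return failures


def demo_compute(points):
    """Stand-in for the real midpoint calculation (hypothetical): averages the latitudes."""
    return sum(float(lat) for lat, _ in points) / len(points)


# Illustrative data only: a list stands in for a Series in the second column.
example = {
    "col_1": [(10.0, 20.0), (30.0, 40.0)],
    "col_85": [(10.0, 20.0), ([1.0, 2.0], 40.0)],
}
print(failing_columns(example, demo_compute))
```

With the real per-column midpoint function passed as compute, the keys of the returned dict would be the columns that trigger the error.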
Any thoughts on this would be much appreciated.