Create line network from closest points with boundaries

80 Views Asked by At

I have a set of points and I want to create line / road network from those points. Firstly, I need to determine the closest point from each of the points. For that, I used the KD Tree and developed a code like this:

def closestPoint(source, X = None, Y = None):

df = pd.DataFrame(source).copy(deep = True) #Ensure source is a dataframe, working on a copy to keep the datasource

if(X is None and Y is None):
    raise ValueError ("Please specify coordinate")
elif(not X in df.keys() and not Y in df.keys()):
    raise ValueError ("X and/or Y is/are not in column names")
else:
    df["coord"] = tuple(zip(df[X],df[Y])) #create a coordinate

if (df["coord"].duplicated):
    uniq = df.drop_duplicates("coord")["coord"]
    uniqval = list(uniq.get_values())
    dupl = df[df["coord"].duplicated()]["coord"]
    duplval = list(dupl.get_values())

    for kq,vq in uniq.items():
        clstu = spatial.KDTree(uniqval).query(vq, k = 3)[1]
        df.at[kq,"coord"] = [vq,uniqval[clstu[1]]]
        if([uniqval[clstu[1]],vq] in list(df["coord"]) ):
            df.at[kq,"coord"] = [vq,uniqval[clstu[2]]]

    for kd,vd in dupl.items():
        clstd = spatial.KDTree(duplval).query(vd,k = 1)[1]
        df.at[kd,"coord"] = [vd,duplval[clstd]]
else:
    val = df["coord"].get_values()
    for k,v in df["coord"].items():
        clst = spatial.KDTree(val).query(vd, k = 3)[1]
        df.at[k,"coord"] = [v,val[clst[1]]]
        if([val[clst[1]],v] in list (df["coord"])):
            df.at[k,"coord"] = [v,val[clst[2]]]

return df["coord"]

The code can return the the closest points around. However, I need to ensure that no double lines are created (e.g (x,y) to (x1,y1) and (x1,y1) to (x,y)) and also I need to ensure that each point can only be used as a starting point of a line and an end point of a line despite the point being the closest one to the other points.

Below is the visualization of the result: Result of the code

What I want: What I want

I've also tried to separate the origin and target coordinate and do it like this:

df["coord"] = tuple(zip(df[X],df[Y])) #create a coordinate
df["target"] = "" #create a column for target points

count = 2 # create a count iteration
if (df["coord"].duplicated):
  uniq = df.drop_duplicates("coord")["coord"]
  uniqval = list(uniq.get_values())
  for kq,vq in uniq.items():
    clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
    while not vq in (list(df["target"]) and list(df["coord"])):
        clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
        df.set_value(kq, "target", uniqval[clstu[count-1]])
    else:
        count += 1
        clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
        df.set_value(kq, "target", uniqval[clstu[count-1]])

but this return an error

IndexError: list index out of range

Can anyone help me with this? Many thanks!

2

There are 2 best solutions below

3
On BEST ANSWER

Answering now about the global strategy, here is what I would do (rough pseudo-algorithm):

current_point = one starting point in uniqval
while (uniqval not empty)
  construct KDTree from uniqval and use it for next line
  next_point = point in uniqval closest to current_point
  record next_point as target for current_point
  remove current_point from uniqval
  current_point = next_point

What you will obtain is a linear graph joining all your points, using closest neighbors "in some way". I don't know if it will fit your needs. You would also obtain a linear graph by taking next_point at random...

3
On

It is hard to comment on your global strategy without further detail about the kind of road network your want to obtain. So let me just comment your specific code and explain why the "out of range" error happens. I hope this can help.

First, are you aware that (list_a and list_b) will return list_a if it is empty, else list_b? Second, isn't the condition (vq in list(df["coord"]) always True? If yes, then your while loop is just always executing the else statement, and at the last iteration of the for loop, (count-1) will be greater than the total number of (unique) points. Hence your KDTree query does not return enough points and clstu[count-1] is out of range.