Intersection of hyperbolae in R based on line equations

185 Views Asked by At

I am trying to find a method for finding the point of intersection of 2 hyperbolae in R. Single-branch hyperbloae can be described by the following equation:

equation image here

or

enter image description here

where (xi, yi) and (xj, yj) are the coordinates of 2 foci (i and j), and r is the difference in distance between a given point on the hyperbola (x, y) and each of the foci.

It seems the best way to visualise a hyperbola using R is to visualise contour lines (when contour level = 0) of a 3D hyperboloid, determined using the above equation and implementing it into a function

f1 <- function(xi, yi, xj, yj, x, y, r){
  sqrt((xj - x)^2 + (yj - y)^2) - sqrt((xi - x)^2 + (yi - y)^2) - r
  }

As an example, we can build a grid and visualise the contour level 0 for both hyperbolae:

library(tidyverse)
library(sf)

# sample points
tribble(
  ~name, ~x, ~y,
  'a', 25, 25,
  'b', 75, 25,
  'c', 50, 75,
) %>%
  {. ->> sample_points}

# sample hyperbolae
expand.grid(
  x = seq(0, 100, length = 100),
  y = seq(0, 100, length = 100)
) %>%
  as_tibble %>%
  mutate(
    z1 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 5), # i = point a, j = point b
    z2 = f1(xi = 25, yi = 25, xj = 50, yj = 75, x, y, r = 5)  # i = point a, j = point c
  ) %>%
  {. ->> hyp_data} %>%
  ggplot()+
  geom_contour(aes(x, y, z = z1), breaks = 0, colour = 1)+
  geom_contour(aes(x, y, z = z2), breaks = 0, colour = 2)+
  geom_point(data = sample_points, aes(x, y, color = name), size = 3)+
  coord_sf()

enter image description here

Currently, I can find the point of intersection by extracting the line data from the two geom_contour lines, using ggplot_build(), converting the line coordinates to an sf LINESTRING, and use st_intersection to find the point of intersection:

# extract the data from the ggplot using ggplot_build()
ggplot_build(

  hyp_data %>%
    ggplot()+
    geom_contour(aes(x, y, z = z1), breaks = 0, colour = 1)+
    geom_contour(aes(x, y, z = z2), breaks = 0, colour = 2)+
    # geom_point(data = sample_points, aes(x, y, color = name), size = 3)+ # we dont need the point data in ggplot_build()
    coord_sf()

) %>%
  .$data %>%  # keep only the data component
  map(. %>% select(x, y) %>% as.matrix %>% st_linestring) %>% # keep coordinates, turn into a linestring
  {. ->> lines1}

# make the lines an sf object
tibble(a = 1:2) %>%
  mutate(
    geom = st_sfc(lines1),
    ) %>%
  st_as_sf %>% # then make the whole thing an sf object

  # use st_intersection to find the point of intersection
  st_intersection %>%

  # then keep only the 'point' (exclude the original 'lines')
  filter(
    st_is(geom, 'POINT') 
  ) %>%
  {. ->> intersection_point} %>%
  ggplot()+
  geom_sf(colour = 'red', size = 5)+
  geom_contour(data = hyp_data, aes(x, y, z = z1), breaks = 0, colour = 1)+
  geom_contour(data = hyp_data, aes(x, y, z = z2), breaks = 0, colour = 2)+
  geom_point(data = sample_points, aes(x, y, colour = name), size = 3)

enter image description here

The limitation of this method is that it relies upon the ability of ggplot's geom_contour to visualise the contour lines. When the absolute value of r approaches the distance between two points (dist between a-b = 50), the hyperbola narrows and eventually becomes a straight line 'behind' one of the points. Here, geom_contour can't build contour lines, and therefore no line data is created, so the intersection can't be found. See how the following plot only has 5 lines when there should be 6; z6 doesn't plot and causes a warning message:

expand.grid(
  x = seq(0, 100, length = 100),
  y = seq(0, 100, length = 100)
) %>%
  as_tibble %>%
  mutate(
    # z = f1(x, y, nr)
    z1 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 5),
    z2 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 15),
    z3 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 25),
    z4 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 35),
    z5 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 45),
    z6 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 50),
  ) %>%
  ggplot()+
  geom_contour(aes(x, y, z = z1), breaks = 0, colour = 1)+
  geom_contour(aes(x, y, z = z2), breaks = 0, colour = 2)+
  geom_contour(aes(x, y, z = z3), breaks = 0, colour = 3)+
  geom_contour(aes(x, y, z = z4), breaks = 0, colour = 4)+
  geom_contour(aes(x, y, z = z5), breaks = 0, colour = 5)+
  geom_contour(aes(x, y, z = z6), breaks = 0, colour = 6)+
  geom_point(data = sample_points, aes(x, y, color = name), size = 3)+
  coord_sf()

# Warning messages:
# 1: stat_contour(): Zero contours were generated 
# 2: In min(x) : no non-missing arguments to min; returning Inf
# 3: In max(x) : no non-missing arguments to max; returning -Inf

enter image description here

z6 in theory should be represented by the dashed line below: enter image description here

I have developed methods for creating these straight lines when contour lines can't be generated, but I would like to be able to use a "mathematical/algebraical" approach to finding the point of intersection of the two lines, and not rely on the spatial/mapping/countour approach within R.

I have explored options using functions such as uniroot() and solve(), but have had limited success, possibly due to my limited understanding of the fundamental mathematics and/or the fact that the equation describing the hyperbola has both x and y terms?


My current idea involves writing a pair of equations, with an equal right hand side of 0 (apologies for incorrect maths language?):

enter image description here enter image description here

(or a trio of equations):

enter image description here enter image description here enter image description here

whereby as above, (xi, yi), (xj, yj) and (xk, yk) are the coordinates of three foci (i, j & k), and r is the range difference. xi, yi, xj, yj, xk, yk can be substituted for the coordinates of points a, b, c in the above example

Then, I think it should be possible to solve the equations using a similar process to that described here https://www.wikihow.com/Algebraically-Find-the-Intersection-of-Two-Lines, but I don't know how this translates into R code and/or existing R functions.


The aim of this is to determine the location of a transmitter (point of intersection), based on time-difference-of-arrival (TDOA) of transmissions received by receiving antennae (foci) and hyperbolic positioning principles and trilateration/multilateration (similar to how GPS works).

I will be dealing with thousands to millions of pairs of hyperbolae, so ideally this process is relatively fast and manageable by a regular computer. Also, not essential but hopefully achievable is the ability to 'scale up' a process to handle more than a pair/trio of equal equations (max ~10), in instances where the same transmission is received by more than three stations (eg 'number of satellites' when using GPS).

I would greatly appreciate absolutely any ideas or help anyone can offer, as I have been thinking about and working on trying to figure this out for quite some time. Thank you.

1

There are 1 best solutions below

0
On

Update:

For anyone playing along at home, the nleqslv::nleqslv() function can be used to solve simultaneous equations (non-linear equation solver; n-l-eq-slv()).

See various examples:

Basically, you supply nleqslv() with some starting values* x (from ?nleqslv(): an initial guess of the root), and a function fn which contains the equations to be solved.

For this example, the f1() function we specified in the question which calculates z values for each value of x and y, is the equation of each hyperbola, and forms the equations to be solved.

f1 <- function(xi, yi, xj, yj, x, y, r){
  sqrt((xj - x)^2 + (yj - y)^2) - sqrt((xi - x)^2 + (yi - y)^2) - r
}

This translates to something like this when using nleqslv():

nleqslv::nleqslv(c(x1_start, x2_start), function(x){
  z <- numeric()
  z[1] = sqrt((xj - x[1])^2 + (yj - x[2])^2) - sqrt((xi - x[1])^2 + (yi - x[2])^2) - r_ij
  z[2] = sqrt((xk - x[1])^2 + (yk - x[2])^2) - sqrt((xi - x[1])^2 + (yi - x[2])^2) - r_ik
  z
})

Where xi, yi etc are the x and y coords of point i, and r_ij is the range difference between the point of intersection and points i and j. x in the original equation is replaced with x[1], and y is replaced with x[2]; my understanding is that the resulting vector (slightly confusingly also named x) will have 2 components, which are x[1], x[2] in the function fn, or x, y in the original equation(s) based on f1(). x1_start and x2_start are the starting values of x[1] (x) and x[2] (y).

From our sample_points from the question, we call points a, b, c, i, j, k respectively. Then, substituting the values of xi, yi, xj, yj, xk, yk, d_ij, d_ik, and some starting values* of c(25, 25), we get something that looks like this:

library(nleqslv)

nleqslv::nleqslv(c(25, 25), function(x){
  z <- numeric()
  z[1] = sqrt((75 - x[1])^2 + (25 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
  z[2] = sqrt((50 - x[1])^2 + (75 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
  z
}) %>%
  {. ->> result_list}

nleqslv() produces a list with various values and messages explaining how the equations were solved (see ?nleqslv() for details), but the first component of the list named x contains the solutions, in a vector of length 2.

result_list[[1]]
# [1] 46.95886 42.22943

The first element is our value for x[1] (x in the original equation), and the second is x[2] (y).

As we can see, this point matches up with the point of intersection of our original hyperbolae:

# extract the values of x and y from result_list for plotting
px <- result_list[[1]][1]
py <- result_list[[1]][2]

# plot the sample hyperbolae as in the question
expand.grid(
  x = seq(0, 100, length = 100),
  y = seq(0, 100, length = 100)
) %>%
  as_tibble %>%
  mutate(
    # z = f1(x, y, nr)
    z1 = f1(xi = 25, yi = 25, xj = 75, yj = 25, x, y, r = 5), # i = point a, j = point b
    z2 = f1(xi = 25, yi = 25, xj = 50, yj = 75, x, y, r = 5)  # i = point a, j = point c
  ) %>%
  {. ->> hyp_data} %>%
  ggplot()+
  geom_contour(aes(x, y, z = z1), breaks = 0, colour = 1)+
  geom_contour(aes(x, y, z = z2), breaks = 0, colour = 2)+
  geom_point(data = sample_points, aes(x, y, color = name), size = 3)+
  coord_sf()+

  # and add in the newfound point
  geom_point(aes(x = px, y = py), size = 5, shape = 1, colour = 'red')

enter image description here

So, we have found the point of intersection of the hyperbolae - yay!

*Choosing starting values

However, the choice of starting values remains contentious. In this instance I chose the starting values of c(25, 25), based on the minimum x and y values of the sample_points, because I 'guessed' the root (intersection) must be somewhere near here.

Be aware that alternative starting values can return an incorrect root. Luckily, there seems to be ways to try and catch this out.

Part of result_list is termcd, which is the termination code. From what I have seen, values of 1 and 2 means the root has been found, but other values tend to indicate the (correct) root wasn't found (despite returning incorrect values to the user). See ?nleqslv for the full list of termination codes (personally I don't understand most of the words!).

As an example, we run nleqslv() using integar combinations of x and y between 0 and 100 to see which produce the correct root, and what values of termcd are returned:

# set up a grid with every combo of x and y between 0 and 100
expand.grid(
  x_grid = seq(0, 100, length = 101),
  y_grid = seq(0, 100, length = 101)
) %>%
  as_tibble %>%
  rowwise %>%

  # now we run nleqslv(), and extract the x, y, and termcd values
  mutate(

    px = nleqslv::nleqslv(c(x_grid, y_grid), function(x){
      z <- numeric()
      z[1] = sqrt((75 - x[1])^2 + (25 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z[2] = sqrt((50 - x[1])^2 + (75 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z
      })[[1]][1],

    py = nleqslv::nleqslv(c(x_grid, y_grid), function(x){
      z <- numeric()
      z[1] = sqrt((75 - x[1])^2 + (25 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z[2] = sqrt((50 - x[1])^2 + (75 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z
      })[[1]][2],

    termcd = nleqslv::nleqslv(c(x_grid, y_grid), function(x){
      z <- numeric()
      z[1] = sqrt((75 - x[1])^2 + (25 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z[2] = sqrt((50 - x[1])^2 + (75 - x[2])^2) - sqrt((25 - x[1])^2 + (25 - x[2])^2) - 5
      z
      })[[3]][1],

    ) %>%

  {. ->> grid_results}


# in this instance, we know the root/intersection is roughly (47, 42), so let's check
#   which starting values produce the correct result
grid_results %>%
  ungroup %>%

  # work out which starting values got the correct point
  mutate(
    correct = ifelse(round(px) == 47 & round(py) == 42, 'Y', 'N')
  ) %>%

  {. ->> grid_results_2} %>% 

  # then plot, showing colour for correct/incorrect, and include termcd as text
  ggplot()+
  geom_raster(aes(x_grid, y_grid, fill = factor(correct)))+
  coord_sf()+
  geom_text(aes(x_grid, y_grid, label = termcd), size = 2)+
  geom_point(data = sample_points, aes(x, y, color = name), size = 10)+
  geom_text(data = sample_points, aes(x, y, label = name))+
  geom_contour(data = hyp_data, aes(x, y, z = z1), breaks = 0, colour = 1, size = 1)+
  geom_contour(data = hyp_data, aes(x, y, z = z2), breaks = 0, colour = 2, size = 1)+
  labs(
    x = 'starting x',
    y = 'starting y',
    fill = 'Correct answer?',
    colour = 'sample_point name'
    )

enter image description here

Blue cells are starting values that returned the correct root, red cells returned the wrong root. The numbers are the termcd values, blue areas are either 1 or 2, red areas are other values.

So, as long as the starting values you choose are correct, you should get the right answer. It looks like the correct answers have termination codes of 1 & 2 exclusively, and termination codes of 1 and 2 have correct answers, exclusively, at least in this instance:

# which codes were returned for correct and incorrect answers?
grid_results_2 %>% 
  group_by(termcd) %>% 
  distinct(correct) %>% 
  arrange(termcd)

# # A tibble: 6 x 2
# # Groups:   termcd [6]
#  termcd correct
#   <int> <chr>
# 1     1 Y
# 2     2 Y
# 3     3 N
# 4     4 N
# 5     5 N
# 6     6 N

So for this example, I guess you can run through multiple options for starting values, and then keep only the values with termination codes of 1 or 2 to ensure you get the correct answer.

I guess a limitation of this approach is that you have to run through 10,000 iterations of starting values for each instance - when there is 10's to 100's of thousands (or millions) of instances, this may slow processing times down. To alleviate this, I ran the same code as above using 25 starting values rather than 10,000, and it returned the correct answer in most instances, with the same pattern of termination codes: enter image description here

For now, this one working solution.