I've been trying to figure out how to find a O(n) time complexity algorithm to solve the following in Java:
We are given an input pair with a start point and end point, and we have to construct a path such that the start of one input matches the end of another input (in this case, alphabetically)
EX: if I have a list
P B
B M
O P
M Z
where PB, or BM are pairs, P being the start point and B being the end
I should create as output
O P
P B
B M
M Z
And naturally, I want to check for cycles -- in this case I will just report if there is a cycle.
The first time I tried to do this I used a swapping algorithm that was O(N^2), by basically swapping two entries if there was a match and moving down the list. There is a definitely a faster way to do this, and I understand semi-intuitively how to do it but I was wondering if someone could clear it up a little.
Basically, I'm assuming you need to make some sort of hash/key type of structure, where you can reference the value of the object itself by a key.
For example, you would keep two collections, one of starts and one of ends. You could take each input and create an object that has a start and end field, and then add all of the starts and ends to two arrays. Then you would have to find basically end[start] for each start and add its respective object to a list after finding them.
My only thing is, I can't figure out exactly how to implement this in Java using a hash table or some similar data structure. Do I have to add the starts and ends to separate hash tables and then use the start points as lookup keys?
Pseudocode from looking at someone who solved it in python on github:
for inputs
parse input
add parse[1] to starts, add parse[2] to ends
for starts
find origin (a start not in ends) <--requires hash?
if no origin
cycle exists
for inputs
find ends[origin] <--requires hash?
origin = ends[origin] <-- so we can find the next one
Wondering if someone could help me translate this to an algorithm in Java (other efficient solutions very welcome since I'm interested in this type of problem solving) or understanding it more generally from a data structures perspective.
Here's a simple implementation using HashMaps in Java:
end
stores the end point (value) given the start point (key).If a point exists as a start point but not an end point then it will be a key in
end
but not a value inend
. So iterating over the keySet ofend
and checking ifend
contains each key as a value will find an origin. If there is no such origin then there is a cycle. However, this is not sufficient to determine if there is a cycle because there may be a cycle even if there is an origin. Consider this set:There is a unique origin (O) but there is also a cycle M -> Z -> M. To detect this you can walk the set and keep track of which point you have already visited, or more simply you can walk the set and if you end up visiting more points than the length of the input then there must be a cycle.
To walk the set, set a string to the origin. Then keep looking up the value in
end
givenorigin
as the key, and print out the key, value pair (origin, end.get(origin)) as the start, end. Then set end.get(origin) as the new origin. The loop terminates when there is no value found in the end HashMap for the current origin.This algorithm fails if there are duplicate start or end positions in the list, e.g:
It's not clear from the question if you have to handle this case.