How to detect ending location (x,y,z) of certain sequence in 3D domain

122 Views Asked by At

I have protein 3D creo-EM scan, such that it contains a chain which bends and twists around itself - and has in 3-dimension space 2 chain endings (like continuous rope). I need to detect (x,y,z) location within given cube space of two or possibly multiplier of 2 endings. Cube space of scan is presented by densities in each voxel (in range 0 till 1) provided by scanning EM microscope, such that "existing matter" gives values closer to 1, and "no matter" gives density values closer to 0. I need a method to detect protein "rope" edges (possible "rope ending" definition is lack of continuation in certain tangled direction. Intuitively, I think there could be at least 2 methods: 1) Certain method in graph theory (I can't specify precisely - if you know one - please name or describe it. 2) Derivatives from analytic algebra - but again I can't specify specific attitude - so please name or explain one. Please specify computation complexity of suggested method. My project is implemented in Python. Please help. Thanks in advance.

2

There are 2 best solutions below

0
On

One approach would be to choose a threshold density, convert all voxels below this threshold to 0 and all above it to 1, and then look for the pair of 1-voxels whose shortest path is longest among all pairs of 1-voxels. These two voxels should be near the ends of the longest "rope", regardless of the exact shape that rope takes.

You can define a graph where there is a vertex for each 1-voxel and an edge between each 1-voxel and its 6 (or possibly 14) neighbours. You can then compute the lengths of the shortest paths between some given vertex u and every other vertex in O(|V|) time and space using breadth first search (we don't need Dijkstra or Floyd-Warshall here since every edge has weight 1). Repeating this for each possible start vertex u gives an O(|V|^2)-time algorithm. As you do this, keep track of the furthest pair so far.

If your voxel space has w*h*d cells, there could be w*h*d vertices in the graph (if every single voxel is a 1-voxel), so this could take O(w^2*h^2*d^2) time in the worst case, which is probably quite a lot. Luckily there are many ways to speed this up if you can afford a slightly imprecise answer:

  • Only compute shortest paths from start vertices that are at the boundary -- i.e. those vertices that have fewer than 6 (or 14) neighbours. (I believe this won't sacrifice an optimal solution.)
  • Alternatively, first "skeletonise" the graph by repeatedly getting rid of all such boundary vertices whose removal will not disconnect the graph.
  • A good order for choosing starting vertices is to first choose any vertex, and then always choose a vertex that was found to be at maximum possible distance from the last one (and which has not yet been tried, of course). This should get you a very good approximation to the longest shortest path after just 3 iterations: the furthest vertex from the start vertex will be near one of the two rope ends, and the furthest vertex from that vertex will be near the other end!

Note: If there is no full-voxel gap between distant points on the rope that are near each other due to bending, then the shortest paths will "short-circuit" through these false connections and possibly reduce the accuracy. You might be able to ameliorate this by increasing the threshold. OTOH, if the threshold is too high then the rope can become disconnected. I expect you want to choose the highest threshold that results in only 1 connected component.

0
On

If you want to enumerate each continuous path (thereby obtaining the ends of each path) in your 3D scan you could, for each position, apply a basic depth-first-search, for example:

//applied at some voxel
dfs(...)
    for each surrounding voxel 
        dfs(...)

Or in detail:

class coordinate{
    x
    y
    z
    visited
}

initialize pathList
initialize coords

add all coordinates which contain "matter" to coords

dfs(coordinate,path)
    coordinate.visited = TRUE
    isEnd = TRUE
    FOR each coordinate
        //check each of the 26 possible locations (total 26 conditionals)
        IF coordinate.get(x-1,y-1,z+1) IN coords AND 
           NOT coordinate.get(x-1,y-1,z+1).visited THEN
            isEnd = FALSE
            path += coordinate.get(x-1,y-1,z+1)
            dfs(coordinate.get(x-1,y-1,z+1),path)
        ...
        IF coordinate.get(x+1,y+1,z-1) IN coords AND 
           NOT coordinate.get(x+1,y+1,z-1).visited THEN 
            isEnd = FALSE
            path += coordinate.get(x+1,y+1,z-1) 
            dfs(coordinate.get(x+1,y+1,z-1),path)
    IF isEnd THEN
        add path to pathList
    remove coordinate from coords

WHILE coords isn't empty
    dfs(coords.get(0),"")

The general procedure (dfs) is well-documented on dozens of other sites but if you want to test it here's some crude java (I'm not too familiar with python) that mirrors what's above:

public class C {

    ArrayList<Coordinate> coords = new ArrayList<>();
    ArrayList<String> paths = new ArrayList<>();



    static class Coordinate {
        int x, y, z;
        boolean visited;
        Coordinate(int x,int y,int z){
            this.x = x;
            this.y = y;
            this.z = z;
            visited = false;
        }

        public String toString() {
            return "("+x+","+y+","+z+")";
        }
    }



    void dfs(Coordinate c,String path) {
        c.visited=true;
        path+=c.toString();
        boolean isEnd = true;
        //apply dfs to 26 possible neighbors
        for(int x = c.x-1; x <= c.x+1; x++) {
            for (int y = c.y-1; y <= c.y+1; y++) {
                for (int z = c.z+1; z >= c.z-1; z--) {
                    Coordinate coord = getCoordinate(x,y,z);
                    //if coord exists and it's not been visited
                    if(coord!=null && !coord.visited && !coord.equals(c)) {
                        isEnd = false;
                        dfs(coord, path);
                    }
                }
            }
        }
        if(isEnd) paths.add(path);

        coords.remove(c);
    }

    Coordinate getCoordinate(int x,int y,int z){
        for(Coordinate b: coords){
            if(x==b.x && y==b.y && z==b.z) return b;
        }
        return null;
    }


    void search(){
        //while there are points in 3d space extend a path from one
        while(!coords.isEmpty()){
            dfs(coords.get(0),"");
        }
    }




    public static void main(String[] args) {

        C coord = new C();

        //for each place where there exists matter
        //create a coordinate object and add to coords
        // for example:
        coord.coords.add(new Coordinate(0,0,0));
        coord.coords.add(new Coordinate(-1,1,1));
        coord.coords.add(new Coordinate(1,1,1));
        coord.coords.add(new Coordinate(-1,2,2));
        coord.coords.add(new Coordinate(-1,0,2));
        coord.coords.add(new Coordinate(1,2,2));
        coord.coords.add(new Coordinate(1,0,2));

        coord.search();
        //print out each path on separate line, 
        //the path endings can easily be obtained from this 
        for(String s:coord.paths) System.out.println(s);

    }
}