Find a pattern image (binary file)


For a string variable in DigitalMicrograph, we can find the position of a particular pattern using the "find" function:

Number find( String str, String sub_str )

I would like to do the same but with image data. For example, I can create an image with

image img := exprsize(1024, icol);

and the pattern I want to find is

image pattern := exprsize( 15, icol+64 );

In the above case, we know that the offset of the pattern w.r.t. the data is column number 64. In a real case we won't have such a simple pattern (i.e. a straight line). A brute-force approach with a "for" loop will certainly work, but it gets painfully slow as the data size grows. Does anyone have a better or more elegant suggestion? A 1D image may be easier; how about a 2D image?

Many thanks!


There are 3 best solutions below

1
On BEST ANSWER

As requested, here is a snippet showing how one could do a search in a "raw" data stream. I'm not claiming that the script below is the fastest or most elegant solution; it just shows how the relevant commands work. (You will find them documented in the "File Input and Output" section of the online F1 help.)

The idea behind it: search the stream only for occurrences of the last value of your search pattern. Only when one is found, check whether the first value, at the corresponding distance back, also matches. Only in that case, compare the whole pattern. This should be a useful method for long search patterns, but it may be less than optimal for very short ones.

{
    number patternSize = 8
    number dataSize = 24000
    number patternPos = trunc( random() * ( dataSize - patternSize ) )

    number const = 200
    number dataTypeSizeByte  = 4
    number stream_byte_order = 0

    // Prepare test-Dummies
        image searchSet := IntegerImage( "search", dataTypeSizeByte, 0, patternSize )
        searchSet = const * sin( icol/iwidth *  Pi() )
        // searchSet.ShowImage()

        image dataSet := IntegerImage( "data", dataTypeSizeByte, 0, dataSize ) 
        dataSet = const * random() * 0.3
        dataSet.Slice1( patternPos, 0, 0, 0, patternSize, 1 ) = searchSet
        // dataSet.ShowImage()

    // Prepare Data as RawStream
        object buffer = NewMemoryBuffer( dataSize * dataTypeSizeByte )
        object stream = NewStreamFromBuffer(buffer)
        dataSet.ImageWriteImageDataToStream( stream, stream_byte_order )
        stream.StreamSetPos(0,0)

    // Prepare aux. Tags for streaming
        TagGroup tg = NewTagGroup();
        tg.TagGroupSetTagAsUInt32( "UInt32_0", 0 )

    // Prepare values to search for 
        number startValue = searchSet.GetPixel(0,0)
        number lastValue =  searchSet.GetPixel(patternSize-1,0)

    // search for the pattern
        // Search for the LAST value of the pattern only.
        // If found, check whether the FIRST value, patternSize elements back, also matches.
        // Only then compare the whole pattern.

        number value
        number streamEndPos = stream.StreamGetSize() 
        number streamPos = (patternSize-1) * dataTypeSizeByte // we can skip the first few tests
        stream.StreamSetPos(0, streamPos )  
        while( streamPos < streamEndPos )
        {
            tg.TagGroupReadTagDataFromStream( "UInt32_0", stream, stream_byte_order )
            streamPos = stream.StreamGetPos()

            tg.TagGroupGetTagAsUInt32( "UInt32_0", value )  // use appropriate data type!
            if ( lastValue == value )
            {
                result("\n Pattern might end at: "+streamPos/dataTypeSizeByte)

                // shift to start-value (relative) to check first value!
                stream.StreamSetPos(1, -1 * patternSize * dataTypeSizeByte )    
                tg.TagGroupReadTagDataFromStream( "UInt32_0", stream, stream_byte_order )
                tg.TagGroupGetTagAsUInt32( "UInt32_0", value )  
                if ( startValue == value )
                {
                    result("\t (Start also fits!) " )

                    // Now check all of it!
                    stream.StreamSetPos(1, -1 * dataTypeSizeByte )  
                    image compTemp := IntegerImage( "SectionData", dataTypeSizeByte, 0, patternSize )
                    compTemp.ImageReadImageDataFromStream( stream, stream_byte_order )

                    if ( 0 == sum( abs(compTemp - searchSet) ) )
                    {
                        number foundPos = (stream.StreamGetPos()/dataTypeSizeByte - patternSize)
                        Result("\n Correct starting position: " + patternPos )
                        Result("\n Found starting position  : " + foundPos )
                        OKDialog( "Found subset at position : " + foundPos )
                        exit(0)
                    }       
                }
                stream.StreamSetPos(0, streamPos )  
            }   
        }
    OKDialog("Nothing found.")
}
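
The search order used in the script above (last value first, then first value, then the full pattern) can be sketched independently of the streaming commands. The following is a minimal illustration in plain Python rather than DM-script, operating on ordinary lists; the function name `find_pattern` is made up for this example and is not part of any DigitalMicrograph API.

```python
def find_pattern(data, pattern):
    """Return the first index where `pattern` occurs in `data`, or -1."""
    n, m = len(data), len(pattern)
    if m == 0 or m > n:
        return -1
    first, last = pattern[0], pattern[-1]
    # Scan candidate END positions; the first m-1 elements cannot end a match.
    for end in range(m - 1, n):
        if data[end] != last:
            continue                  # cheap rejection on the last value
        start = end - m + 1
        if data[start] != first:
            continue                  # second cheap check on the first value
        if data[start:end + 1] == pattern:
            return start              # full comparison only for real candidates
    return -1


data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
print(find_pattern(data, [5, 3, 5]))  # 8
print(find_pattern(data, [9, 9]))     # -1
```

The two cheap checks pay off when the last value of the pattern is rare in the data; if it is common, most positions still reach the full comparison.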
1
On

Given that you are effectively looking for an exact match to numeric data, judicious use of image expressions may be the most efficient path to a solution. Roughly following your example, we begin by setting up the source data and target pattern:

Image sourceData := RealImage("Source data", 4, 4096);
sourceData = Random();

Image targetPattern := RealImage("Target pattern", 4, 15);
targetPattern = sourceData.Index(icol + 1733, 0);

Then we prepare a carefully arranged search buffer with a single image expression:

Number targetSize = targetPattern.ImageGetDimensionSize(0);
Number searchBufferW = sourceData.ImageGetDimensionSize(0) - targetSize + 1; // +1 so the last possible start offset is included
Image searchBuffer := RealImage("Search buffer", 4, searchBufferW, targetSize);
searchBuffer = sourceData.Index(icol + irow, 0);

This arranges all potential matching subsets of the source data in vertical columns of a 2D image. Finally we do a little image math to locate the match to the target pattern, if one exists:

searchBuffer = Abs(searchBuffer - targetPattern.Index(irow, 0));
Image projectionVector := targetPattern.ImageClone();
projectionVector = 1.0;
Image searchResult := projectionVector.MatrixMultiply(searchBuffer);

Number posX, posY;
Number wasFound = (searchResult.Min(posX, posY) == 0);
String resultMsg = (wasFound) ? "Pattern found at " + posX : "Pattern not found";
OKDialog(resultMsg);

The first line will yield an exact zero in every pixel of the search buffer column that matches the target pattern. Vertically summing the search buffer and using the Min() function to find a zero speeds up the search for a match.

Note the use of MatrixMultiply() to do a rapid vertical sum projection. This will only work for type Real (4-byte floating point) source data. There are, however, slightly more complex approaches to rapid data projection that will also give a fairly quick result for any numeric data type.

Although illustrated for a 1D pattern in a 1D data set, this approach can probably be extended to 1D and 2D patterns in 2D and 3D data sets by using a multi-dimensioned search buffer and more advanced indexing using ImageDataSlice objects, but that would be a subject for another question.
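
To make the layout of the search buffer concrete, here is a small sketch of the same steps in plain Python (not DM-script), using nested lists in place of images. The name `find_by_window_sums` is hypothetical, and the explicit column sum stands in for the MatrixMultiply() projection.

```python
def find_by_window_sums(source, target):
    """Return the offset of an exact match of `target` in `source`, or -1."""
    n, m = len(source), len(target)
    width = n - m + 1                 # number of possible start offsets
    if m == 0 or width <= 0:
        return -1
    # buffer[row][col] = source[col + row]: column `col` holds the
    # candidate window that starts at offset `col`.
    buffer = [[source[col + row] for col in range(width)] for row in range(m)]
    # Absolute difference to the target, compared row by row.
    diff = [[abs(buffer[row][col] - target[row]) for col in range(width)]
            for row in range(m)]
    # Vertical sum projection: one value per candidate offset.
    col_sums = [sum(diff[row][col] for row in range(m)) for col in range(width)]
    best = min(range(width), key=col_sums.__getitem__)
    return best if col_sums[best] == 0 else -1


source = [0.5, 0.25, 0.75, 0.125, 0.625, 0.375]
print(find_by_window_sums(source, source[2:5]))          # 2
print(find_by_window_sums(source, [0.75, 0.125, 0.5]))   # -1
```

The buffer holds roughly (source size) x (pattern size) values, so this trades memory for the speed of whole-image operations.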

2
On

As Mike has pointed out, a cross-correlation is a good way to search for a pattern in the presence of noise. However, it works even better (if not perfectly) in the absence of noise! This works for both 1D and 2D data in scripting. See below:

number sx = 1024
number sy = 1024
number pw = 32
number ph = 32
number px = 100 // trunc( random()*(sx-pw) )
number py = 200 // trunc( random()*(sy-ph) )

image test := RealImage("Data",4,sx,sy)
test = random()
image pattern := test[py,px,py+ph,px+pw].ImageClone()
//test.showimage()
//pattern.showimage()
image patternSearch = test*0
patternSearch[0,0,ph,pw] = pattern
//patternSearch.ShowImage()

image corr := CrossCorrelate(test,patternSearch)
corr.ShowImage()
number mx,my,mv
mv = max(corr,mx,my)
mx -= trunc(sx/2)       // because we've placed the pattern in the 
my -= trunc(sy/2)       // top/left of the search-mask
Result("\n Pattern = " + px + " / " + py )
Result("\n max = " + mv + " at " + mx + "/" + my )

image found = test*0
found[my,mx,my+ph,mx+pw]=pattern
rgbImage overlay = RGB((test-found)*256,found*256,0)
overlay.ShowImage()

If your problem is only 1D and you have very large data, then an alternative approach might give you a quicker solution. I would then suggest trying RAW-data streaming (via the TagGroup streaming commands) and using any additional information you have to narrow the search, i.e. search only for the beginning of a pattern in the stream and then verify only on a "hit", etc.

Notes added here to address an issue with searching for a pattern in a 1D image. If we run the following script several times, we find that it fails to locate the pattern correctly about 50% of the time.

number sx = 1024
number sy = 0
number pw = 16
number ph = 0
number px = trunc( random()*(sx-pw) )
number py = 0 // trunc( random()*(sy-ph) )

image test := RealImage("Data",4,sx );
test = random();
image patternSearch := exprsize( sx, icol<pw? test[icol+px, irow]: 0 );
// test.ShowImage();
// patternSearch.ShowImage();
patternSearch.SetName( "PatternSearch" );
//

image corr := CrossCorrelate(test,patternSearch)
// corr.ShowImage()
number mx,my,mv
mv = max(corr,mx,my)
mx -= trunc(sx/2)       // because we've placed the pattern in the 
my -= trunc(sy/2)       // top/left of the search-mask
if( mx <= 0 ) mx += sx;
Result("\n\n Pattern = " + px + " / " + py )
Result("\n max = " + mv + " at " + mx + "/" + my )
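
One general property of cross-correlation may be worth keeping in mind here; it is offered as a possible contributing factor, not as a diagnosis of the script above. An unnormalized correlation score grows with the magnitude of the data, so its maximum can land on a bright region rather than on the true pattern position. The sketch below, in plain Python rather than DM-script and with the made-up helper names `correlate` and `find_verified`, shows such a case and a simple remedy: treat correlation peaks as candidates and confirm each with an exact comparison.

```python
def correlate(data, pattern):
    """Direct (non-FFT, non-wrapping) cross-correlation scores per offset."""
    m = len(pattern)
    return [sum(data[i + k] * pattern[k] for k in range(m))
            for i in range(len(data) - m + 1)]


def find_verified(data, pattern):
    """Try offsets from highest score down; accept the first exact match."""
    m = len(pattern)
    scores = correlate(data, pattern)
    for i in sorted(range(len(scores)), key=lambda j: scores[j], reverse=True):
        if data[i:i + m] == pattern:
            return i
    return -1


data = [1, 2, 1, 1, 9, 9, 9, 9]
pattern = [1, 2, 1]                    # really located at offset 0
scores = correlate(data, pattern)
peak = scores.index(max(scores))
print(scores)                          # [6, 5, 12, 28, 36, 36]
print(peak)                            # 4, the bright region, not offset 0
print(find_verified(data, pattern))    # 0
```

In the worst case the verification loop checks every offset, so it is only worthwhile when the true position ranks near the top of the scores.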