Content search of uploaded images in MongoDB


I need to search the contents of files that are uploaded to the database, such as images (BMP, TIFF, PNG) or PDFs.

I am using the latest MongoDB release and storing images (PNG, BMP, JPG) and documents with GridFS, which stores the data as binary. As I understand it, MongoDB stores data in two ways: as binary or as JSON.

MongoDB does not provide a way to search the contents of an image directly. The alternative for me is OCR, but OCR returns its result as a string, so I would have to convert that to valid JSON to store it in the database. If that is my only option, how do I convert the string to valid JSON?

I am trying to store a text file in MongoDB with the following code.

// result5.txt is a text file that is the result of OCR.
string text = System.IO.File.ReadAllText("E:\\result5.txt");
var document = BsonSerializer.Deserialize<BsonDocument>(text);
var collection = Database.GetCollection("articles");
collection.Insert(text);

But I am getting an error:

MongoCommandException: Command insert failed: Wrong type for documents[0]. Expected a object, got a string.

How can I search within an image file that I have uploaded to the database?

Any suggestions will be appreciated. Admins, please don't turn off comments for this post. Thanks.

The text data is stored in this form: [image]

Best answer

Just create a new class to hold the OCR results:

public class OcrContainer
{
    public BsonObjectId Id { get; set; }
    public string OcrResult { get; set; }
}

and then store it in MongoDB:

var collection = Database.GetCollection<OcrContainer>("articles");
collection.InsertOne(new OcrContainer {OcrResult = text});
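
If you would rather not define a class, the same idea should work with a raw BsonDocument: wrap the OCR string in a field instead of trying to parse it as JSON. The original error comes from the fact that plain OCR text is not a JSON object, so it cannot be deserialized into a BsonDocument or inserted as one. A minimal sketch, assuming the MongoDB.Bson package is referenced; the helper OcrDocuments.Wrap and the field names SourceFile and OcrResult are only illustrative:

```csharp
using MongoDB.Bson;

public static class OcrDocuments
{
    // Wraps raw OCR text in a document so MongoDB receives an object, not a bare string.
    // "SourceFile" and "OcrResult" are illustrative field names, not anything MongoDB requires.
    public static BsonDocument Wrap(string ocrText, string sourceFile)
    {
        return new BsonDocument
        {
            { "SourceFile", sourceFile },
            { "OcrResult", ocrText }
        };
    }
}
```

With that in place, something like Database.GetCollection<BsonDocument>("articles").InsertOne(OcrDocuments.Wrap(text, "result5.txt")) would store the text without any manual JSON conversion, since the driver handles escaping of quotes and newlines.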

After that you can search your results:

collection.Find(x=>x.OcrResult.Contains("bla"))
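
Keep in mind that Contains is translated to a regular-expression match, which usually cannot make good use of an index on large collections. If you mostly search for whole words, a MongoDB text index may be a better fit (it matches words, not arbitrary substrings). A rough sketch, assuming the 2.x C# driver (MongoDB.Driver) and the OcrContainer class above; OcrSearch, EnsureTextIndex and SearchWords are made-up names for illustration:

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Same shape as the OcrContainer class from the answer, repeated so the sketch is self-contained.
public class OcrContainer
{
    public BsonObjectId Id { get; set; }
    public string OcrResult { get; set; }
}

public static class OcrSearch
{
    // Creates a text index on OcrResult. Calling it again with the same definition is harmless.
    public static void EnsureTextIndex(IMongoCollection<OcrContainer> collection)
    {
        var keys = Builders<OcrContainer>.IndexKeys.Text(x => x.OcrResult);
        collection.Indexes.CreateOne(new CreateIndexModel<OcrContainer>(keys));
    }

    // Returns documents whose OCR text matches any of the space-separated words.
    public static List<OcrContainer> SearchWords(IMongoCollection<OcrContainer> collection, string words)
    {
        var filter = Builders<OcrContainer>.Filter.Text(words);
        return collection.Find(filter).ToList();
    }
}
```

You would call OcrSearch.EnsureTextIndex(collection) once at startup and then OcrSearch.SearchWords(collection, "bla") for each query.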

But what are you going to do with it? You will need more properties in OcrContainer to connect the OCR results with your other data.
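
As a sketch of what those extra properties might look like, assuming the image itself stays in GridFS: the container can carry the id of the stored file so that a text hit can be traced back to its image. FileId, FileName and ProcessedAt are hypothetical names chosen for this example, not something the driver requires.

```csharp
using System;
using MongoDB.Bson;

// Hypothetical extension of OcrContainer; every added property name is illustrative.
public class OcrContainer
{
    public BsonObjectId Id { get; set; }

    // _id of the GridFS file that the text was extracted from.
    public ObjectId FileId { get; set; }

    // Original file name, kept only for convenience when displaying results.
    public string FileName { get; set; }

    // Raw text produced by the OCR step.
    public string OcrResult { get; set; }

    // When the OCR step ran (UTC), useful if files are re-processed later.
    public DateTime ProcessedAt { get; set; }
}
```

FileId could be filled with the ObjectId that the GridFS upload returns, and the search result then tells you which file to download.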