Running a TensorFlow Image Recognition API to search for an object


TensorFlow has an API that uses the Inception v3 model to identify objects. I was wondering whether there is any way to locate smaller objects within a larger image, for example all the oranges on an orange tree. I tried splitting the larger image into a grid of smaller images and running TensorFlow on each one, but a fixed grid is extremely error-prone. Is there a better approach?

Best answer

The term you're looking for is object detection. One way to do it is a sliding window applied at several scales. There are probably better methods out there, but I don't know what they are.

Let's say some oranges are closer to the camera than others, so they appear at different sizes. Start with a small box, say 10x10 pixels, in the top-left corner and see whether your model classifies its contents as an orange. Move the box right by a small step, say 2 pixels, and try again. Keep moving right until you reach the edge, then move down 2 pixels and start a new row, and so on. Then shrink the image (so the same box now covers bigger oranges) and repeat the whole process. Searching for "sliding window detection" and "image pyramid" will turn up more detail.
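
The scan described above could be sketched roughly like this. It is a minimal illustration, not a tuned detector. Instead of physically resizing the image, it maps each window back to original-image coordinates, which covers the same regions as scanning a shrunken copy. The names `sliding_windows`, `detect` and `score_fn` are made up for this example; `score_fn` stands in for whatever crops the box, resizes it to the classifier's input size and returns the model's "orange" probability.

```python
from typing import Callable, Iterator, List, Tuple

# (left, top, right, bottom) in original-image pixels
Box = Tuple[int, int, int, int]


def pyramid_scales(width: int, height: int, window: int,
                   shrink: float = 1.5) -> Iterator[float]:
    """Yield downscale factors 1, shrink, shrink**2, ... while the window still fits."""
    scale = 1.0
    while width / scale >= window and height / scale >= window:
        yield scale
        scale *= shrink


def sliding_windows(width: int, height: int, window: int, stride: int,
                    shrink: float = 1.5) -> Iterator[Box]:
    """Yield candidate boxes for every pyramid level, in original-image coordinates."""
    for scale in pyramid_scales(width, height, window, shrink):
        # Size the image would have after being shrunk by this factor.
        scaled_w = int(width / scale)
        scaled_h = int(height / scale)
        # Scan row by row, left to right, with a fixed-size window.
        for top in range(0, scaled_h - window + 1, stride):
            for left in range(0, scaled_w - window + 1, stride):
                # Map the window back onto the original image.
                yield (int(left * scale), int(top * scale),
                       int((left + window) * scale), int((top + window) * scale))


def detect(width: int, height: int, score_fn: Callable[[Box], float],
           window: int = 10, stride: int = 2,
           threshold: float = 0.5) -> List[Tuple[Box, float]]:
    """Score every candidate box with score_fn (hypothetical) and keep the confident ones."""
    detections = []
    for box in sliding_windows(width, height, window, stride):
        score = score_fn(box)
        if score >= threshold:
            detections.append((box, score))
    return detections
```

The window size, stride, shrink factor and threshold are placeholders. A small stride gives many more windows to classify, so in practice you would trade accuracy against how long you are willing to wait.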

Once you've gone through the whole image, you'll have a bunch of detections, often several overlapping ones for the same orange. You'll need to perform non-maximum suppression on them: keep the highest-scoring detection in each cluster of overlapping boxes and discard the rest.
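
One common way to do non-maximum suppression is the greedy version sketched below: sort detections by score, then keep a box only if it does not overlap too much with a box already kept. Overlap is measured as intersection over union (IoU). The function names and the 0.3 threshold are assumptions for this example, and detections are assumed to be `((left, top, right, bottom), score)` pairs.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections: List[Tuple[Box, float]],
                        iou_threshold: float = 0.3) -> List[Tuple[Box, float]]:
    """Greedy NMS: visit boxes from highest to lowest score, drop heavy overlaps."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep this box only if it is not a near-duplicate of a better one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

For example, two heavily overlapping boxes around the same orange collapse to the higher-scoring one, while a box around a different orange elsewhere in the image is left alone.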