COCO dataset: number of images per class

I see that COCO 2017 has 80 classes, with about 118k training images and 5k validation images (roughly 123k images in total). I have a question here: is the number of images per class simply ~123k / 80, i.e., roughly 1,540 images per class?

There is 1 answer below

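The COCO dataset is not evenly distributed, i.e., the classes do not all have the same number of images. In addition, a single image usually contains objects from several classes, so it counts toward each of them, and dividing the total number of images by 80 does not give a meaningful per-class figure. So, let me show you a way to find the number of images for any class you wish.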

I am using the COCO Python API (pycocotools) to work with the COCO dataset. Let's find the number of images that contain the 'person' class. Here is a code snippet that filters the images for any class in the COCO dataset:

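from pycocotools.coco import COCO

# Load an annotation file (example path; point this at your local copy)
coco = COCO('annotations/instances_train2017.json')

# Define the class (out of the 80 COCO classes)
filterClasses = ['person']

# Fetch only the category IDs corresponding to filterClasses
catIds = coco.getCatIds(catNms=filterClasses)

# Get the IDs of all images that contain the above category IDs
# (with several categories, only images containing all of them are returned)
imgIds = coco.getImgIds(catIds=catIds)
print("Number of images containing the class:", len(imgIds))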

That gives us the number of images that contain at least one 'person' in the loaded annotation file!
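To get the count for every class at once, here is a minimal sketch that uses only the Python standard library instead of the COCO API. It assumes the usual COCO instances JSON layout, with top-level `images`, `annotations`, and `categories` lists. The function names `images_per_class` and `load_annotations` and the toy data are my own, purely for illustration.

```python
import json
from collections import defaultdict


def images_per_class(dataset):
    """Count distinct images per category in a COCO-style annotation dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in dataset["categories"]}
    images_by_cat = defaultdict(set)
    for ann in dataset["annotations"]:
        # A set ensures an image with several instances of a class counts once.
        images_by_cat[ann["category_id"]].add(ann["image_id"])
    return {name: len(images_by_cat[cat_id]) for cat_id, name in id_to_name.items()}


def load_annotations(path):
    """Load an annotation file such as instances_train2017.json (path is up to you)."""
    with open(path) as f:
        return json.load(f)


# Toy stand-in for a real annotation file: 3 images, 3 categories.
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 1},   # second person in the same image
        {"image_id": 1, "category_id": 18},  # image 1 also contains a dog
        {"image_id": 2, "category_id": 1},
        {"image_id": 3, "category_id": 3},
    ],
}

counts = images_per_class(toy)
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(name, n)
```

On a real file you would call `images_per_class(load_annotations(path))`. Note that in the toy data the per-class counts add up to more than the number of images, because image 1 contains both a person and a dog; this is the same reason the total divided by 80 is not a useful estimate.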

I have recently written an entire post on exploring and manipulating the COCO dataset. Do have a look to get more details and the entire code.