Why does yolov7 merge the original pictures to create the batches for training?


I am trying to train a custom detector for my own pictures, e.g. of planets. I resized all the pictures to 1280x1280px. I start train.py with the following input:

python .\train.py --workers 1 --device 0 --batch-size 4 --epochs 100 --img-size 1280 1280 --data .\data\custom_data.yaml --hyp .\data\hyp.scratch.custom.yaml --cfg .\cfg\training\yolov7-custom.yaml --name yolov7-result --weights .\yolov7.pt

When looking at my generated training batch images, it seems like yolov7 "merged" the pictures, so the images it learns from look like abominations of the original data. Here, take a look:

[image: planets_train_batch — a saved training batch showing merged pictures]

It takes 4 pictures (as set with --batch-size 4), but they are not the original pictures. For example, planet_14.jpg looks like this:

[image: original planet_14.jpg]

As you can see, it gets merged with other pictures of planets from the train images folder. But why is that, and how do I prevent it from happening?

I tried the same with 640x640px; the result was the same, only with smaller batch images. I tried googling it, but nobody seemed to have a similar problem.

1 Answer
This is a data augmentation strategy called mosaic (I believe it was introduced in the YOLOv4 paper). As with any data augmentation strategy, the concept is simple: you have a limited amount of labeled training data, and in most machine learning tasks the performance of the model scales with the amount of (suitably unique) training data. By manipulating the limited training data you have (rotating, scaling, masking, changing color, combining multiple images, etc.), you can effectively increase the number of training examples. (Whether these augmented examples are as useful as new, unique examples is debatable, but that discussion is probably outside the scope of this question.)
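For intuition, here is a minimal sketch of the 4-image mosaic idea. This is not YOLOv7's actual code; the function name `mosaic4`, the gray padding value, and the fixed output size are all illustrative assumptions:

```python
import random
import numpy as np

def mosaic4(images, out_size=1280):
    """Illustrative sketch of 4-image mosaic augmentation.

    Picks a random center point and pastes a crop of one source image
    into each quadrant of a single output canvas. Assumes each source
    image is at least out_size x out_size pixels (as in the question,
    where everything was resized to 1280x1280).
    """
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)  # gray padding
    # Random mosaic center, kept away from the borders so no quadrant collapses.
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    # Quadrant bounds (x1, y1, x2, y2): top-left, top-right, bottom-left, bottom-right.
    quads = [(0, 0, cx, cy), (cx, 0, out_size, cy),
             (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x1, y1, x2, y2) in zip(images, quads):
        h, w = y2 - y1, x2 - x1
        # Paste a patch of the quadrant's size from the source image.
        canvas[y1:y2, x1:x2] = img[:h, :w]
    return canvas
```

In a real implementation the bounding-box labels of the four source images are translated and clipped into the new canvas as well, so the model still gets correct targets for the merged picture.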

In any case, without knowing the particular model implementation you're using, it's hard to say where under the hood this happens (in the dataset or in the model), but I'd guess it happens in the __getitem__ method of the dataset object. You can likely suppress various data augmentation strategies there.
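In YOLOv7 specifically, mosaic (and most other augmentations) is controlled by the hyperparameter file you already pass via --hyp, so you shouldn't need to touch the dataset code. If I remember the default hyp keys correctly, setting the probabilities to zero in your hyp.scratch.custom.yaml should stop the merging:

```yaml
# data/hyp.scratch.custom.yaml (key names as in the repo's default hyp files)
mosaic: 0.0   # probability of building a 4-image mosaic (1.0 = always, 0.0 = never)
mixup: 0.0    # blends two images together; also worth disabling for untouched inputs
```

After changing the file, rerun your train.py command and check the saved training batch images; they should now show the original, unmerged pictures.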