This is a technical question on preparing a dataset.
I'm trying to follow this official example
https://github.com/pytorch/examples/tree/master/imagenet
but I can't even get started because I don't understand the requirements. It says:
- Install PyTorch (pytorch.org)
- pip install -r requirements.txt
- Download the ImageNet dataset from http://www.image-net.org/ Then, and move validation images to labeled subfolders, using the following shell script
For the first requirement, I'm working on Colab, so I don't think I need to install PyTorch again on my local pc.
The second one doesn't work, as there's obviously no module named "requirements.txt". This is where I began to realize there's something about this git repo that I completely don't understand how to use. Anyway, I opened the text file from the git repo directly, and it just lists torch and torchvision. Okay, I have no problem importing them.
The third requirement. So I went to the ImageNet website and signed the agreement for research use. Now the requirement tells me to download THE ImageNet data, but I see a bunch of options there (by year of publication, by purpose such as a competition, by resolution, etc.). Which one is THE DATASET?
I'm new to PyTorch, and I think I'm missing some convention about how the PyTorch dev community provides examples in this way...
Any help will be appreciated. Thank you.
It's the requirements.txt file in that repo. You can list package names in a file like this and install them all at once with pip, which is what pip install -r requirements.txt does. Of course, since it only contains torch and torchvision, you don't need to install anything, as these are already installed on Google Colab.
I can't access this page without signing up, but you can download any dataset (from any year, etc.). The important thing is that, in order to train on it in PyTorch using the ImageFolder API (which is the one used in the repo you mentioned), the data should be laid out with one subfolder per class, each containing that class's images. You can use the script they mentioned for the ImageNet data to do so.
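To make the expected layout concrete, here is a minimal Python sketch of the idea behind such a script. This is not the repo's script: the function names and the labels mapping are hypothetical, made up for illustration. It moves each image into a subfolder named after its class, then shows how sorted subfolder names turn into integer labels, which is how ImageFolder assigns class indices.

```python
import shutil
from pathlib import Path


def sort_into_class_folders(image_dir, labels):
    """Move each image in image_dir into a subfolder named after its class.

    labels is a hypothetical dict mapping a file name to a class name.
    For the ImageNet validation set, that mapping is what the repo's
    shell script encodes; here it is simply passed in.
    """
    image_dir = Path(image_dir)
    for file_name, class_name in labels.items():
        class_dir = image_dir / class_name
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(image_dir / file_name), str(class_dir / file_name))


def class_to_index(root):
    """Map sorted subfolder names to integer labels (0, 1, 2, ...)."""
    classes = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(classes)}
```

With the images arranged like this (root/class_name/image_file), pointing torchvision.datasets.ImageFolder at the root folder should pick up the classes from the subfolder names.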
If you're just getting started with PyTorch, I'd advise you to go through the PyTorch tutorials, such as this one.