I'm downloading a lot of images from imgur.com with a Python script. Since all my links are in the format http://imgur.com/{id}, I force the download by replacing the original URL with http://i.imgur.com/{id}.gif and then save all the images without an extension. (I know there is an Imgur API, but I can't use it because it has limitations for this kind of job.)
Now, after downloading the images, I want to use the imghdr module to determine the original extension of each image:
>>> import imghdr
>>> imghdr.what('/images/GrEdc')
'gif'
The problem is that this only works about 80% of the time. For the remaining 20%, imghdr.what() returns None, and after checking some of them I noticed that they are most likely all .jpg images.
Why can't imghdr detect the format? I'm able to open these images with Ubuntu's default image viewer even without an extension, so I don't think they are corrupted.
That is a known problem in the library: it fails to detect some valid JPEG images.
You can use a modified version of the library that detects JPEG images more reliably, which suits your case especially well since you know for sure that all the files are images.
https://bugs.python.org/issue28591
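If you would rather not patch the library, here is a minimal sketch of the same idea using only the standard library. As far as I can tell, the stock JPEG test in imghdr looks for a JFIF or Exif marker a few bytes into the file, so JPEGs without those markers come back as None. The sketch below assumes your undetected files are JPEGs that still begin with the usual FF D8 FF start-of-image bytes. The function name sniff_image_type is hypothetical, and it only covers JPEG, GIF and PNG.

```python
def sniff_image_type(path):
    """Guess an image type from the first bytes of a file.

    Hypothetical helper, not part of imghdr. It only knows JPEG, GIF and
    PNG, and it trusts the signature without validating the rest of the file.
    """
    with open(path, 'rb') as f:
        header = f.read(32)

    # JPEG: every JPEG stream starts with the SOI marker FF D8, followed by
    # another marker byte FF. No JFIF/Exif string is required here.
    if header.startswith(b'\xff\xd8\xff'):
        return 'jpeg'
    # GIF: either of the two version signatures.
    if header.startswith((b'GIF87a', b'GIF89a')):
        return 'gif'
    # PNG: fixed 8-byte signature.
    if header.startswith(b'\x89PNG\r\n\x1a\n'):
        return 'png'
    return None
```

You could call this only for the files where imghdr.what() returned None, so the files that are already detected keep their current result.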
If you still fail to detect some images even with the fixed library, you can try Pillow, which supports a larger number of formats but is less lightweight and is an external dependency, not part of the Python standard library.
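A rough sketch of the Pillow fallback, assuming Pillow is installed (pip install Pillow). The function name detect_with_pillow is hypothetical. Image.open only reads enough of the file to identify it, and the format attribute holds a name such as 'JPEG', 'GIF' or 'PNG', which is lowercased here to resemble what imghdr returns.

```python
def detect_with_pillow(path):
    """Return a lowercase format name such as 'jpeg', or None.

    Hypothetical helper. Returns None when Pillow is not installed or when
    Pillow cannot identify the file.
    """
    try:
        from PIL import Image  # external dependency, imported lazily
    except ImportError:
        return None

    try:
        with Image.open(path) as img:
            return img.format.lower() if img.format else None
    except OSError:
        # Pillow raises an OSError subclass for files it cannot identify.
        return None
```

Importing Pillow lazily keeps the dependency optional, so the script still runs without it and simply leaves those files undetected.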