How to extract metadata from .heic image files on Windows 11 with Python?

I'm running Python 3.10 on Windows 11. I need to extract metadata from .heic image files. Here is what I tried:

1. ExifRead

I tried with ExifRead (see https://pypi.org/project/ExifRead/) but that failed:

>>> import exifread
>>> f = open("path/to/img.heic", 'rb')
>>> tags = exifread.process_file(f)
Traceback (most recent call last):
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 171, in get_parser
    return defs[box.name]
KeyError: 'hdlr'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python310\lib\site-packages\exifread\__init__.py", line 137, in process_file
    offset, endian, fake_exif = _determine_type(fh)
  File "C:\Python310\lib\site-packages\exifread\__init__.py", line 109, in _determine_type
    offset, endian = heic.find_exif()
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 268, in find_exif
    meta = self.expect_parse('meta')
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 159, in expect_parse
    return self.parse_box(box)
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 177, in parse_box
    probe(box)
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 195, in _parse_meta
    psub = self.get_parser(box)
  File "C:\Python310\lib\site-packages\exifread\heic.py", line 173, in get_parser
    raise NoParser(box.name) from err
exifread.heic.NoParser: hdlr

2. pyheif

I tried to install the pyheif module, but there is no build for Windows.

3. pillow

I tried with the pillow module (aka PIL):

>>> from PIL import Image
>>> img = Image.open("path/to/img.HEIC")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python310\lib\site-packages\PIL\Image.py", line 3280, in open
    raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file 'C:/Backup/Pictures_2023/IMG_0620.HEIC'

There are 3 best solutions below

Answer by K.Mulier

I found a way to extract file metadata (file-system attributes, not EXIF tags) in Python 3.10 on Windows 11 by calling wmic:

import subprocess

def get_photo_metadata(filepath):
    filepath = filepath.replace('/', '\\')
    filepath = filepath.replace('\\', '\\\\')
    cmd = f'cmd.exe /c wmic datafile "{filepath}" list full'
    output = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        shell=True,
    ).communicate()[0]
    output_utf = output.decode('utf-8', errors='ignore')
    return output_utf

print(
    get_photo_metadata("path/to/img.HEIC")
)

This prints:

AccessMask=1507775
Archive=TRUE
Caption=C:\Backup\Pictures_2023\CHINA\202308_b\IMG_0620.HEIC
Compressed=FALSE
CompressionMethod=
CreationClassName=CIM_LogicalFile
CreationDate=20230817080632.333836+120
CSCreationClassName=Win32_ComputerSystem
CSName=SKIKK-2022
Description=C:\Backup\Pictures_2023\CHINA\202308_b\IMG_0620.HEIC
Drive=c:
EightDotThreeFileName=c:\backup\pictures_2023\china\202308_b\img_06~3.hei
Encrypted=FALSE
EncryptionMethod=
Extension=HEIC
FileName=IMG_0620
FileSize=1823601
FileType=HEIC File
FSCreationClassName=Win32_FileSystem
FSName=NTFS
Hidden=FALSE
InstallDate=20230817080632.333836+120
InUseCount=
LastAccessed=20230904163653.526369+120
LastModified=20230808123918.000000+120
Manufacturer=
Name=C:\Backup\Pictures_2023\CHINA\202308_b\IMG_0620.HEIC
Path=\backup\pictures_2023\china\202308_b\
Readable=TRUE
Status=OK
System=FALSE
Version=
Writeable=TRUE
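
Since the output is plain Key=Value text, it can be turned into a dict for easier lookups. This is only a minimal sketch: parse_wmic_output is a name I made up, and the sample string is a shortened copy of the output above.

```python
def parse_wmic_output(text: str) -> dict[str, str]:
    """Turn 'Key=Value' lines from `wmic ... list full` into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not Key=Value
        key, _, value = line.partition("=")
        fields[key] = value  # empty values (e.g. Manufacturer=) become ""
    return fields


# Shortened copy of the wmic output shown above.
sample = "FileName=IMG_0620\nFileSize=1823601\nManufacturer=\n"
info = parse_wmic_output(sample)
print(info["FileName"], int(info["FileSize"]))
```

In practice you would pass the string returned by get_photo_metadata() instead of the sample.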

Note: I made the following script to extract the dates (creation, install, last accessed, last modified) from that output, sorted oldest first:

import subprocess
import re
import datetime

# Matches WMI timestamps such as 20230817080632.333836 (the UTC offset is ignored)
p = re.compile(r'(20\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)\.(\d{6})')

def get_photo_dates(filepath) -> list[datetime.datetime]:
    filepath = filepath.replace('/', '\\')
    filepath = filepath.replace('\\', '\\\\')
    cmd = f'cmd.exe /c wmic datafile "{filepath}" list full'
    output = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        shell=True,
    ).communicate()[0]
    output_utf = output.decode('utf-8', errors='ignore')
    dates = []
    for m in p.finditer(output_utf):
        dates.append(
            datetime.datetime(
                year        = int(m.group(1)),
                month       = int(m.group(2)),
                day         = int(m.group(3)),
                hour        = int(m.group(4)),
                minute      = int(m.group(5)),
                second      = int(m.group(6)),
                microsecond = int(m.group(7)),
            )
        )
    return sorted(dates)
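
The script above drops the trailing "+120", which is the UTC offset in minutes in WMI's timestamp format. If you want timezone-aware values, something along these lines should work. It is a sketch rather than a tested replacement; parse_cim_datetime and CIM_DATETIME are names I made up.

```python
import datetime
import re

# Layout: yyyymmddHHMMSS.mmmmmm followed by a signed UTC offset in minutes.
CIM_DATETIME = re.compile(r"(\d{14})\.(\d{6})([+-])(\d{3})")


def parse_cim_datetime(value: str) -> datetime.datetime:
    """Convert one WMI timestamp string into a timezone-aware datetime."""
    m = CIM_DATETIME.fullmatch(value.strip())
    if m is None:
        raise ValueError(f"not a CIM datetime: {value!r}")
    stamp, micros, sign, minutes = m.groups()
    offset = datetime.timedelta(minutes=int(minutes))
    if sign == "-":
        offset = -offset
    naive = datetime.datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return naive.replace(microsecond=int(micros),
                         tzinfo=datetime.timezone(offset))


# CreationDate value taken from the output above.
created = parse_cim_datetime("20230817080632.333836+120")
print(created.isoformat())
```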

Answer by James

I too have been struggling with this. ExifTool works well, but the setup below relies on the Windows executable (I'm tearing my hair out trying to find an OS-agnostic alternative that can be installed easily from requirements.txt).

Anyhoo, here you go:

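import os
import exiftool               # Install PyExifTool

# Resolved against the current working directory, so run the script from its own folder
exiftool_path = os.path.abspath("./exiftool.exe")
os.environ['EXIFTOOL_PATH'] = exiftool_path

def get_metadata(files):
    # Pass the executable explicitly as well, in case it is not found via PATH
    with exiftool.ExifToolHelper(executable=exiftool_path) as et:
        return et.get_metadata(files)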

Download and rename exiftool as per https://exiftool.org/install.html but instead of putting it in your Windows folder, put it in the same folder as your script. The lines right after the import take care of the 'PATH' stuff.

The 'files' variable can be a path, or a list of paths (for working with batches). It returns a list of dicts - one for each file.
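
To pull just a few fields out of that list of dicts, a small helper like the one below might do. It is a sketch: pick_tags is a made-up name, the sample values are invented for illustration, and I'm assuming the keys are group-prefixed (e.g. "EXIF:DateTimeOriginal") with a "SourceFile" entry per file, so check the actual keys on your own files first.

```python
def pick_tags(metadata, wanted=("EXIF:DateTimeOriginal", "EXIF:Make", "EXIF:Model")):
    """Reduce each per-file dict to its source path plus a few chosen tags."""
    rows = []
    for entry in metadata:
        row = {"SourceFile": entry.get("SourceFile")}
        for tag in wanted:
            row[tag] = entry.get(tag)  # None when the tag is absent
        rows.append(row)
    return rows


# Hypothetical stand-in for the return value of get_metadata(files);
# key names and values are assumptions, not real exiftool output.
sample = [
    {
        "SourceFile": "path/to/img.HEIC",
        "EXIF:Make": "Apple",
        "EXIF:DateTimeOriginal": "2023:08:08 12:39:18",
        "File:FileSize": 1823601,
    },
]
for row in pick_tags(sample):
    print(row)
```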

Answer by Jeff

Apparently downgrading to exifread 2.x also solves the ExifRead error from the question (NoParser: hdlr). See the comment here: https://github.com/ianare/exif-py/issues/184#issuecomment-1856296220