Python 3.6
I'm trying to archive some old mails, and I want to remove attachments from some of them.
However, if I use the clear() method, the MIME part remains in the mail, just empty (so it's assumed to be of type text/plain). I came up with a really hacky solution of converting the EmailMessage object to text then removing any boundary lines that aren't followed by headers, but surely there's a better way.
Example mail with two .png inline attachments and two .txt attachments.
Here's a sample:
from email import policy
from email.parser import BytesParser
from email.iterators import _structure

# eml_path is the path to the .eml file being processed
with open(eml_path, 'rb') as fp:
    msg = BytesParser(policy=policy.SMTP).parse(fp)

_structure(msg)  # prints the MIME tree itself and returns None

for part in msg.walk():
    cd = part.get_content_disposition()
    if cd is not None:
        part.clear()

_structure(msg)
Structure of original mail:
multipart/mixed
    multipart/alternative
        text/plain
        multipart/related
            text/html
            image/png
            image/png
    text/plain
    text/plain
Structure after removing attachments:
multipart/mixed
    multipart/alternative
        text/plain
        multipart/related
            text/html
            text/plain
            text/plain
    text/plain
    text/plain
The last 4 parts are left empty, but I want to remove them.
This causes some graphical issues in Thunderbird and Gmail, from what I've tried. Once I remove the lingering boundary lines, they display correctly.
I think you need to call set_payload() to modify the structure:
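A minimal sketch of that idea, not necessarily the exact code intended here: for each multipart container, rebuild its list of subparts without the ones that carry a Content-Disposition header, then hand the filtered list back with set_payload(). The helper name strip_attachments is hypothetical, and the sketch assumes that every part with a Content-Disposition (both "attachment" and "inline") should be dropped, as in the question's loop.

```python
from email.message import EmailMessage


def strip_attachments(part):
    """Recursively drop subparts that have a Content-Disposition header.

    Hypothetical helper: it mutates `part` in place and returns None.
    """
    if not part.is_multipart():
        return  # leaf part: nothing to filter
    kept = []
    for sub in part.get_payload():  # list of subparts for a multipart
        if sub.get_content_disposition() is not None:
            continue  # skip 'attachment' and 'inline' parts entirely
        strip_attachments(sub)  # clean nested containers first
        kept.append(sub)
    # Replacing the payload list removes the parts and their boundaries
    part.set_payload(kept)
```

Calling strip_attachments(msg) on the parsed message in place of the clear() loop should leave no empty parts behind, because the dropped parts are no longer in any container's payload. Note that removing inline images may leave dangling cid: references in the HTML body.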