Parse mailto urls in Python

Question

2.9k Views Asked by Yarin At 30 January 2012 at 16:59

I'm trying to parse mailto URLs into a nice object or dictionary that includes the subject, body, etc. I can't seem to find a library or class that does this. Do you know of any?

mailto:[email protected]?subject=mysubject&body=mybody

There are 8 best solutions below

Ferdinand Beyer On 30 January 2012 at 17:10

Batteries included: urlparse.

Alien Life Form On 30 January 2012 at 17:11

The core urlparse lib does less than a stellar job on mailtos, but gets you halfway there:

In [3]: from urlparse import urlparse

In [4]: urlparse("mailto:[email protected]?subject=mysubject&body=mybody")
Out[4]: ParseResult(scheme='mailto', netloc='', path='[email protected]?subject=mysubject&body=mybody', params='', query='', fragment='')

EDIT

A little research unearths this thread. Bottom line: Python URL parsing sucks.
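
For what it's worth, the ParseResult above appears to reflect the interpreter of the time. Under Python 3's urllib.parse, the query of a mailto URL is split off as for any other scheme, so parse_qs can do the second half. A minimal sketch, assuming Python 3 and a made-up address (someone@example.com) standing in for the obfuscated one:

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical address; the one in the original post is not recoverable.
url = "mailto:someone@example.com?subject=mysubject&body=mybody"

parts = urlsplit(url)
# The address ends up in .path and the header fields in .query.
fields = parse_qs(parts.query)

print(parts.scheme, parts.path)
print(fields)
```

Note that parse_qs returns a list for every field, because a name may legally appear more than once.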

jfs On 30 January 2012 at 17:32

import urllib

query = 'mailto:[email protected]?subject=mysubject&body=mybody'.partition('?')[2]
print dict((urllib.unquote(s).decode('utf-8') for s in pair.partition('=')[::2])
           for pair in query.split('&'))
# -> {u'body': u'mybody', u'subject': u'mysubject'}
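
The snippet above is Python 2 (print statement, urllib.unquote, .decode). A rough Python 3 equivalent of the same partition-based idea might look like the following; query_fields is a name chosen here for illustration, and the address is a placeholder:

```python
from urllib.parse import unquote


def query_fields(mailto_url):
    """Return the header fields after '?' as a dict of decoded strings."""
    # Everything after the first '?' is the query; empty if there is none.
    query = mailto_url.partition("?")[2]
    fields = {}
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        # partition() splits on the first '=' only, so '=' may appear in values.
        name, _, value = pair.partition("=")
        fields[unquote(name)] = unquote(value)
    return fields


print(query_fields("mailto:someone@example.com?subject=my%20subject&body=mybody"))
```

Unlike parse_qs, this keeps only the last value when a field is repeated, which matches the behaviour of the original one-liner.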

CoffeeRain On 30 January 2012 at 17:34

Here is a solution using the re module...

import re

def parse_mailto(a):
  # Capture groups replace the hard-coded slice offsets, which only
  # worked for an address of one particular length.
  m=re.match(r'mailto:([^?]+)\?subject=([^&]*)&body=(.*)', a)
  d={}
  d['email']=m.group(1)
  d['subject']=m.group(2)
  d['body']=m.group(3)
  return d

This assumes it is in the same format as you posted. You may need to make modifications to better fit your needs.
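
One possible modification, sketched under the assumption that the fields may appear in any order or be absent: match the address with one pattern and collect name=value pairs with re.findall. The function name parse_mailto_re and the sample address are illustrative only, and percent-decoding is left out for brevity.

```python
import re

# Address: everything between 'mailto:' and the optional '?'.
ADDRESS_RE = re.compile(r"^mailto:([^?]*)")
# Field: a name and a value separated by '=', introduced by '?' or '&'.
FIELD_RE = re.compile(r"[?&]([^=&]+)=([^&]*)")


def parse_mailto_re(url):
    match = ADDRESS_RE.match(url)
    if match is None:
        raise ValueError("not a mailto URL: %r" % url)
    result = {"email": match.group(1)}
    # findall returns (name, value) tuples for every field in the query.
    result.update(FIELD_RE.findall(url))
    return result


print(parse_mailto_re("mailto:someone@example.com?body=mybody&subject=mysubject"))
```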

Vitold S. On 14 May 2015 at 20:17

You should use a dedicated library such as:

https://pypi.python.org/pypi/urlinfo

and contribute or file issues to make Python better ;)

P.S. Don't use Robert Peters' solution, because it is a hack and does not work properly. Also, using a regular expression here is like using a BFG to shoot a small bird.

Alexander Holmbäck On 30 July 2015 at 20:56

You can use urlparse and parse_qs to parse URLs with mailto as the scheme. Be aware, though, that according to the scheme definition:

mailto:[email protected],[email protected]?subject=mysubject

is identical to

mailto:[email protected]&[email protected]&subject=mysubject

Here's an example:

from urlparse import urlparse, parse_qs
from email.message import Message

url = 'mailto:[email protected]?subject=mysubject&body=mybody&[email protected]'
msg = Message()
parsed_url = urlparse(url)

header = parse_qs(parsed_url.query)
header['to'] = header.get('to', []) + parsed_url.path.split(',')

for k,v in header.iteritems():
    msg[k] = ', '.join(v)

print msg.as_string()

# Will print:
# body: mybody
# to: [email protected], [email protected]
# subject: mysubject
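
The equivalence mentioned above (recipients in the path versus a to field in the query) can be checked with a small Python 3 sketch. The helper name recipients and the example.com addresses are assumptions made for illustration. Note the filter for an empty path, which str.split would otherwise turn into an empty recipient.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def recipients(mailto_url):
    """Collect addresses from both the path and any 'to' query fields."""
    parts = urlsplit(mailto_url)
    # Split on ',' before decoding so an encoded comma stays inside an address.
    found = [unquote(a) for a in parts.path.split(",") if a]
    for value in parse_qs(parts.query).get("to", []):
        found.extend(a.strip() for a in value.split(",") if a.strip())
    return found


a = recipients("mailto:alice@example.com,bob@example.com?subject=mysubject")
b = recipients("mailto:?to=alice@example.com,bob@example.com&subject=mysubject")
print(a == b)
```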

Louis Maddox On 23 February 2023 at 18:13

I like Alexander's answer, but it is in Python 2! We now get urlparse() and parse_qs() from urllib.parse. Also note that sorting the header in reverse puts it in the order: to, subject, body.

from email.message import Message
from pathlib import Path
from urllib.parse import parse_qs, urlparse

url = Path("link.txt").read_text()
msg = Message()
parsed_url = urlparse(url)
header = parse_qs(parsed_url.query)
header["to"] = header.get("to", []) + parsed_url.path.split(",")

for k, v in sorted(header.items(), reverse=True):
    print(f"{k}:", v[0])

I am just using this as a one-off. When I used msg.as_string() I got some strange results, so I just printed the strings directly. Each value is a list of one value, so I access the 0th entry to get a string.
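
One plausible reason msg.as_string() looks odd is that the loop in the earlier answer stores body as a header rather than as the message payload. A hedged sketch of an alternative, assuming a reasonably recent Python 3 and the EmailMessage API; mailto_to_message is a hypothetical helper and the address is a placeholder:

```python
from email.message import EmailMessage
from urllib.parse import parse_qs, unquote, urlsplit


def mailto_to_message(mailto_url):
    # strip() drops a trailing newline if the URL was read from a file.
    parts = urlsplit(mailto_url.strip())
    fields = parse_qs(parts.query)
    to = [unquote(a) for a in parts.path.split(",") if a] + fields.get("to", [])

    msg = EmailMessage()
    msg["To"] = ", ".join(to)
    if "subject" in fields:
        msg["Subject"] = fields["subject"][0]
    # The body becomes the payload instead of a 'body:' header.
    msg.set_content(fields.get("body", [""])[0])
    return msg


msg = mailto_to_message("mailto:alice@example.com?subject=mysubject&body=mybody\n")
print(msg.as_string())
```

One caveat: parse_qs follows form-encoding rules and turns '+' into a space, which may not be what a given mailto URL intends.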

Robert Peters On 30 January 2012 at 17:05 (Accepted Answer)

Seems like you might just want to write your own function to do this.

Edit: Here is a sample function (written by a python noob).

Edit 2: cleanup due to feedback:

from urllib import unquote
test_mailto = 'mailto:[email protected]?subject=mysubject&body=mybody'

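def parse_mailto(mailto):
   result = dict()
   # Drop the 'mailto:' scheme, then separate the address from the query.
   colon_split = mailto.split(':',1)
   quest_split = colon_split[1].split('?',1)
   result['email'] = quest_split[0]

   # The query is optional, so only parse it when a '?' was present.
   if len(quest_split) > 1:
      for pair in quest_split[1].split('&'):
         # partition() keeps any '=' inside the value and tolerates a missing '='.
         name, _, value = pair.partition('=')
         result[unquote(name)] = unquote(value)

   return result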

print parse_mailto(test_mailto)
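
Since the question asks for a "nice object", a Python 3 take on a hand-written parser could return a small dataclass instead of a dict. This is only a sketch: the Mailto class, its field names, and the sample addresses are invented here, and only subject and body get dedicated attributes.

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qsl, unquote, urlsplit


@dataclass
class Mailto:
    to: list = field(default_factory=list)
    subject: str = ""
    body: str = ""
    other: dict = field(default_factory=dict)  # any remaining header fields


def parse_mailto(url):
    parts = urlsplit(url)
    if parts.scheme != "mailto":
        raise ValueError("expected a mailto: URL")
    result = Mailto(to=[unquote(a) for a in parts.path.split(",") if a])
    # parse_qsl yields decoded (name, value) pairs in their original order.
    for name, value in parse_qsl(parts.query):
        key = name.lower()
        if key == "to":
            result.to.extend(a for a in value.split(",") if a)
        elif key in ("subject", "body"):
            setattr(result, key, value)
        else:
            result.other[key] = value
    return result


m = parse_mailto("mailto:someone@example.com?subject=mysubject&body=mybody&cc=boss@example.com")
print(m)
```

Attribute access (m.subject, m.body) then replaces dictionary lookups, and unknown fields are still available through m.other.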
