How to turn HTML into Markdown with Python with support for footnotes?


I'm using Aaron Swartz's HTML2Text.py script to turn HTML into Markdown in my web app. However, it doesn't support footnotes (the <sup> tag is removed from the output). I want to add footnote support to it, but I can't figure out how.

I tried this code, but it doesn't seem to work (I added self.sup = 0 at the start of the script):

    if tag == "sup":
        if start:
            self.p()
            self.o('[^] ', 0, 1)
            self.start = 1
            self.sup += 1
        else:
            self.sup -= 1
            self.p()

and also just:

if tag == "sup":
    self.sup()

The issue is that the <sup> tags are removed entirely, the <li> tags lose their id, and the <a> tags lose their rel, so I end up with non-working links.

Can anyone help me add support for the <sup> tag and footnotes in this script?

The script is available here (It's too long to post here). I'm using Python 2.7.9.

Thanks :)

UPDATE: with this code:

    if tag == "sup" and start:
        if has_key(attrs, 'id'):
            id = attrs.get('id', '').replace("fnref:", "")
            self.o("[^" + escape_md(id) + "]")

It renders the <sup> as a footnote marker, but it doesn't include the <a> inside it or connect it to the <li> at the bottom that holds the actual footnote.
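
For reference, here is a minimal standalone sketch of the behaviour I'm after. It is not a patch for HTML2Text.py; it only uses the standard library's HTMLParser to show the state that seems to be needed. It assumes the footnote markup uses id="fnref:N" on the <sup>, id="fn:N" on the <li>, and a back-link whose href starts with "#fnref:". The fn: prefix and the back-link are my assumptions about the input, and the names FootnoteSketch and footnotes_to_markdown are made up for this example. The Python 2 import is only a fallback; there, character references such as &amp; are dropped rather than decoded.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7 fallback


class FootnoteSketch(HTMLParser):
    """Toy converter: handles only footnote references and definitions."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.out = []        # collected output fragments
        self.in_sup = False  # True while inside a footnote-reference <sup>
        self.skip_a = False  # True while inside a back-reference <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        ident = attrs.get("id") or ""
        href = attrs.get("href") or ""
        if tag == "sup" and ident.startswith("fnref:"):
            # Emit the marker once and swallow the nested <a>1</a>.
            self.out.append("[^" + ident[len("fnref:"):] + "]")
            self.in_sup = True
        elif tag == "li" and ident.startswith("fn:"):
            # Start a footnote definition in its own paragraph.
            self.out.append("\n\n[^" + ident[len("fn:"):] + "]: ")
        elif tag == "a" and href.startswith("#fnref:"):
            # Back-reference link inside the footnote body; drop it.
            self.skip_a = True

    def handle_endtag(self, tag):
        if tag == "sup":
            self.in_sup = False
        elif tag == "a":
            self.skip_a = False

    def handle_data(self, data):
        # Keep text unless it belongs to a swallowed element.
        if not (self.in_sup or self.skip_a):
            self.out.append(data)


def footnotes_to_markdown(html):
    parser = FootnoteSketch()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out).strip()


sample = (
    '<p>Text<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>.</p>'
    '<ol><li id="fn:1">Note. <a href="#fnref:1" rev="footnote">back</a></li></ol>'
)
print(footnotes_to_markdown(sample))
```

The point is that the marker is written when the <sup> opens and everything inside it is suppressed until it closes, while the <li> id becomes the matching "[^N]: " label. Presumably the same two flags would have to be tracked inside the script's own tag handler.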
