Using http-conduit
I want to download the raw wikimedia markup for any page, for example the Wikipedia page Stack Overflow.
Also, I'd like the solution to be applicable to wikimedia pages other than en.wikipedia.org, for example de.wikibooks.org.
Note: This question was immediately answered in Q&A form and therefore intentionally does not show research effort!
This question uses query parameters in http-conduit as described in this previous SO answer.
We will use the method described here on SO to download the markup content of a page.
Although this task should also be possible using the MediaWiki API, it seems significantly simpler to use the ?action=raw method without explicitly using the API.

In order to support different sites (e.g. en.wikimedia.org), I wrote two functions, getWikipediaPageMarkup and getEnwikiPageMarkup, the former being more general and allowing custom domains (any domain should work, assuming MediaWiki is installed under /wiki).

Note that a recent http-conduit version is required (minimum: 2.1, tested with 2.1.4) in order to compile the code.
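
The original listing is not included here, so the following is only a minimal sketch of how the two functions could look, not the exact code from the answer. It assumes a newer http-conduit release than the 2.1.4 mentioned above, namely one that exports parseRequest and tlsManagerSettings from Network.HTTP.Conduit. The helper wikiPageUrl, the https scheme, and the User-Agent string are assumptions added for illustration, and page titles are assumed to be URL-safe already (underscores instead of spaces).

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Lazy.Char8 as LB
import Network.HTTP.Conduit
  ( httpLbs, newManager, parseRequest, requestHeaders
  , responseBody, setQueryString, tlsManagerSettings )

-- | Hypothetical helper: URL of a page (without query string) on a
-- MediaWiki site that serves its pages under /wiki.
wikiPageUrl :: String -> String -> String
wikiPageUrl domain page = "https://" ++ domain ++ "/wiki/" ++ page

-- | Fetch the raw markup of a page from an arbitrary MediaWiki domain
-- by requesting the page with the query parameter action=raw.
getWikipediaPageMarkup :: String -> String -> IO LB.ByteString
getWikipediaPageMarkup domain page = do
  request <- parseRequest (wikiPageUrl domain page)
  let request' = setQueryString [("action", Just "raw")] request
        -- Placeholder User-Agent (an assumption, replace with your own).
        { requestHeaders = [("User-Agent", "wiki-markup-example/0.1")] }
  manager <- newManager tlsManagerSettings
  response <- httpLbs request' manager
  return (responseBody response)

-- | Convenience wrapper for the English Wikipedia.
getEnwikiPageMarkup :: String -> IO LB.ByteString
getEnwikiPageMarkup = getWikipediaPageMarkup "en.wikipedia.org"

main :: IO ()
main = getEnwikiPageMarkup "Stack_Overflow" >>= LB.putStrLn
```

For another site, the general function would be called with its domain, for example getWikipediaPageMarkup "de.wikibooks.org" "Hauptseite". Creating a new manager on every call keeps the sketch short; a program that fetches many pages would probably want to create one manager and reuse it.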