Parsoid - parse wikitext locally

Is that even possible?

I am not sure I understand the project properly. I am trying to parse a large amount of wikitext into HTML using the Parsoid-JSAPI project.

Parsing works fine, but it still calls the Wikimedia API. I have run the server locally, but the library still uses the public internet API instead of my local server. If I try to specify the domain by calling Parsoid.parse("wikitext", {domain: 'localhost'}), it fails with: No API URI available for prefix: null; domain: localhost

My config.yaml:

mwApis:
    uri: 'http://localhost/w/api.php'
    domain: 'localhost'

There is 1 best solution below

Parsing wikitext is possible, sure; that's what Parsoid does. Parsing Wikipedia content without API calls is not possible, because 1) templates and other transcluded content need to be resolved, and 2) some of the markup is handled by extensions, and Parsoid defers to them.

You can set up a local MediaWiki instance, install all the required extensions, and import all the relevant pages (there is an "include templates" option when exporting content), but it's a lot of effort.
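
Once such a local wiki is running, Parsoid still has to be pointed at its API. Here is a rough sketch of the relevant part of config.yaml, assuming the layout of the config.example.yaml that ships with the JavaScript Parsoid service, where mwApis is a list nested under the service's conf key. The uri and domain values are the ones from the question. Whether the Parsoid-JSAPI library reads this file at all, rather than only the standalone server, is not confirmed here.

```yaml
services:
  - module: lib/index.js
    entrypoint: apiServiceWorker
    conf:
        # Each list entry describes one MediaWiki instance Parsoid may talk to.
        mwApis:
        - # URL of the local MediaWiki API endpoint (value taken from the question).
          uri: 'http://localhost/w/api.php'
          # Name used to select this wiki; assumed to default to the uri's hostname.
          domain: 'localhost'
```

Note the leading dash before uri: in this layout each wiki is a list item, whereas the config in the question puts uri and domain directly under mwApis.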