English Wiktionary API: declension table missing in the output

Yet another English Wiktionary parsing question.

Overall, I am prepared to parse the wikitext format, so the standard API works for me.

The trouble is that I want to use the English Wiktionary API to obtain the declension tables, but for some reason the tables are referenced by template codes. Sometimes the full table is in the output, but in most cases it is missing. For example, a call for a Russian word such as http://en.wiktionary.org/w/api.php?format=xml&action=query&titles=крот&rvprop=content&prop=revisions&redirects=1 yields:

====Declension====
{{ru-noun-table|b|a=an}}

How do I convert it into a full declension table?
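To make the structure of such a code concrete, here is a minimal sketch that splits a flat template call into its name, positional arguments, and named arguments. The function name `parse_template_call` is made up for illustration, and the sketch assumes there are no nested templates or links inside the call; it only reads the call and does not produce the table.

```python
def parse_template_call(call):
    """Split a flat call like '{{name|x|k=v}}' into (name, positional, named).

    Hypothetical helper: assumes no nested {{...}} or [[...|...]] inside.
    """
    inner = call.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call: " + call)
    # Drop the surrounding braces and split on the argument separator.
    parts = [p.strip() for p in inner[2:-2].split("|")]
    name, positional, named = parts[0], [], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:
            named[key.strip()] = value.strip()
        else:
            positional.append(part)
    return name, positional, named


print(parse_template_call("{{ru-noun-table|b|a=an}}"))
# ('ru-noun-table', ['b'], {'a': 'an'})
```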

I tried a number of parameters from https://www.mediawiki.org/wiki/API:Query, with no result.

One workaround I found is to use the new Wiktionary RESTful API, like this: https://en.wiktionary.org/api/rest_v1/page/html/крот (reference: https://en.wiktionary.org/api/rest_v1/#/). But it only returns HTML, which is more difficult to parse!
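For what it is worth, pulling tables out of returned HTML does not necessarily need a third-party library. The following is a rough sketch using only Python's standard `html.parser`; the class name `TableExtractor` and the function `extract_tables` are invented for this example, and it assumes the tables are not nested and that plain cell text is all that is wanted. The real Wiktionary markup may need extra handling.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def extract_tables(html):
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Calling `extract_tables(page_html)` on the HTML of a page returns one list of rows per table, so the declension table would still have to be picked out from among the others, for example by looking at its header cells.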

Is that the best that can be done?

Is there perhaps a special call for the declension tables? If the table gets generated, there has to be a way.

There is 1 best solution below

The table is generated by a Wiktionary module, namely Module:ru-noun, which is a Lua script. It works like a regular MediaWiki template call: the script is invoked with the parameters (b, a=an) and has access to the page name (крот).
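Because the module needs both the parameters and the page name, one way to have the server run it is to send the template call together with a title to the standard MediaWiki `action=parse` endpoint. This is not something stated above; it is a sketch under the assumption that `action=parse` with `text`, `title`, and `contentmodel=wikitext` renders the call as it would appear on that page. The helper names and the User-Agent string are placeholders, and the result is still HTML rather than structured data.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wiktionary.org/w/api.php"


def build_parse_params(template_call, page_title):
    """Query parameters asking action=parse to render one template call."""
    return {
        "action": "parse",
        "format": "json",
        "formatversion": "2",
        "contentmodel": "wikitext",
        "prop": "text",
        "title": page_title,    # page-name context for the module
        "text": template_call,  # e.g. "{{ru-noun-table|b|a=an}}"
    }


def render_template(template_call, page_title):
    """Fetch the rendered HTML for the call (performs a network request)."""
    query = urllib.parse.urlencode(build_parse_params(template_call, page_title))
    request = urllib.request.Request(
        API_URL + "?" + query,
        # Placeholder identification string; replace with your own.
        headers={"User-Agent": "declension-example/0.1 (illustrative)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # With formatversion=2 the rendered HTML is expected as a plain string here.
    return data["parse"]["text"]
```

A call such as `render_template("{{ru-noun-table|b|a=an}}", "крот")` would then return only the HTML of the declension table instead of the whole page, which keeps the parsing step small.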

See "Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary" for the rationale behind this, and then the resulting Dictionary builder project.