Extracting German verb conjugations from Wiktionary XML dump templates

I would like to build a normalized list of German verb conjugations, starting from the Wiktionary XML dump.

I think I can manage to parse the XML dump, but I don't understand how Wiktionary expands a Flexion template into a normalized display such as https://de.wiktionary.org/wiki/Flexion:lesen

which seems to be expanded from:

{{Deutsch Verb unregelmäßig|2=les|3=las|4=läs|5=gelesen|6=lies|7=-s|8=i|vp=ja|zp=nein|gerund=ja}}
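
To make the starting point concrete, here is a minimal Python sketch of the parsing step only: it splits that template call into its name and parameters. The function name `parse_template` is hypothetical, and the sketch assumes a flat call with no nested templates or wiki links containing `|`. It does not attempt the expansion into conjugated forms, which is the part I am asking about.

```python
import re


def parse_template(call: str) -> tuple[str, dict[str, str]]:
    """Split a flat {{name|key=value|...}} call into its name and parameters.

    Assumption: no nested templates or links, so splitting on "|" is safe.
    """
    match = re.fullmatch(r"\{\{(.*)\}\}", call.strip(), flags=re.DOTALL)
    if match is None:
        raise ValueError("not a template call")

    name, *parts = match.group(1).split("|")
    params: dict[str, str] = {}
    positional = 1
    for part in parts:
        key, sep, value = part.partition("=")
        if sep:
            # Named (or explicitly numbered) parameter, e.g. "2=les".
            params[key.strip()] = value.strip()
        else:
            # Unnamed parameters are numbered from 1.
            params[str(positional)] = part.strip()
            positional += 1
    return name.strip(), params


call = (
    "{{Deutsch Verb unregelmäßig|2=les|3=las|4=läs|5=gelesen"
    "|6=lies|7=-s|8=i|vp=ja|zp=nein|gerund=ja}}"
)
name, params = parse_template(call)
print(name)
print(params)
```

This only recovers the raw parameters; turning them into the full table shown on the Flexion page still requires the template's own logic.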

Pointers to the code that performs this expansion would be hugely appreciated. I found a number of XML parsers for Wiktionary on GitHub, but none seem to cover verb conjugations, and other questions about Wiktionary don't seem to cover this either.

Many thanks in advance
