After reading through the Hunspell docs, I started looking at what seems to be the most advanced set of Hunspell dictionary files, and the Hungarian one (Hun-garian Spell) appears to be the most robust.
I have a few questions that the 17-page PDF docs leave unanswered (they appear to be the only real resource on Hunspell, other than the source code).
1. The meaning of the decimal numbers?
For example, the number 1547. We see it here:
AF @ # 1547
And it is used in PFX but not SFX:
PFX r 0 legújra/1547 . 24583
PFX r 0 legújjá/1547 . 24584
PFX r 0 legössze/1547 . 24585
PFX r 0 legát/1547 . 24586
PFX r 0 legáltal/1547 . 24587
PFX r 0 legvégig/1547 . 24588
PFX r 0 legvégbe/1547 . 24589
...
The thing after the slash is a flag, as far as I've learned, but where is that flag defined? In the line AF @ # 1547, the 1547 is only a comment, so I'm not sure. Looking further at AF, it appears the first line, AF 1548, means that 1548 AF values follow, and AF @ is the second to last one in the list, so maybe that's it?!
So then what does the @ symbol mean with regard to AF, which the docs describe as:
Hunspell can substitute affix flag sets with ordinal numbers in affix rules (alias compression, see makealias tool).
I'm not following....
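If the guess above holds (the AF lines form a numbered list, and /1547 picks the 1547th entry), the lookup could be sketched like this in Python. This is only an illustration: the three-line AF table is made up, and parse_af_table and resolve_flags are hypothetical helper names, not part of Hunspell.

```python
def parse_af_table(aff_text):
    """Collect the AF flag sets in file order (list index 0 holds alias 1)."""
    aliases = []
    count = None
    for line in aff_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "AF":
            continue  # not an AF line
        if count is None:
            count = int(fields[1])  # first AF line: how many aliases follow
            continue
        # Only the second field is the flag set; a trailing "# 3" is ignored.
        aliases.append(fields[1])
    if count is not None and count != len(aliases):
        raise ValueError(f"AF header says {count}, found {len(aliases)}")
    return aliases


def resolve_flags(aliases, ordinal):
    """Turn a 1-based ordinal such as the 3 in 'leg/3' into its flag set."""
    return aliases[ordinal - 1]


# Made-up miniature table, shaped like the Hungarian file's AF section.
SAMPLE_AFF = """\
AF 3
AF AB # 1
AF A # 2
AF @ # 3
PFX r 0 leg/3 .
"""

table = parse_af_table(SAMPLE_AFF)
print(resolve_flags(table, 3))  # @
```

Under that reading, the @ would simply be the flag set that the number stands for, and the number after # is a human-readable position marker.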
2. The meaning of the last decimal numbers on PFX?
Like we have from above:
PFX r 0 legát/1547 . 24586
That is the only place 24586 appears in the .aff file. So what does it mean? Same for all the numbers in that position. Line #24586 in the .dic file doesn't seem related either:
lódenkabát/39   1
3. What does the /number mean in the .dic file?
Regarding that last example:
lódenkabát/39   1
What do the /39 and the 1 mean? Where are those defined? I would have expected to find a PFX 39 or SFX 39 defined in the .aff file, but I don't see one.
                        
Learned more by looking at the tests around alias2.aff (and other alias2 files):
Files
alias2.aff:
alias2.dic:
alias2.good:
alias2.morph:
Explanation
Explaining the AM
Stands for "morphological alias"?
So this is saying we are dealing with line numbers relative to where the AM and AF lists start! That is crazy to me, so brittle. But anyways....
That 1 is referring to AM morphological_fields (from the docs). So it is marking this suffix as AM 1, which is the first AM: is:affix_x. That corresponds to our alias2.morph file, where it shows:
Notice the is:affix_x.
Now, foox has more. This is because in the .dic file, it says:
That 3 is pointing to another AM, which is the last one.
So that gives us all three of the AMs shown in the alias2.morph:
Explaining the AF
Stands for "affix flag".
The /1 here in the .dic references the AF position:
And the /2 in the .aff does as well:
So for the y/2, that is saying that suffix x can come after suffix y, since 2 links to AF 2, which is AF A, which links to SFX A, which is the x suffix. (This matches the twofold suffix example in the docs, where a flag attached to a suffix names what may follow it.)
I'm a bit confused at foo/1, which is an alias to foo/AB. Couldn't you just write foo/A and have it allow foo/AB because of the y/2 definition? No: foo/1 (that is, foo/AB) must be saying that both foo/A and foo/B are allowed directly on the stem, and the /2 on y additionally allows x after y, as per the SFX B definition. That must be it.
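A minimal sketch of how the flag aliases and the continuation flag could combine, in Python. The tables are re-typed from the description above rather than copied from the test files, and expand and its data layout are hypothetical. It assumes the reading from the twofold suffix example in the Hunspell manual, where a flag attached to a suffix names what may follow that suffix.

```python
# AF 1 -> "AB", AF 2 -> "A" (1-based ordinals, as described above).
AF = ["AB", "A"]

# flag -> list of (suffix text, AF ordinal of its continuation flags or None)
SFX = {
    "A": [("x", None)],  # SFX A: plain suffix x
    "B": [("y", 2)],     # SFX B: suffix y/2, i.e. y carrying flag set AF 2
}


def flags_for(ordinal):
    """Resolve a 1-based AF ordinal to its flag string ('' for None)."""
    return AF[ordinal - 1] if ordinal else ""


def expand(stem, flag_ordinal):
    """List the stem plus every form reachable with at most two suffixes."""
    forms = [stem]
    for flag in flags_for(flag_ordinal):
        for suffix, continuation in SFX.get(flag, []):
            first = stem + suffix
            forms.append(first)
            # Second layer: suffixes named by the first suffix's own flags.
            for flag2 in flags_for(continuation):
                for suffix2, _ in SFX.get(flag2, []):
                    forms.append(first + suffix2)
    return forms


print(expand("foo", 1))  # foo/1 = foo/AB -> ['foo', 'foox', 'fooy', 'fooyx']
print(expand("foo", 2))  # foo/2 = foo/A  -> ['foo', 'foox']
```

In this sketch foo/A alone never reaches fooy, which is why the entry needs both flags.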