I have these two files that run perfectly fine within the GF shell.
My GF code
Test.gf
abstract Test = {
  cat
    Sentence; Noun;
  fun
    MySentence : Noun -> Sentence;
}
TestEng.gf
concrete TestEng of Test = open SyntaxEng, ParadigmsEng, DictEng in {
  lincat
    Sentence = NP;
    Noun = N;
  lin
    MySentence noun = mkNP (aSg_Det) (noun);
}
The way I run them in the GF shell is as follows:
> i -retain TestEng.gf
> cc -one MySentence dog_N
a dog
Which gives the expected result.
My PGF code
Then I compiled the grammar into the .pgf format using the following command on Linux:
> gf -make --output-format=haskell TestEng.gf
linking ... OK
Writing Test.pgf...
Writing Test.hs...
which outputs these two files, Test.hs and Test.pgf.
Question
My Python code
test.py
import pgf
gr = pgf.readPGF("Test.pgf")
e = pgf.readExpr("MySentence dog_N")
print(gr.languages.keys()) #To check all languages
eng = gr.languages["TestEng"]
print(eng.linearize(e))
When I run the above code I get the following output:
> python3 test.py
dict_keys(['TestEng'])
a [dog_N]
Why does Python output a [dog_N] and not a dog?
I will first give you three alternatives for how to make the grammar work. Then I will explain the rest of the mysteries: why cc works with your initial approach but parsing/linearisation doesn't, and also how to actually use cc from Python (just not with the PGF library).

1. Fixing your grammar
(a) Large lexicon, application grammar as a layer on top of RGL
In your example, you are opening DictEng, so I assume that you would like your application to have a large lexicon. If you want to be able to parse with a large lexicon, it needs to be part of the abstract syntax of your grammar. The first mistake is that you're opening DictEng as a resource instead of extending it. (See the tutorial to refresh your memory.)
So if you want your abstract syntax to contain a lexicon entry called dog_N, which you can give as an argument to the function MySentence, you will need to modify your grammar as follows.

Abstract:
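(A sketch of what it could look like; I'm assuming the RGL abstract modules Cat and DictEngAbs are on your library path.)

abstract Test = Cat, DictEngAbs ** {
  fun
    MySentence : N -> NP ;
}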
Concrete:
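(The matching sketch for the concrete: it inherits CatEng and DictEng and uses the resource API via SyntaxEng.)

concrete TestEng of Test = CatEng, DictEng ** open SyntaxEng in {
  lin
    MySentence noun = mkNP aSg_Det noun ;
}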
In this solution, I'm keeping the constraint that dog_N has to be correct, and changing everything else. So the changes are:
- Remove your custom cats (Noun and Sentence); instead, inherit the Cat module from the RGL abstract syntax.
- MySentence works now on the RGL cats N and NP. In your original approach, these were the lincats of your custom cats.
So this grammar is an extension of a fragment of the RGL. In particular, we are reusing RGL types and lexicon, but none of the syntactic functions.
(In fact, we are also using RGL syntactic functions, but via the API, not via extending the RGL abstract syntax! The mkNP oper comes from the RGL API, and we have it in scope because we open SyntaxEng in the concrete syntax.)

(b) Small lexicon, pure application grammar, RGL is only a resource
Here I decide to keep your custom cats and their lincats. This means that I need to add lexicon explicitly. Like this:
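(A sketch; dog_N and cat_N are just two example entries, and I'm assuming the same RGL setup as above.)

Abstract:

abstract Test = {
  cat
    Sentence ; Noun ;
  fun
    MySentence : Noun -> Sentence ;
    dog_N, cat_N : Noun ;
}

Concrete:

concrete TestEng of Test = open SyntaxEng, DictEng in {
  lincat
    Sentence = NP ;
    Noun = N ;
  lin
    MySentence noun = mkNP aSg_Det noun ;
    dog_N = DictEng.dog_N ;
    cat_N = DictEng.cat_N ;
}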
If I don't extend DictEngAbs like in the previous approach, and I want to have something in scope that is called dog_N, I must create it myself. In order to be able to parse or linearise anything, it must be in the abstract syntax. So in the concrete, we are opening DictEng again, and using it to linearise the lexical items of this abstract syntax.
Of course, if you want a large lexicon, this is not so useful. But if you actually don't care about a large lexicon, this results in the simplest and smallest grammar. The RGL is just used as a resource, strictly via the API. As we can see, this grammar just opens SyntaxEng and DictEng; all the cats and funs it has are defined in the abstract. Nothing hidden, nothing surprising, no bulk. Also, no coverage: this grammar can literally just say "a dog" and "a cat".
(c) Large lexicon, extend RGL but keep your custom cats too
This is effectively the same as solution (a), but I'm just showing how to extend the RGL fragment and keep your custom cats, if you wanted to do that.
Here's the abstract syntax.
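(Again a sketch, with the same RGL modules as in solution (a).)

abstract Test = Cat, DictEngAbs ** {
  cat
    Sentence ; Noun ;
  fun
    MySentence : Noun -> Sentence ;
    n2noun : N -> Noun ;
}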
We start again by extending Cat and DictEngAbs. But we also define our own cats. The reason why it works is our coercion function, n2noun : N -> Noun, which turns an N from DictEngAbs into a Noun, because MySentence only accepts Nouns.
So n2noun does the conversion for us. Here's the concrete syntax:
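(A sketch with the same assumptions as in solution (a).)

concrete TestEng of Test = CatEng, DictEng ** open SyntaxEng in {
  lincat
    Sentence = NP ;
    Noun = N ;
  lin
    MySentence noun = mkNP aSg_Det noun ;
    n2noun n = n ;
}

The syntax trees look like this now:

MySentence (n2noun hairstylist_N)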
If you prefer the syntax trees shorter, just MySentence hairstylist_N, then go for solution (a).
I can't think of concrete benefits compared to (a) for such a small example, but in a larger system, it can be useful for adding restrictions. For instance, suppose you added more Nouns from another source and didn't want to give them as arguments to other functions, while the RGL Ns can be arguments to all functions; then it's useful to have a separation of cats with coercion functions.

2. Remaining questions
I touched on some things already in my three alternative solutions, but there are still issues I didn't explain. Here are the rest of the mysteries.
Why did your first approach work in the GF shell but not in Python?
Because you didn't try to parse and linearise; instead you used cc (compute concrete) with the -retain flag. When you open a grammar in the GF shell with -retain, it keeps all local, temporary helper stuff in scope, and this includes modules that you open. So dog_N from DictEng was in scope, but only for cc in the GF shell.
Did you try to parse and linearise in the GF shell? If you try, you will already run into failure there:
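Something along these lines (I'm describing the failures rather than quoting the exact messages, which vary between GF versions):

> i TestEng.gf
> l MySentence dog_N
-- fails, because dog_N is not a function of the abstract syntax Test
> p "a dog"
-- fails, because the abstract syntax has no lexical entries that could cover "dog"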
In contrast to cc, parsing and linearisation cannot depend on local definitions. It has to be in the abstract syntax, otherwise it doesn't exist. And if you want to access a grammar from Python using the PGF library, then the grammar must be compiled into the PGF format. And the PGF format doesn't retain local definitions.

Actually using cc from Python

Technically, you can use cc from Python, but not using the PGF library. It works if you open a GF shell as a subprocess and give it the uncompiled GF file as input. This works; I put it in a file called test.py:
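(A minimal sketch using only the standard subprocess module; I'm assuming gf is on your PATH and the grammar files are in the current directory.)

import subprocess

# The same commands you typed in the GF shell:
# import with -retain, then compute concrete.
script = "i -retain TestEng.gf\ncc -one MySentence dog_N\n"

# gf --run starts the GF shell in batch mode, reading commands from stdin.
result = subprocess.run(
    ["gf", "--run"],
    input=script,
    capture_output=True,
    text=True,
)

# This should print "a dog" (possibly preceded by loading messages,
# depending on your GF version).
print(result.stdout.strip())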
test.py:And running it on the command line, with your original grammar in the same directory, gives me the desired answer.
Remember, you can't parse anything with cc, not in the GF shell, not from a Python subprocess. It's just for generating output.

Compilation to PGF
Final minor nitpick: you don't need the flag --output-format=haskell if you don't need a Haskell version of the abstract syntax. Just gf -make TestEng.gf is enough to produce the PGF.