I have to create a DCG in Prolog with the following features:
- handle subject/object distinction
- singular/plural distinction
- capable of producing parse trees
- make use of a separate lexicon
Here's the given lexicon:
lex(the,det,_).
lex(a,det,singular).
lex(man,n,singular).
lex(men,n,plural).
lex(woman,n,singular).
lex(women,n,plural).
lex(apple,n,singular).
lex(apples,n,plural).
lex(pear,n,singular).
lex(pears,n,plural).
lex(eat,v,plural).
lex(eats,v,singular).
lex(know,v,plural).
lex(knows,v,singular).
lex(i,pronoun,singular,subject).
lex(we,pronoun,plural,subject).
lex(me,pronoun,singular,object).
lex(us,pronoun,plural,object).
lex(you,pronoun,_,_).
lex(he,pronoun,singular,subject).
lex(she,pronoun,singular,subject).
lex(him,pronoun,singular,object).
lex(her,pronoun,singular,object).
lex(they,pronoun,plural,subject).
lex(them,pronoun,plural,object).
lex(it,pronoun,singular,_).
And here's my code:
s(s(NP,VP)) --> np(NP,X,subject), vp(VP,X).
np(np(DET,N),X,_) --> det(DET,X), n(N,X).
np(np(PRO),X,Y) --> pro(PRO,X,Y).
vp(vp(V,NP),X) --> v(V,X), np(NP,_,object).
vp(vp(V),X) --> v(V,X).
det(det(DET),X) --> [DET], {lex(DET,det,X)}.
n(n(N),X) --> [N], {lex(N,n,X)}.
pro(pro(PRO),X,Y) --> [PRO], {lex(PRO,pro,X,Y)}.
v(v(V),X) --> [V], {lex(V,v,X)}.
When I input:
s(X, [the, man, eats, the, apple], []).
I should get:
X = s(np(det(the, singular), n(man, singular, subject)), vp(v(eats, singular), np(det(the, singular), n(apple, singular, object))))
But instead I get:
X = s(np(det(the), n(man)), vp(v(eats), np(det(the), n(apple))))
And I'm not sure why it's not outputting the full thing.
Calling a DCG directly, as in your query s(X, [the, man, eats, the, apple], [])., is a somewhat "old style" that many sources still teach. The more "modern" way is to use phrase/2 or phrase/3 instead, for example phrase(s(X), [the, man, eats, the, apple]). The two-argument form saves you from specifying the rest as [] in the common case where you want a full parse. You also separate the arguments of the DCG rule from the list to be parsed, so in the phrase call, s takes a single argument for the syntax tree, just like in its definition.

As for your problem, the good thing is that Prolog is a very testable language. If something big -- like parsing a whole sentence -- goes wrong, we can break the problem down and test smaller bits -- like parsing a noun phrase, or just a noun.
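To make the difference between phrase/2 and phrase/3 concrete, here is a small sketch. It assumes the grammar rules from the question, copied unchanged except that the pronoun rules are left out, plus an excerpt of the lexicon with only the words the queries need.

```prolog
% Excerpt of the question's lexicon (only the words used below).
lex(the, det, _).
lex(man, n, singular).
lex(apple, n, singular).
lex(eats, v, singular).

% Grammar rules copied from the question (pronoun rules omitted).
s(s(NP,VP)) --> np(NP,X,subject), vp(VP,X).
np(np(DET,N),X,_) --> det(DET,X), n(N,X).
vp(vp(V,NP),X) --> v(V,X), np(NP,_,object).
vp(vp(V),X) --> v(V,X).
det(det(DET),X) --> [DET], {lex(DET,det,X)}.
n(n(N),X) --> [N], {lex(N,n,X)}.
v(v(V),X) --> [V], {lex(V,v,X)}.

% Full parse: phrase/2 only succeeds if the whole list is consumed.
% ?- phrase(s(T), [the, man, eats, the, apple]).
% T = s(np(det(the), n(man)), vp(v(eats), np(det(the), n(apple)))).

% Partial parse: phrase/3 also returns whatever is left over.
% ?- phrase(np(T, Num, subject), [the, man, eats], Rest).
% T = np(det(the), n(man)), Num = singular, Rest = [eats].
```

The first query is the phrase/2 counterpart of s(X, [the, man, eats, the, apple], []) and gives the same tree as in the question.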
So, breaking the subject down into smaller and smaller parts:
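The kind of step-by-step testing meant here could look like the sketch below. It assumes only the noun phrase, determiner and noun rules from the question, unchanged, and a two-word lexicon excerpt.

```prolog
% Two entries from the question's lexicon are enough for these tests.
lex(the, det, _).
lex(man, n, singular).

% Rules copied from the question, unchanged.
np(np(DET,N),X,_) --> det(DET,X), n(N,X).
det(det(DET),X) --> [DET], {lex(DET,det,X)}.
n(n(N),X) --> [N], {lex(N,n,X)}.

% The noun phrase on its own:
% ?- phrase(np(NP, Num, subject), [the, man]).
% NP = np(det(the), n(man)), Num = singular.

% Just the determiner (Num stays unbound, because "the" fits any number):
% ?- phrase(det(D, Num), [the]).
% D = det(the).

% Just the noun:
% ?- phrase(n(N, Num), [man]).
% N = n(man), Num = singular.
```

The number is computed correctly in every query (Num = singular), but it never shows up inside the tree terms themselves.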
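You would like to parse the noun phrase to
np(det(the, singular), n(man, singular, subject))
but the actual trees you get from det and n are already missing some of the extra arguments: the rules build det(DET) and n(N), so the number X is checked against the lexicon but never stored in the tree. You need to adjust these two rules so that the terms they build carry X as well.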
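One way to make that adjustment, as a sketch: put the number X into the tree terms built by det and n, and leave every other rule from the question as it is. The lexicon excerpt and the omission of the pronoun rules are simplifications for this example.

```prolog
% Excerpt of the question's lexicon.
lex(the, det, _).
lex(man, n, singular).
lex(apple, n, singular).
lex(eats, v, singular).

% Unchanged rules from the question (pronoun rules omitted).
s(s(NP,VP)) --> np(NP,X,subject), vp(VP,X).
np(np(DET,N),X,_) --> det(DET,X), n(N,X).
vp(vp(V,NP),X) --> v(V,X), np(NP,_,object).
vp(vp(V),X) --> v(V,X).
v(v(V),X) --> [V], {lex(V,v,X)}.

% Adjusted rules: the tree term now records the number X as well.
det(det(DET,X),X) --> [DET], {lex(DET,det,X)}.
n(n(N,X),X) --> [N], {lex(N,n,X)}.

% ?- phrase(np(NP, Num, subject), [the, man]).
% NP = np(det(the, singular), n(man, singular)), Num = singular.

% ?- phrase(s(T), [the, man, eats, the, apple]).
% T = s(np(det(the, singular), n(man, singular)),
%       vp(v(eats), np(det(the, singular), n(apple, singular)))).
```

Note that det(the, singular) gets its number from the noun: the lexicon leaves the number of "the" open, and the shared variable X is bound once n has looked up "man".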
With det and n adjusted, the parse for the whole sentence has the extra arguments on the determiners and nouns. What's missing is to do the same for verbs and for noun phrases, so that the "role" (subject or object) is bound correctly.
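For completeness, here is one possible way to finish the job; treat it as a sketch, not the only solution. The assumptions are mine: n takes the role as an extra argument passed down from np, v stores its number, and pronouns get a tree of the shape pro(Word, Number, Role), which the question does not specify. Also note that the lexicon tags pronouns as pronoun, while the pro rule in the question looks up pro, so this sketch looks up pronoun.

```prolog
% Excerpt of the question's lexicon: lex/3 for words, lex/4 for pronouns.
lex(the, det, _).
lex(man, n, singular).
lex(men, n, plural).
lex(apple, n, singular).
lex(eats, v, singular).
lex(know, v, plural).
lex(knows, v, singular).

lex(he, pronoun, singular, subject).
lex(him, pronoun, singular, object).
lex(they, pronoun, plural, subject).

s(s(NP,VP)) --> np(NP,Num,subject), vp(VP,Num).

% np passes the role down so that n and pro can store it in the tree.
np(np(Det,N),Num,Role) --> det(Det,Num), n(N,Num,Role).
np(np(Pro),Num,Role) --> pro(Pro,Num,Role).

vp(vp(V,NP),Num) --> v(V,Num), np(NP,_,object).
vp(vp(V),Num) --> v(V,Num).

det(det(D,Num),Num) --> [D], {lex(D,det,Num)}.
n(n(N,Num,Role),Num,Role) --> [N], {lex(N,n,Num)}.
v(v(V,Num),Num) --> [V], {lex(V,v,Num)}.

% Assumed tree shape for pronouns: pro(Word, Number, Role).
pro(pro(P,Num,Role),Num,Role) --> [P], {lex(P,pronoun,Num,Role)}.

% ?- phrase(s(T), [the, man, eats, the, apple]).
% T = s(np(det(the, singular), n(man, singular, subject)),
%       vp(v(eats, singular),
%          np(det(the, singular), n(apple, singular, object)))).

% ?- phrase(s(T), [they, know, him]).
% T = s(np(pro(they, plural, subject)),
%       vp(v(know, plural), np(pro(him, singular, object)))).

% Agreement and case errors are rejected:
% ?- phrase(s(_), [the, men, eats]).         % false (number mismatch)
% ?- phrase(s(_), [him, knows, the, man]).   % false (object pronoun as subject)
```

The first query produces exactly the tree the question asks for.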