Using RDF to model normal sentences

I'm trying to store everyday sentences -- or rather the information expressed by those sentences -- in a (semi-)structured manner. Right now, I'm exploring the feasibility of RDF for that. I'm not familiar enough with RDF to assess whether this is a suitable way to go. While I'm sure that there will be some form of information loss, I cannot say whether it would be acceptable for practical purposes.

Obviously, sentences like "Bob ate the cake" can be directly mapped onto a subject-predicate-object triple (although I don't know how to properly address the tense of the predicate):

:Bob :ate :Cake .

I then came across reification, which converts a single triple into a 4-triple graph representing a statement with a subject, predicate and object node. This makes it possible to express statements about statements, e.g., "Alice thinks that Bob ate the cake.":

_:stmt1 rdf:type rdf:Statement .
_:stmt1 rdf:subject :Bob .
_:stmt1 rdf:predicate :ate .
_:stmt1 rdf:object :Cake .
:Alice :thinks _:stmt1 .
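
To make the mechanics of reification concrete, here is a minimal Python sketch that builds the same five triples. It uses plain tuples rather than an RDF library, and the `reify` helper and the blank-node naming scheme are my own assumptions for illustration, not part of any RDF API.

```python
from itertools import count

# Hypothetical blank-node counter; real RDF tools generate their own labels.
_ids = count(1)

def reify(subject, predicate, obj):
    """Return (node, triples) describing one statement as an rdf:Statement."""
    node = f"_:stmt{next(_ids)}"
    triples = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]
    return node, triples

# "Alice thinks that Bob ate the cake."
stmt, graph = reify(":Bob", ":ate", ":Cake")
graph.append((":Alice", ":thinks", stmt))

for s, p, o in graph:
    print(f"{s} {p} {o} .")
```

Note that the original triple `:Bob :ate :Cake .` is not itself part of the result: the graph only describes the statement, it does not assert it.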

So far so good. However, everyday language has many more constructs. Here are just some of the more obvious ones (to me):

  • Conditions: "If Alice tells the truth, Bob ate the cake."

  • Auxiliary verbs: "Alice can bake cakes."

  • Clausal complements: "Alice likes to eat cake."

  • Negation: Given the Open World Assumption, the absence of a triple :Bob :ate :Cake . is ambiguous. In many practical settings I would argue that making negation explicit is important.

  • Disjunction: "(Either) Alice or Bob ate the cake." Most people do not reliably distinguish between OR and XOR in everyday language.

  • Quantifiers: "Most people like cake.", "Very few people hate cake."

I assume some can be realized in a relatively straightforward manner. For example, "Alice thinks that Bob did not eat the cake" could be represented as

_:stmt1 rdf:type rdf:Statement .
_:stmt1 rdf:subject :Bob .
_:stmt1 rdf:predicate _:predA .
_:predA :term :eat .
_:predA :tense :past .
_:predA :negated :true .
_:stmt1 rdf:object :Cake .
:Alice :thinks _:stmt1 .

This might also make it possible to express simple adverbs (e.g. "Bob quickly ate the cake") by adding a triple such as

_:predA :advmod :quickly .
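
The predicate-node idea above can be sketched in a few lines of Python. This is only an illustration using plain tuples: the `describe_predicate` helper is hypothetical, and the `:term`, `:tense`, `:negated` and `:advmod` properties are the ad hoc names from my examples, not an existing vocabulary.

```python
def describe_predicate(node, term, tense, negated=False, advmods=()):
    """Build the triples that describe one structured predicate node."""
    triples = [(node, ":term", term), (node, ":tense", tense)]
    if negated:
        # Negation is only stated when it applies; absence means "not negated".
        triples.append((node, ":negated", ":true"))
    triples.extend((node, ":advmod", adverb) for adverb in advmods)
    return triples

# "Bob quickly ate the cake": past tense, one adverb, no negation.
pred = describe_predicate("_:predA", ":eat", ":past", advmods=[":quickly"])

# "Bob did not eat the cake": same verb and tense, but negated.
neg = describe_predicate("_:predB", ":eat", ":past", negated=True)
```

Every extra feature of the verb costs another triple per statement, on top of the four triples needed for the reification itself.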

On the other hand, reification alone quickly increases the number of required triples. I guess you could push it to the extreme and treat the output of a dependency parser as a set of triples, e.g.:

nsubj(thinks, Alice)
ccomp(thinks, ate)
nsubj(ate, Bob)
dobj(ate, cake)
...

but I cannot see this being truly useful when it comes to querying the graph or inferring from it.
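
To illustrate what querying such a structure looks like, here is a small Python sketch over the four dependency relations listed above, stored as plain tuples. The helper names `dependents` and `who_did_what` are made up for this example.

```python
# Dependency relations from the example, stored as (relation, head, dependent).
deps = [
    ("nsubj", "thinks", "Alice"),
    ("ccomp", "thinks", "ate"),
    ("nsubj", "ate", "Bob"),
    ("dobj", "ate", "cake"),
]

def dependents(deps, relation, head):
    """Return all dependents attached to `head` via `relation`."""
    return [d for r, h, d in deps if r == relation and h == head]

def who_did_what(deps, verb):
    """Pair every subject of `verb` with every direct object of `verb`."""
    return [
        (subj, verb, obj)
        for subj in dependents(deps, "nsubj", verb)
        for obj in dependents(deps, "dobj", verb)
    ]

# Plain fact lookup: who ate what?
facts = who_did_what(deps, "ate")

# Nested lookup: who thinks that somebody did something?
believed = [
    (holder, "thinks", fact)
    for holder in dependents(deps, "nsubj", "thinks")
    for clause in dependents(deps, "ccomp", "thinks")
    for fact in who_did_what(deps, clause)
]
```

The lookups work, but they match on surface word forms: a query for "ate" finds nothing in a sentence that uses "eats" or "has eaten", and every grammatical construction needs its own hand-written traversal.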

I read a couple of scientific papers on converting text into RDF, but most of them focus on the extraction of simple facts. Apart from that, I couldn't find any good resources on how useful/expressive/practical RDF is for representing knowledge beyond simple factoids.