Querying the Linked Movie Database (LMDB) with SPARQL


Given an RDF graph like this:

:Matrix [rdfs:label] :The Matrix .
:Matrix [movie:id] :23 .
:Matrix [movie:actor] :Keanu Reaves .
:Matrix [movie:actor] :Laurence Fishburne .
:Die Hard 3 [rdfs:label] :Die Hard 3 .
:Die Hard 3 [movie:id] :42 .
:Die Hard 3 [movie:actor] :Bruce Willis .
:Die Hard 3 [movie:actor] :Samuel L. Jackson .

and a query like this:

SELECT ?id ?name ?actor
WHERE {
  ?instance movie:id ?id .
  ?instance rdfs:label ?name .
  ?instance movie:actor ?actor .
}

I would expect a result like:

id | name       | actor
23 | The Matrix | Laurence Fishburne
23 | The Matrix | Keanu Reaves
42 | Die Hard 3 | Bruce Willis
42 | Die Hard 3 | Samuel L. Jackson

but instead I only get:

id | name       | actor
23 | The Matrix | Laurence Fishburne
42 | Die Hard 3 | Bruce Willis

Why is only one actor returned per movie?

By the way, when I use this query:

SELECT *
WHERE {
  ?instance movie:id ?id .
  ?instance rdfs:label "The Matrix" .
  ?instance movie:actor ?actor .
}

the result is as expected:

id | name       | actor
23 | The Matrix | Laurence Fishburne
23 | The Matrix | Keanu Reaves

Answer:

Using Jena's ARQ, I was able to get the sort of data you seem to be interested in from the Linked Movie Database (LMDB) SPARQL endpoint with the following query:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>

SELECT ?id ?filmTitle ?actorName WHERE { 
  VALUES ?filmTitle { "The Matrix" }
  SERVICE <http://data.linkedmdb.org/sparql> {
    ?film a movie:film ;
          movie:filmid ?id ;
          dcterms:title ?filmTitle ;
          movie:actor [ a movie:actor ;
                        movie:actor_name ?actorName ].
  }
}

Here data.n3 is an empty file: arq requires a --data argument, even though the SERVICE keyword means that we're querying remote data.

$ arq --query query.sparql --data data.n3 
-----------------------------------------------------------------------------------------
| id                                              | filmTitle    | actorName            |
=========================================================================================
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Keanu Reeves"       |
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Laurence Fishburne" |
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Hugo Weaving"       |
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Joe Pantoliano"     |
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Gloria Foster"      |
| "38146"^^<http://www.w3.org/2001/XMLSchema#int> | "The Matrix" | "Carrie-Anne Moss"   |
-----------------------------------------------------------------------------------------

Removing the VALUES ?filmTitle ... line broadens the search to all movies and their actors, of course.
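
To make the effect of the VALUES restriction concrete without needing the remote endpoint, here is a minimal Python sketch. It is not how ARQ or the LMDB endpoint evaluates queries; it is a toy join over the sample triples from the question (not real LMDB data), and the names TRIPLES, objects and select are made up for illustration. The titles parameter plays the role of the VALUES line.

```python
# Toy triples modelled on the sample graph from the question (not real LMDB data).
TRIPLES = [
    ("Matrix", "movie:id", 23),
    ("Matrix", "rdfs:label", "The Matrix"),
    ("Matrix", "movie:actor", "Keanu Reaves"),
    ("Matrix", "movie:actor", "Laurence Fishburne"),
    ("DieHard3", "movie:id", 42),
    ("DieHard3", "rdfs:label", "Die Hard 3"),
    ("DieHard3", "movie:actor", "Bruce Willis"),
    ("DieHard3", "movie:actor", "Samuel L. Jackson"),
]


def objects(subject, predicate):
    """Return every object of the triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def select(titles=None):
    """Join id, label and actor per subject.

    `titles` is a hypothetical stand-in for a VALUES restriction:
    None means no restriction, otherwise only these labels are kept.
    """
    rows = []
    for subj in sorted({s for s, _, _ in TRIPLES}):
        for film_id in objects(subj, "movie:id"):
            for name in objects(subj, "rdfs:label"):
                if titles is not None and name not in titles:
                    continue
                # One result row per matching actor, not one per movie.
                for actor in objects(subj, "movie:actor"):
                    rows.append((film_id, name, actor))
    return rows


restricted = select(titles={"The Matrix"})  # like keeping the VALUES line
unrestricted = select()                     # like removing it

for row in unrestricted:
    print(row)
```

With the restriction, only the rows for "The Matrix" remain; without it, every movie contributes one row per actor, which is the shape of result the question expected.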

The properties used in your query are different enough from the actual properties used in the LMDB that it's hard to see what the meaningful difference might have been. It could have been rdfs:label instead of dcterms:title, or strings with or without language tags, and so on.
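
As an illustration of the language-tag point only, here is a small Python sketch. It assumes a toy Literal type and made-up subjects (film/1, film/2); none of this is LMDB data or a real RDF library API. It shows why a pattern with a plain "The Matrix" can miss a title stored as "The Matrix"@en, while comparing only the lexical form (similar in spirit to SPARQL's FILTER(STR(?name) = "The Matrix")) matches both.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Toy stand-in for an RDF literal: lexical form plus optional language tag."""
    lexical: str
    lang: Optional[str] = None


# Hypothetical data: the same title stored with and without a language tag.
stored_labels = {
    "film/1": Literal("The Matrix", "en"),
    "film/2": Literal("The Matrix"),
}


def match_exact(pattern):
    # Term equality, as in a triple pattern: the language tag must match too.
    return sorted(s for s, lit in stored_labels.items() if lit == pattern)


def match_lexical(text):
    # Compare lexical forms only, ignoring any language tag.
    return sorted(s for s, lit in stored_labels.items() if lit.lexical == text)


plain_hits = match_exact(Literal("The Matrix"))
tagged_hits = match_exact(Literal("The Matrix", "en"))
lexical_hits = match_lexical("The Matrix")

print(plain_hits, tagged_hits, lexical_hits)
```

If a query with a literal in the pattern returns fewer rows than expected, checking how the stored literals are tagged or typed is a reasonable first step.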