For example, the page source of https://www.bobdc.com/blog/json-ld/ contains:
<html>
<head>
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "BlogPosting",
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https:\/\/www.bobdc.com\/"
},
"articleSection" : "blog",
"name" : "Exploring JSON-LD",
"headline" : "Exploring JSON-LD",
"description" : "And of course, querying it with SPARQL.",
"inLanguage" : "en",
"author" : "Bob DuCharme",
"creator" : "",
"publisher": "",
"accountablePerson" : "",
"copyrightHolder" : "",
"copyrightYear" : "2019",
"datePublished": "2019-04-21 11:20:00 \u002b0000 UTC",
"dateModified" : "2019-04-21 11:20:00 \u002b0000 UTC",
"url" : "https:\/\/www.bobdc.com\/blog\/json-ld\/",
"wordCount" : "1283",
"keywords" : [ "RDF","JSON","SPARQL","Blog" ]
}
</script>
......
Can we run a SPARQL query against the page directly? If not, are there any elegant workarounds?
I googled without finding satisfying results. Thank you in advance!
This is not possible with plain SPARQL. One needs to preprocess the page and load the JSON-LD into some kind of in-memory triplestore, as suggested by @UninformedUser in the comments. However, there is no need to do that manually, since ready-made tools exist for it:
SPARQL Anything
It overloads the SPARQL SERVICE operator to parse many kinds of files from the web or local storage. In your case, create the following query file, json-ld-in-html.rq:
Then execute the query:
Result:
With a few changes, it is also possible to provide the URL as a parameter or return the output in another format.
It is also possible to run SPARQL Anything as a web service and send the query over HTTP using the SPARQL protocol.
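For comparison, the manual preprocessing route mentioned at the top of this answer could start with something like the sketch below. It is not part of SPARQL Anything; it only illustrates the first step (pulling the JSON-LD out of the HTML) using the Python standard library. The names `JsonLdExtractor`, `extract_jsonld` and `SAMPLE_HTML` are made up for this illustration, and the sample is a shortened copy of the snippet from the question.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []  # one parsed JSON value per JSON-LD script element

    def handle_starttag(self, tag, attrs):
        # The type attribute may be missing or valueless, hence the "or ''".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_jsonld(html_text):
    """Return a list with the parsed JSON-LD documents embedded in html_text."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.documents


# Shortened version of the page source shown in the question (illustration only).
SAMPLE_HTML = r"""<html>
<head>
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "BlogPosting",
"headline" : "Exploring JSON-LD",
"url" : "https:\/\/www.bobdc.com\/blog\/json-ld\/",
"keywords" : [ "RDF","JSON","SPARQL","Blog" ]
}
</script>
</head>
</html>"""

if __name__ == "__main__":
    for doc in extract_jsonld(SAMPLE_HTML):
        print(doc["@type"], doc["headline"])
```

Each extracted document could then be serialized again with `json.dumps` and handed to an RDF library that understands JSON-LD (for example rdflib's `Graph.parse` with `format="json-ld"`, assuming rdflib is installed), after which ordinary SPARQL queries can run against the resulting in-memory graph.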