Unexpected missing value from Wikidata query

I'm using the Wikidata query service to learn the SPARQL query language. I'm trying to retrieve countries along with their identifying information.

Here is a simple query which is intended to return a list of countries (https://www.wikidata.org/wiki/Q6256) along with their ISO 3-letter codes (https://www.wikidata.org/wiki/Property:P298):

SELECT ?country ?countryLabel ?iso
WHERE 
{
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  ?country wdt:P31 wd:Q6256;   # wd:Q6256 = "country"; wd:Q3624078 = "sovereign state"
           wdt:P298 ?iso.
}
ORDER BY ?countryLabel

I notice that at least one country is consistently missing from the results, Georgia, and I'm confused about why.

According to its Wikidata page:

  1. It is an instance of country (wd:Q6256)
  2. It does have an ISO-3166 3-letter country code (wdt:P298)

I've tried various transformations of this query (e.g. leaving out the ISO codes, using labels in different languages, etc.) and I consistently get the same result: Georgia is missing.

However, if I switch from (instance of a country wd:Q6256) to (instance of a sovereign state wd:Q3624078; a subclass of wd:Q6256), then Georgia is included in the results.

I am at a loss to explain this result; the entity in question should be an instance of both "country" and "sovereign state." And clearly it works for most of the other countries of the world, whose data is represented similarly in Wikidata, in that they're listed as instances of both country wd:Q6256 and sovereign state wd:Q3624078.

Can anyone explain what aspect of the SPARQL language, or of the representation of the data in question, I'm not understanding here?

Best answer

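The wdt: prefix returns only "truthy" statements, that is, the best-ranked statements for each property. On Georgia's item, the instance-of claim for sovereign state has PreferredRank, so it's selected in preference to all the other instance-of claims, including country, which have NormalRank. As a result, wdt:P31 never returns wd:Q6256 for Georgia. Also, SPARQL doesn't do inheritance unless you explicitly build it into the query (because it can be expensive), so an instance of sovereign state doesn't automatically match country just because sovereign state is a subclass of country.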
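The inheritance point can be sketched with a property path. This is only an illustration, not part of the original answer: it assumes, as stated in the question, that sovereign state (wd:Q3624078) is a subclass of country (wd:Q6256), and it uses wdt:P279 ("subclass of") to follow that link. DISTINCT is added because an item can reach wd:Q6256 by more than one path.

```sparql
SELECT DISTINCT ?country ?countryLabel ?iso
WHERE
{
  # wdt:P31/wdt:P279* matches direct instances of country (wd:Q6256)
  # and instances of any of its subclasses, such as sovereign state.
  ?country wdt:P31/wdt:P279* wd:Q6256;
           wdt:P298 ?iso.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?countryLabel
```

A query of this shape still relies on truthy statements, so it finds Georgia through its preferred sovereign state claim. It may also return more items than the original query, since every subclass of country is included.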

This query, which goes through the full statement nodes rather than the truthy wdt: shortcut, will include Georgia:

SELECT ?country ?countryLabel ?iso
WHERE 
{
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  ?country p:P31/ps:P31 wd:Q6256;   # wd:Q6256 = "country"; wd:Q3624078 = "sovereign state"
           wdt:P298 ?iso.
}
ORDER BY ?countryLabel

Note, however, that it includes deprecated claims as well. I cribbed it from this set of examples: https://en.wikibooks.org/wiki/SPARQL/WIKIDATA_Qualifiers,_References_and_Ranks

As mentioned by @horcrux in the comments, you can modify this to exclude deprecated claims by using a FILTER expression:

SELECT ?country ?countryLabel ?iso
WHERE 
{
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  ?country p:P31 [ ps:P31 wd:Q6256 ; wikibase:rank ?rank ] ;
           wdt:P298 ?iso.
  FILTER(?rank != wikibase:DeprecatedRank)
}
ORDER BY ?countryLabel

The results are the same in this case, but it's something worth thinking about when you're considering what kind of data you're looking for.
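To check which rank each instance-of statement on a single item carries, a query along these lines can be run in the Wikidata query service. It is a minimal sketch: wd:Q230 is assumed here to be the item for Georgia (the country), and the variable names ?statement and ?class are chosen only for illustration.

```sparql
SELECT ?class ?classLabel ?rank
WHERE
{
  # wd:Q230 is assumed to be the item for Georgia (the country).
  wd:Q230 p:P31 ?statement.
  # Each statement node exposes its value (ps:) and its rank.
  ?statement ps:P31 ?class;
             wikibase:rank ?rank.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

If the explanation above holds, the row for sovereign state should show wikibase:PreferredRank, while the row for country shows wikibase:NormalRank.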