Why are SPARQL queries not working as expected in our Wikidata Docker WDQS?


SPARQL queries are not working as expected in our Wikidata Docker WDQS (Wikidata Query Service).

We are running Wikibase Docker on an AWS EC2 instance. First I will describe the three queries that are not working, then provide details about our setup. We suspect that a setting in the docker-compose.yml file (at the end of this post) is incorrect.

Query 1 - no results are returned. The query is:

# return items whose favorite city (P8) is Chicago (Q7)
SELECT ?item ?itemLabel WHERE {
    ?item wdt:P8 wd:Q7 . 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

The query returns “No Matching Records Found” even though we have entered an item, Q1, whose favorite city (P8) is Chicago (Q7). Note that the mouse-over tooltips in the WDQS SPARQL UI do show that P8 is "favorite city" and Q7 is "Chicago", but the tooltip data comes from Elasticsearch, not from the WDQS service.

Query 2 - a simpler query that returns results, but the returned itemLabel is the Q number rather than the label text. The returned link seems correct and does lead to the correct item. However, if I use a Q number that is not in our Wikibase (such as Q9999999999), the query still returns a link (which of course does not work, because that Q number does not exist).

The query is:

SELECT ?item ?itemLabel WHERE {
    VALUES ?item { wd:Q1 }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

[Screenshot: Query 2 and result]

Query 3 - a simple query that appears to work, except that the links to our Wikibase contain the host wikibase.svc rather than a valid link to our Wikibase. Here is the query:

SELECT * WHERE { ?a ?b ?c}

[Screenshot: Query 3 and results]

In the results, I think ‘wikibase.svc’ should instead be the URL of our Wikibase server.
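Since Query 3 shows URIs under wikibase.svc, one possible check is to rerun Query 1 with full IRIs instead of the wd: and wdt: prefixes, so the result does not depend on how the UI expands those prefixes. The sketch below only builds the query text. It assumes the usual Wikibase RDF layout (entities under /entity/, direct claims under /prop/direct/) and that http://wikibase.svc is the base seen in the Query 3 results; the function name is hypothetical.

```python
def query_with_full_iris(base: str, prop: str, value: str) -> str:
    """Build a variant of Query 1 that uses full IRIs rather than wd:/wdt:."""
    base = base.rstrip("/")
    return (
        "SELECT ?item WHERE {\n"
        f"    ?item <{base}/prop/direct/{prop}> <{base}/entity/{value}> .\n"
        "}"
    )


# Assumption: http://wikibase.svc is the host that appears in the Query 3 results.
print(query_with_full_iris("http://wikibase.svc", "P8", "Q7"))
```

If the printed query returns Q1 while the prefixed version does not, that would suggest the stored URIs and the prefix expansion use different hosts.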

Our setup: we are running Wikibase Docker on an AWS EC2 instance (CentOS 8) behind an AWS ALB (Application Load Balancer, which also handles SSL).

We use xxxx.xxxx.xxxx.xxxx.edu/sparql/ to reach port 8282 on the EC2 instance, and therefore the wdqs-frontend container. (The routing to the port is managed by the ALB.)

xxxx.xxxx.xxxx.xxxx.edu is routed by the ALB to port 8181 on the EC2 instance, and therefore to the wikibase container.

For the wdqs-frontend container we have updated /etc/nginx/nginx.conf to include /etc/nginx/conf.d/sparqlpath.conf instead of default.conf.

The contents of sparqlpath.conf are below:

# This file, sparqlpath.conf, replaces the file (default.conf) provided by the wikibase/wdqs-frontend docker image.
# Modifications:
# Joe Troy 01/15/2021 add location /sparql and sparql_upstream because users will use [servername]/sparql to connect to the sparql frontend
#                     also note that this change required changes to the AWS application Load Balancer (ALB)

upstream sparql_upstream {
    server 127.0.0.1:80;
}

server {
    listen       80;
    server_name  localhost;

    location /sparql {
        # the trailing slash is key as not to look for a slash sub directory
        #include       /etc/nginx/mime.types;
        proxy_pass http://sparql_upstream/;
    }
    location /proxy/wikibase {
        rewrite /proxy/wikibase/(.*) /$1 break;
        proxy_pass http://wikibase.svc:80;
    }
    location /proxy/wdqs {
        rewrite /proxy/wdqs/(.*) /$1 break;
        proxy_pass http://wdqs-proxy.svc:80;
    }
    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }

}
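To make the two rewrite rules in this file easier to reason about, here is a rough Python model of what they do to a request path before it is proxied. It is only an approximation: it ignores nginx location matching and query strings, and the sample paths are illustrative assumptions rather than requests taken from our logs.

```python
import re

# Rewrite rules copied from sparqlpath.conf as (regex, replacement) pairs.
REWRITES = [
    (r"/proxy/wikibase/(.*)", r"/\1"),
    (r"/proxy/wdqs/(.*)", r"/\1"),
]


def rewritten_path(path: str) -> str:
    """Return the path after the first matching rewrite, or unchanged."""
    for pattern, replacement in REWRITES:
        match = re.search(pattern, path)
        if match:
            # On a match, nginx replaces the whole URI with the replacement.
            return match.expand(replacement)
    return path


# Hypothetical sample paths, used only to show the effect of the rules.
for p in ["/proxy/wdqs/bigdata/namespace/wdq/sparql",
          "/proxy/wikibase/w/api.php",
          "/index.html"]:
    print(p, "->", rewritten_path(p))
```

In this model the /proxy/wdqs/ and /proxy/wikibase/ prefixes are stripped before the request reaches wdqs-proxy.svc or wikibase.svc, and other paths are left alone.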

And finally, below is our docker-compose.yml file:

# Wikibase with Query Service
#
# This docker-compose example can be used to pull the images from docker hub.
#
# Examples:
#
# Access Wikibase via "http://localhost:8181"
#   (or "http://$(docker-machine ip):8181" if using docker-machine)
#
# Access Query Service via "http://localhost:8282"
#   (or "http://$(docker-machine ip):8282" if using docker-machine)
version: '3'

services:
  wikibase:
    #image: wikibase/wikibase:1.34-bundle
    #
    #NOTES regarding the authorities.library.illinois.edu implementation
    #The wikibase image is modified so it has settings relevant to authorities.library.illinois.edu
    #if a new version of the wikibase docker image is used
    #changes to the /LocalSettings.php.template should be examined
    #any changes might need to be reflected in ./wikibase/LocalSettings.php.template in this repository
    #
    #Modified to use a Dockerfile
    build:
      context: .
      dockerfile: ./wikibase/Dockerfile
    links:
      - mysql
    ports:
    # CONFIG - Change the 8181 here to expose Wikibase & MediaWiki on a different port
     - "8181:80"
    volumes:
      - mediawiki-images-data:/var/www/html/images
      - quickstatements-data:/quickstatements/data
      - /etc/localtime:/etc/localtime:ro
    depends_on:
    - mysql
    - elasticsearch
    restart: unless-stopped
    networks:
      default:
        aliases:
         - wikibase.svc
         - xxxx.xxxx.xxxx.xxxx.edu
         # CONFIG - Add (added directly above) your real wikibase hostname here, for example wikibase-registry.wmflabs.org
    environment:
      - DB_SERVER=mysql.svc:3306
      - MW_ELASTIC_HOST=elasticsearch.svc
      - MW_ELASTIC_PORT=9200
      # CONFIG - Change the default values below
      - MW_ADMIN_NAME=WikibaseAdmin
      - MW_ADMIN_PASS=${ENV_VAR_WikibaseDockerAdminPass}
      - [email protected]
      - MW_WG_SECRET_KEY=secretkey
      # CONFIG - Change the default values below (should match mysql values in this file)
      - DB_USER=wikiuser
      - DB_PASS=${ENV_VAR_sqlpass}
      - DB_NAME=my_wiki
      - QS_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:9191
      - SMTP_PASS=${ENV_VAR_sparkpostpass}
  mysql:
    image: mariadb:10.3
    restart: unless-stopped
    volumes:
      - mediawiki-mysql-data:/var/lib/mysql
      - /etc/localtime:/etc/localtime:ro
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
      # CONFIG - Change the default values below (should match values passed to wikibase)
      MYSQL_DATABASE: 'my_wiki'
      MYSQL_USER: 'wikiuser'
      MYSQL_PASSWORD: '${ENV_VAR_sqlpass}'
    networks:
      default:
        aliases:
         - mysql.svc
  wdqs-frontend:
    #image: wikibase/wdqs-frontend:latest
    build:
      context: .
      dockerfile: ./wdqs-frontend/Dockerfile
    restart: unless-stopped
    ports:
    # CONFIG - Change the 8282 here to expose the Query Service UI on a different port
     - "8282:80"
    depends_on:
    - wdqs-proxy
    networks:
      default:
        aliases:
         - wdqs-frontend.svc
    environment:
      - WIKIBASE_HOST=xxxx.xxxx.xxxx.xxxx.edu
      - WDQS_HOST=wdqs-proxy.svc
    volumes:
      - /etc/localtime:/etc/localtime:ro
  wdqs:
    image: wikibase/wdqs:0.3.10
    restart: unless-stopped
    volumes:
      - query-service-data:/wdqs/data
      - /etc/localtime:/etc/localtime:ro
    tmpfs: /tmp
    command: /runBlazegraph.sh
    networks:
      default:
        aliases:
         - wdqs.svc
    environment:
      - WIKIBASE_HOST=xxxx.xxxx.xxxx.xxxx.edu
      - WIKIBASE_SCHEME=https
      - WDQS_HOST=wdqs.svc
      - WDQS_PORT=9999
    expose:
      - 9999
  wdqs-proxy:
    image: wikibase/wdqs-proxy
    restart: unless-stopped
    environment:
      - PROXY_PASS_HOST=wdqs.svc:9999
    ports:
     - "8989:80"
    depends_on:
    - wdqs
    volumes:
      - /etc/localtime:/etc/localtime:ro
    networks:
      default:
        aliases:
         - wdqs-proxy.svc
  wdqs-updater:
    image: wikibase/wdqs:0.3.10
    restart: unless-stopped
    command: /runUpdate.sh
    depends_on:
    - wdqs
    - wikibase
    networks:
      default:
        aliases:
         - wdqs-updater.svc
    environment:
     - WIKIBASE_HOST=wikibase.svc
     - WIKIBASE_SCHEME=http
     - WDQS_HOST=wdqs.svc
     - WDQS_PORT=9999
    volumes:
      - /etc/localtime:/etc/localtime:ro
  elasticsearch:
    image: wikibase/elasticsearch:6.5.4-extra
    restart: unless-stopped
    networks:
      default:
        aliases:
         - elasticsearch.svc
    environment:
      discovery.type: single-node
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
    volumes:
      - /etc/localtime:/etc/localtime:ro
  # CONFIG - to not load quickstatements, remove this entire section
  quickstatements:
    image: wikibase/quickstatements:latest
    ports:
     - "9191:80"
    depends_on:
     - wikibase
    volumes:
     - quickstatements-data:/quickstatements/data
     - /etc/localtime:/etc/localtime:ro
    networks:
      default:
        aliases:
         - quickstatements.svc
    environment:
      - QS_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:9191
      - WB_PUBLIC_SCHEME_HOST_AND_PORT=http://localhost:8181
      - WIKIBASE_SCHEME_AND_HOST=http://wikibase.svc
      - WB_PROPERTY_NAMESPACE=122
      - "WB_PROPERTY_PREFIX=Property:"
      - WB_ITEM_NAMESPACE=120
      - "WB_ITEM_PREFIX=Item:"

volumes:
  mediawiki-mysql-data:
  mediawiki-images-data:
  query-service-data:
  quickstatements-data:
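Because we suspect a setting in this file, it may help to put the Wikibase-related values of the three WDQS services side by side. The sketch below does only that, using values copied by hand from the environment sections above; whether the differences explain the query behaviour is not established here, and the function name is hypothetical.

```python
# Values copied from the environment sections of docker-compose.yml above.
# None means the variable is not set for that service in the file.
SERVICES = {
    "wdqs-frontend": {"WIKIBASE_HOST": "xxxx.xxxx.xxxx.xxxx.edu", "WIKIBASE_SCHEME": None},
    "wdqs": {"WIKIBASE_HOST": "xxxx.xxxx.xxxx.xxxx.edu", "WIKIBASE_SCHEME": "https"},
    "wdqs-updater": {"WIKIBASE_HOST": "wikibase.svc", "WIKIBASE_SCHEME": "http"},
}


def distinct_settings(services):
    """Map each (scheme, host) pair to the list of services that use it."""
    groups = {}
    for name, env in services.items():
        key = (env["WIKIBASE_SCHEME"], env["WIKIBASE_HOST"])
        groups.setdefault(key, []).append(name)
    return groups


for (scheme, host), names in distinct_settings(SERVICES).items():
    print(f"{scheme or '(unset)'}://{host} <- {', '.join(names)}")
```

The listing shows that the updater is pointed at wikibase.svc over http, while the wdqs and wdqs-frontend services are given the public hostname.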