Virtuoso 42000 Error D1CTX after using curl to retrieve RDF data from dbpedia.org

I am new to the Semantic Web, RDF, and data retrieval using curl. I am trying to use curl from macOS Terminal.app (macOS Ventura 13.3) to retrieve RDF data in XML format from <http://dbpedia.org/>.

For example:

curl -o Paris-rdf.xml -L -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Paris

This is part of a pre-recorded course exercise, and I noticed that none of the 30 students in the course reported a problem.

When I opened the XML output file in a text editor (Sublime Text), it contained only this error message:

Virtuoso 42000 Error D1CTX: Hash dictionary is full, exceeded 10000 entries

I was surprised for two reasons:

First, I successfully retrieved the data using curl when requesting HTML format on this resource, <http://dbpedia.org/resource/Paris>:

curl -o Paris.html -L -H "Accept: text/html" http://dbpedia.org/resource/Paris

I could also retrieve data in RDF/XML format for other resources.

Second, I understand that Virtuoso provides database functionality: it manages RDF data and supports the SPARQL Query Language, Query Protocol, and XML Query Results Serialization. If Virtuoso runs on the server side of dbpedia.org, it's not clear to me what I can configure as a client to have more success with data retrieval using curl.

Does anyone have an idea why a Virtuoso error is occurring? Any ideas of what steps I should take?

Many thanks for tips or explanation orienting me on any of the topics mentioned.

TallTed

Tell your instructor you should get bonus points for checking the content of the retrieved file, as none of your classmates got the desired data, either.

We (OpenLink Software) recently added a setting to Virtuoso that limits the number of rows returned by CONSTRUCT queries; previously this was a built-in limit.

It turns out, the default value for the new setting was too small for some of the articles in DBpedia, so we've increased the setting value in the DBpedia virtuoso.ini to:

[SPARQL]
...
MaxConstructTriples   = 100000

If you repeat your actions, you should find that you get the expected RDF/XML document.
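
A quick way to confirm that the downloaded file holds RDF/XML rather than an error message is sketched below. The helper name is_rdfxml is hypothetical (it is not part of curl or Virtuoso), and the check rests on two assumptions: that the RDF/XML output declares its root element with the usual rdf: prefix, and that an error arrives as plain text shaped like the message quoted in the question.

```shell
# Hypothetical helper: succeeds only if the given file looks like an
# RDF/XML document rather than a Virtuoso error message.
is_rdfxml() {
  # Assumption: errors look like "Virtuoso 42000 Error D1CTX: ...".
  if grep -q 'Virtuoso [0-9A-Z]* Error' "$1"; then
    return 1
  fi
  # Assumption: the root element is written with the conventional rdf: prefix.
  grep -q '<rdf:RDF' "$1"
}
```

For example, after re-running the curl command from the question, `is_rdfxml Paris-rdf.xml && echo "looks like RDF/XML"` should print the message only when the file is not an error response.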

ETA: Yes, "Virtuoso is involved with functionality on the server side of dbpedia.org", as described here. There was nothing you could do as a client, but for faster response on future issues like this, they should be raised to either the DBpedia Forum or the OpenLink Community Forum.