RDF4J SAIL API implementation

I am trying to build a federated RDF application based on RDF4J and FedX. I need to be able to:

  1. Optimize the query plan and the join strategies.
  2. Expose different, heterogeneous databases (a time-series or a relational DB, for example) in a federated fashion.

I have gone through some of the RDF4J documentation and have a basic grasp of it, but I still have a few questions:

  1. Is there any documentation that explains how to implement the SAIL API? I tried to debug and follow the flow of execution of an example query using an RDF memory store and I got lost.
  2. Suppose I want to expose a relational database in my datacenter. Should I implement a SPARQL repository or an HTTP repository? Should I implement the SAIL API in any case?
  3. Concerning FedX, how can I use the SERVICE and VALUES keywords as defined in SPARQL 1.1 Federated Query? How can I change the join strategies, or the query plan?

I know that I could answer this by diving deeply into the code, but I wonder whether someone has already exposed some kind of database through the RDF4J API, or has worked on tuning RDF4J.

Thanks to you all!

There is 1 best solution below

Is there any documentation that explains how to implement the SAIL API? I tried to debug and follow the flow of execution of an example query using an RDF memory store and I got lost.

There is a basic design draft but it's incomplete. A more comprehensive HowTo has been in the planning for a while but it never quite gets the priority it needs.

That said, I don't think you need to implement your own SAIL for what you have in mind. There are plenty of existing implementations that can do what you need.

Suppose I want to expose a relational database in my datacenter. Should I implement a SPARQL repository or an HTTP repository?

I don't understand the question. HTTPRepository is a client-side proxy for an RDF4J Server. SPARQLRepository is a client-side proxy for a (non-RDF4J) SPARQL endpoint. Neither has anything to do with relational databases.

Should I implement the SAIL API in any case?

It depends on your use case, but I doubt it, at least not right at the outset. I'd probably use an existing R2RML library that is compatible with RDF4J, such as the R2RML API or CARML. Either a live mapping or an offline batch mapping between the relational data and your triplestore may solve your problem.

Concerning FedX, how can I use the SERVICE and VALUES keywords as defined in SPARQL 1.1 Federated Query?

You don't need to "make it possible" to do that; FedX supports this out of the box.
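To make that concrete, here is a minimal sketch of what such a query could look like, written as plain Java that only assembles the SPARQL text. Everything specific in it is an assumption made up for illustration: the class and method names, the endpoint URL, the `ex:` prefix, the `ex:hasReading` property and the sensor IRIs. Nothing here comes from FedX itself.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FederatedQuerySketch {

    // Builds a SPARQL 1.1 query string that binds ?sensor through a VALUES
    // block and asks a remote endpoint for readings through a SERVICE clause.
    // serviceUrl and sensorIris are hypothetical inputs for illustration.
    static String buildQuery(String serviceUrl, List<String> sensorIris) {
        // Wrap each IRI in angle brackets, as SPARQL syntax requires.
        String values = sensorIris.stream()
                .map(iri -> "<" + iri + ">")
                .collect(Collectors.joining(" "));

        return String.join("\n",
                "PREFIX ex: <http://example.org/>",
                "SELECT ?sensor ?value WHERE {",
                "  VALUES ?sensor { " + values + " }",
                "  SERVICE <" + serviceUrl + "> {",
                "    ?sensor ex:hasReading ?value .",
                "  }",
                "}");
    }

    public static void main(String[] args) {
        String query = buildQuery(
                "http://example.org/timeseries/sparql",
                List.of("http://example.org/sensor/1",
                        "http://example.org/sensor/2"));
        // Print the assembled query text; no endpoint is contacted here.
        System.out.println(query);
    }
}
```

The resulting string would then be handed to whatever query API you use, for example `RepositoryConnection.prepareTupleQuery` in RDF4J; that call is left out so the sketch stays free of dependencies.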

How can I change the join strategies, or the query plan?

You can't (at least not easily), nor should you want to. Quite a lot of research and development went into RDF4J's and FedX's query planning strategies. I'm not saying either is perfect, but you're unlikely to do better.