SPARQL: Counting the number of occurrences of a specific string/character/diacritic in text using owlready2


I have an ontology in an OWL file (quran_data_full.owl) that I saved in a folder on my Google Drive (Quran Corpus). To run some queries against this ontology, I first tried the query on Apache Jena Fuseki, where it gave the correct results. I then copied the query into my code in Google Colab, but there it raises an error because one function used in the query is not supported by owlready2.

The query searches for a verse that contains a specific text and counts the occurrences of a given character (or even a diacritic) in that verse.

This is my code in Google Colab:

from owlready2 import *
onto_path.append("/gdrive/MyDrive/Quran Corpus")
go = get_ontology("/gdrive/MyDrive/Quran Corpus/quran_data_full.owl").load()
obo = get_namespace("/gdrive/MyDrive/Quran Corpus/")
d = list(default_world.sparql("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX qur: <http://quranontology.com/Resource/>

SELECT ?textSimple ?cnt
WHERE {
?verse rdf:type qur:Verse.
?verse qur:DisplayText ?o.
?verse rdfs:label ?textSimple.
    
FILTER ((REGEX(STR(?textSimple), "قل هو الله أحد" ,"i"))).    
  filter(regex( str(?textSimple) ,"ه")) 
  bind(
    strlen(
        replace(
            replace(str(?textSimple), "ه", "#")
            , "[^#]", ""
        )
    ) as ?cnt) 
}
"""))

This is the output:

OperationalError: user-defined function raised exception
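
For reference, the counting idea behind the BIND expression in the query above can be sketched in plain Python. This is only an illustration of the technique (mark every occurrence with "#", strip everything that is not "#", take the length); the function name `count_via_replace` is made up for this sketch and is not part of owlready2 or SPARQL.

```python
import re


def count_via_replace(text: str, target: str) -> int:
    """Count occurrences of target the same way the SPARQL BIND does."""
    # Step 1: replace every occurrence of the target with a marker character.
    marked = text.replace(target, "#")
    # Step 2: delete every character that is not the marker (regex step).
    only_markers = re.sub(r"[^#]", "", marked)
    # Step 3: the remaining length is the number of occurrences.
    # Caveat: a "#" already present in the text would be counted as well.
    return len(only_markers)


verse = "قل هو الله أحد"
print(count_via_replace(verse, "ه"))
```

Step 2 is the part that needs a regex-capable replace, which is where the two engines may behave differently.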

I read the documentation here about the replace function. It says the following:

The following functions are supported by Owlready, but not standard: The SIMPLEREPLACE(a, b) function is a version of REPLACE() that does not support Regex. It works like Python or SQLite3 replace, and has better performances.

What are the possible alternatives for counting every occurrence of a character or symbol in owlready2 without using replace?
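
One possible workaround, offered as a sketch rather than a confirmed owlready2 solution: drop the BIND from the query, select only ?textSimple, and do the counting in Python on the returned strings. The helper names below (`count_occurrences`, `count_by_length_difference`, `count_diacritics`) and the `rows` list are assumptions made for illustration; `rows` stands in for what `list(default_world.sparql(...))` would return.

```python
import unicodedata


def count_occurrences(text: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle in text (no regex)."""
    return text.count(needle)


def count_by_length_difference(text: str, needle: str) -> int:
    """Same count via a plain (non-regex) replace: remove needle, compare lengths."""
    if not needle:
        raise ValueError("needle must be non-empty")
    return (len(text) - len(text.replace(needle, ""))) // len(needle)


def count_diacritics(text: str) -> int:
    """Count combining marks (such as Arabic harakat) via Unicode combining classes."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


# Stand-in for the rows of a simplified query that selects only ?textSimple.
rows = [["قل هو الله أحد"]]
for (text_simple,) in rows:
    print(text_simple, count_occurrences(text_simple, "ه"))
```

The length-difference variant needs only a literal replace, so the same idea might carry over to a non-regex replace function inside the query itself; that has not been verified here.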

This is the result from running the query on Apache Jena Fuseki:

[image: query result in Apache Jena Fuseki]

You can see the full content of the quran_data_full.owl file here.
