I have a question: I am trying to find out how to calculate the number of times a searched word appears in a string. For example, I have a schema like this:
schema search {
    document search {
        field Id type string {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Name type string {
            indexing: summary | attribute
        }
        field NameArray type array<string> {
            indexing: summary | attribute
        }
        field NameLength type int {
            indexing: summary | attribute
        }
    }
    fieldset default {
        fields: Id, Name
    }
    rank-profile default {
        first-phase {
            expression: nativeRank(Id, Name)
        }
    }
    rank-profile searchByName {
        first-phase {
            expression: matchCount(Name)
        }
    }
    rank-profile searchByName1 {
        first-phase {
            expression: matchCount(NameArray)
        }
    }
}
Example document:
{
    "id": "id:search:search::AjSjRtrcoklrHHb",
    "relevance": 1,
    "source": "search",
    "fields": {
        "sddocname": "search",
        "documentid": "id:search:search::AjSjRtrcoklrHHb",
        "Id": "AjSjRtrcoklrHHb",
        "Name": "Test Cat сat Cat",
        "NameArray": [
            "Test",
            "Cat",
            "сat",
            "Cat"
        ],
        "NameLength": 16
    }
}
{
    "hits": 150,
    "ranking": {
        "profile": "searchByName"
    },
    "offset": 0,
    "yql": "select * from search where Name matches '(?i).*сat.*'"
}
When I send this request to Vespa, it always returns matchCount = 1 (in relevance). The result is the same for both the searchByName and searchByName1 rank profiles. How can I calculate the number of occurrences of the keyword "cat" in Name, or by using the NameArray field? Also, can this be calculated when the keyword is only part of a word, such as "ca"?
Try using an index on the Name field, together with the textSimilarity(Name).queryCoverage and textSimilarity(Name).fieldCoverage rank features.
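As a rough sketch of that suggestion, a rank profile along these lines could expose both features. It assumes Name has been changed to `indexing: summary | index`; the profile name nameSimilarity is made up for illustration, and these features measure coverage between query and field rather than a raw occurrence count, so whether they fit depends on what you need.

```
# Hypothetical profile; goes inside "schema search { ... }".
# Assumes Name is a text index: indexing: summary | index
rank-profile nameSimilarity {
    first-phase {
        # Fraction of the field covered by the query terms
        expression: textSimilarity(Name).fieldCoverage
    }
    # Return both features with each hit so they can be inspected
    summary-features {
        textSimilarity(Name).queryCoverage
        textSimilarity(Name).fieldCoverage
    }
}
```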
Rank features such as matchCount operate on tokens. Here the fields are attributes, not (text) indexes, so each field holds a single value that is not split into tokens.
You are doing regex matching on that value rather than an exact match, but Vespa still counts it as a single match if the regex matches.
It looks like you could use a text index instead and query with "contains 'cat'".
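A minimal sketch of the text-index approach might look like the following. The profile name countByName is hypothetical, and using fieldTermMatch(Name,0).occurrences (occurrences of the first query term in the field) is my assumption about which rank feature gives the count you want, not something confirmed above.

```
# Inside "schema search { document search { ... } }":
# index the field as text so the value is tokenized.
field Name type string {
    indexing: summary | index
}

# Inside "schema search { ... }", next to the other rank profiles.
# Hypothetical profile name.
rank-profile countByName {
    first-phase {
        # Number of times query term 0 occurs in Name
        expression: fieldTermMatch(Name,0).occurrences
    }
}
```

The request would then use a YQL such as select * from search where Name contains 'cat' with "ranking": {"profile": "countByName"}. Token matching works on whole words, so this sketch does not cover the partial keyword "ca".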