Vespa counts the number of times a word appears in a string

Question

Asked by igor tarashchuk on 22 September 2023 at 14:44 (79 views)

I am trying to find out how to count the number of times a searched word appears in a string. For example, I have a schema like this:

schema search {
    document search {
        field Id type string {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Name type string {
            indexing: summary | attribute
        }
        field NameArray type array<string> {
            indexing: summary | attribute
        }
        field NameLength type int {
            indexing: summary | attribute
        }
    }

    fieldset default {
        fields: Id, Name
    }

    rank-profile default {
        first-phase {
            expression: nativeRank(Id, Name)
        }
    }

    rank-profile searchByName {
        first-phase {
            expression: matchCount(Name)
        }
    }

    rank-profile searchByName1 {
        first-phase {
            expression: matchCount(NameArray)
        }
    }
}

Example document:

{
    "id": "id:search:search::AjSjRtrcoklrHHb",
    "relevance": 1,
    "source": "search",
    "fields": {
        "sddocname": "search",
        "documentid": "id:search:search::AjSjRtrcoklrHHb",
        "Id": "AjSjRtrcoklrHHb",
        "Name": "Test Cat сat Cat",
        "NameArray": [
            "Test",
            "Cat",
            "сat",
            "Cat"
        ],
        "NameLength": 16
    }
}

Example query:

{
    "hits": 150,
    "ranking": {
        "profile": "searchByName"
    },
    "offset": 0,
    "yql": "select * from search where Name matches '(?i).*сat.*'"
}

When I send this request to Vespa, the value of matchCount (shown in relevance) is always 1. The result is the same for both the searchByName and searchByName1 rank profiles. How can I count the number of occurrences of the keyword "cat" in Name, or in the NameArray array? I would also like to count occurrences when the keyword is only part of a word, such as "ca".

I also tried using an index on the Name field, along with the textSimilarity(Name).queryCoverage and textSimilarity(Name).fieldCoverage features, for this.

**Jon** · Answer 1 · 2023-09-22T15:25:48.143000

Rank features such as matchCount operate on tokens. Here the fields are attributes, not (text) indexes, so each field holds a single value that is not split into tokens.

You are doing regex matching on that value rather than an exact match, but Vespa still counts it as a single match whenever the regex matches.
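
The difference between the two behaviours can be sketched in plain Python. This is only an illustration of the idea, not Vespa code: the helper names `regex_match_count` and `token_occurrences` are hypothetical, the sample value is an ASCII-only variant of the Name field, and the lowercase whitespace split only approximates what a real text index does.

```python
import re


def regex_match_count(value: str, pattern: str) -> int:
    """Attribute-style matching: the whole value is one unit,
    so a regex either matches it (1) or does not (0)."""
    return 1 if re.search(pattern, value) else 0


def token_occurrences(value: str, word: str) -> int:
    """Token-style counting: split the value into lowercase tokens
    and count the tokens equal to the searched word."""
    tokens = value.lower().split()
    return tokens.count(word.lower())


# ASCII-only sample value (an assumption made for this illustration)
name = "Test Cat cat Cat"

print(regex_match_count(name, r"(?i).*cat.*"))  # one match for the whole value
print(token_occurrences(name, "cat"))           # one count per matching token
print(token_occurrences(name, "ca"))            # a partial word matches no whole token
```

In this sketch the regex call reports a single match no matter how often the word occurs, while the token-based count sees each occurrence separately. A partial keyword such as "ca" matches no whole token, so counting it would need a different approach, such as prefix matching.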

It looks like you could use a text index instead and query with "contains 'cat'".
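
As a rough sketch of that suggestion, the request body could be built as below. It assumes the Name field has been changed to a text index (for example `indexing: index | summary` in the schema), and the helper name `build_contains_query` is hypothetical. The body mirrors the request from the question with `matches` swapped for `contains`. Whether a given rank feature then reports the number of occurrences is not something this sketch verifies.

```python
import json


def build_contains_query(word: str, profile: str, hits: int = 150) -> dict:
    """Build a query body that matches a whole token with `contains`.

    Assumes `word` contains no quote characters, so no YQL escaping is done.
    """
    return {
        "hits": hits,
        "ranking": {"profile": profile},
        "offset": 0,
        "yql": f"select * from search where Name contains '{word}'",
    }


body = build_contains_query("cat", "searchByName")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent to the same query endpoint as the original request.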
