Prioritized search results?

We've been using StormCrawler with Elasticsearch to index our own websites for a couple of years. I was wondering whether we can tweak the search results so that certain pages come up at the top. For example, a specific search keyword would bring a particular page to the top of the results instead of leaving it further down the list. The keywords metadata field in the HTML page seems like the place to do this, but it appears StormCrawler ignores it when prioritizing results. Any ideas are appreciated.

Thanks.

Edited: The search is on the content field in Elasticsearch:

http://elasticserver:9200/_search?q=content:covid

It would be nice to also query on a keywords field perhaps.

The standard content view is used for the most part:

curl $ESCREDENTIALS -s -XPUT $ESHOST/content -H 'Content-Type: application/json' -d '
{
    "settings": {
            "index": {
                    "number_of_shards": 5,
                    "number_of_replicas": 0,
                    "refresh_interval": "60s"
            }
    },
    "mappings": {
                    "_source": {
                            "enabled": true
                    },
                    "properties": {
                            "content": {
                                    "type": "text",
                                    "index": "true",
                                    "store": true
                            },
                            "host": {
                                    "type": "keyword",
                                    "index": "true",
                                    "store": true
                            },
                            "title": {
                                    "type": "text",
                                    "index": "true",
                                    "store": true
                            },
                            "url": {
                                    "type": "keyword",
                                    "index": "false",
                                    "store": true
                            },
                            "collections": {
                                    "type": "keyword",
                                    "index": "true",
                                    "store": true
                            },
                            "last_modified": {
                                    "type": "date",
                                    "index": "false",
                                    "store": true
                            },
                            "content_length": {
                                    "type": "integer",
                                    "index": "false",
                                    "store": true
                            }
                        }
    }

}'


> stormcrawler ignores it for prioritizing results

SC does not handle the search; that part is entirely yours to manage in any way you see fit. SC populates the content index, and you can then query it to your heart's content. Want to use keywords? Sure, query the index with a boolean query that contains a keywords:* clause in addition to the one on content.
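As a rough sketch of such a boolean query: the `must` clause keeps the existing match on `content`, and a boosted `should` clause on `keywords` pushes pages whose keywords match the term higher in the ranking. This assumes a `keywords` field has been added to the index and populated from the page metadata, which the mapping in the question does not do yet. The helper names `build_query` and `search` and the default boost of 5 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a bool query body for a search term.
# Assumes the index has a "keywords" field (not in the mapping shown above).
# Note: the term is pasted into the JSON as-is, so it must not contain quotes.
build_query() {
  term=$1
  boost=${2:-5}   # illustrative default; tune it against your own results
  cat <<EOF
{
  "query": {
    "bool": {
      "must":   [ { "match": { "content": "$term" } } ],
      "should": [ { "match": { "keywords": { "query": "$term", "boost": $boost } } } ]
    }
  }
}
EOF
}

# Hypothetical helper: send the query to the content index.
# ESCREDENTIALS and ESHOST are the same variables used in the question.
search() {
  build_query "$@" | curl $ESCREDENTIALS -s -XPOST "$ESHOST/content/_search" \
    -H 'Content-Type: application/json' -d @-
}
```

With these functions loaded, `search covid` would run the query with the default boost and `search covid 10` with a stronger one. Documents still have to match on `content` to be returned; the `keywords` clause only changes their order.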

Maybe you use Kibana to display the results? It is useful for debugging, but most people query ES and display the results from their own UI, building the queries so that they cover all the fields they have.