How do _search queries work in Elasticsearch?


More specifically: "How do Elasticsearch nodes interact to produce a search result, and what is the flow of a search request?"

I've read the following links, but they don't make clear what I am trying to understand.

  1. https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest.html
  2. https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html


As per the above documentation,

  1. "Data nodes" are the ones that perform all the processing when an _search query is invoked.
  2. "Ingest nodes" do some pre-processing before the data is indexed.

Are the two statements above correct? If so:

  1. Do Ingest nodes have any role to perform when an _search query happens?
  2. Do Data Nodes have any role to perform when data is being indexed?
  3. Do any other nodes have any role to perform when data is being searched?

Or if you could help explain the flow of a search request (which node receives the API call, which node filters the data, which node runs the aggregations, etc.), then that would be really helpful.

In case it is relevant, I am on Elasticsearch 7.5.

Best answer
  1. Do Ingest nodes have any role to perform when an _search query happens? If it's a dedicated ingest node, then no, unless the search request is sent to it directly, in which case it acts as the coordinating node like any other node would. If it also holds data (primary or replica shards), then yes.

  2. Do Data Nodes have any role to perform when data is being indexed? Yes. Data nodes hold the data (primary and replica shards), so they are ultimately responsible for both indexing and searching it.

  3. Do any other nodes have any role to perform when data is being searched? Yes; see the responsibilities of the coordinating role in Elasticsearch.

In short, ingest nodes only transform the data before it is indexed, while data nodes actually hold the data. Each role can be given to a dedicated node or shared with other roles on the same node.
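
To check which roles each node in a cluster actually has, one option is to read the node.role column returned by GET _cat/nodes. The sketch below only decodes such a value. The letter codes (m, d, i, and - for coordinating only) are my assumption of what 7.x prints, and the function names are made up for illustration.

```python
from typing import List

# Assumed letter codes for the node.role column of GET _cat/nodes in 7.x.
# Any other letters are ignored by this sketch.
ROLE_LETTERS = {"m": "master-eligible", "d": "data", "i": "ingest"}


def describe_roles(role_column: str) -> List[str]:
    """Expand a value such as "dim" into role names ("-" means coordinating only)."""
    roles = [ROLE_LETTERS[c] for c in role_column if c in ROLE_LETTERS]
    # Every node can coordinate requests, whatever other roles it has.
    return roles + ["coordinating"]


def holds_shards(role_column: str) -> bool:
    """Only nodes with the data role hold shards and run the shard-level search."""
    return "d" in role_column


print(describe_roles("dim"))
print(holds_shards("i"))
```

A dedicated ingest node would show up as "i" here, so holds_shards returns False for it, which matches the answer to the first question.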

Below are the steps in a search request:

  1. The coordinating node receives the request. This can be a dedicated coordinating-only node, but by default every node, including data nodes, can do this work.
  2. The coordinating node forwards the request to the data nodes that hold the shards (primary or replica) relevant to your search request.
  3. The data nodes run the search locally on their shards and send the results back to the coordinating node.
  4. The coordinating node merges the results from all shards, keeps the overall top hits (10 by default, controlled by the size parameter), and sends back the response.
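
The steps above can be sketched as a small in-memory simulation. This is not Elasticsearch code: the CLUSTER layout, node names, scores, and both function names are hypothetical, and real shards return richer results than (score, id) pairs. It only illustrates the scatter-gather idea of local top hits being merged by the coordinating node.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, document id)

# Hypothetical layout: node name -> shard name -> scored hits for one query.
CLUSTER: Dict[str, Dict[str, List[Hit]]] = {
    "data-node-1": {"shard-0": [(0.9, "doc-a"), (0.4, "doc-b")]},
    "data-node-2": {"shard-1": [(0.8, "doc-c"), (0.7, "doc-d"), (0.1, "doc-e")]},
}


def local_search(node: str, size: int) -> List[Hit]:
    """Step 3: a data node searches its shards and returns each shard's top hits."""
    hits: List[Hit] = []
    for shard_hits in CLUSTER[node].values():
        hits.extend(heapq.nlargest(size, shard_hits))
    return hits


def coordinate_search(size: int = 10) -> List[Hit]:
    """Steps 1, 2 and 4: receive the request, fan out, merge the partial results."""
    partial: List[Hit] = []
    for node in CLUSTER:  # step 2: forward to every node holding a relevant shard
        partial.extend(local_search(node, size))
    return heapq.nlargest(size, partial)  # step 4: overall top hits


print(coordinate_search(size=3))
# [(0.9, 'doc-a'), (0.8, 'doc-c'), (0.7, 'doc-d')]
```

Because each shard returns at most size hits, the coordinating node never has to look at more than size times the number of shards before picking the final top hits.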