A lot happens when a user runs a search query against Elasticsearch. We touched on the mechanics earlier, so let’s recap what we learned. The figure below shows how the engine carries out a search behind the scenes.
When a user or client issues a search request, it is routed to one of the available nodes in the cluster. Every node is, by default, assigned the coordinator role, which makes every node eligible to pick up client requests on a round-robin basis. Once the request reaches the coordinator node, that node determines which nodes hold the shards containing the relevant documents.
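As a rough illustration of round-robin selection, the sketch below cycles through a list of nodes and hands each incoming request to the next one. The node names and the `round_robin` helper are hypothetical, and this is not how any particular Elasticsearch client is implemented; it only shows the idea that every node takes its turn as coordinator.

```python
from itertools import cycle

# Hypothetical node names; a real client would be configured with actual host addresses.
nodes = ["node-a", "node-b", "node-c", "node-d"]

def round_robin(hosts):
    """Return an iterator that yields hosts in a repeating cycle, one per request."""
    return cycle(hosts)

picker = round_robin(nodes)

# Simulate six incoming requests: each one is handed to the next node in turn.
coordinators = [next(picker) for _ in range(6)]
print(coordinators)
```

After the fourth request the cycle wraps around, so the fifth request lands on the first node again.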
In the figure above, Node A is the coordinator node: it receives the request from the client. It was chosen as the coordinator for no reason other than demonstration purposes. Once it takes on the active coordinator role, it selects a replication group: a set of shards and replicas, located on individual nodes in the cluster, that hold the data. Remember, an index is made of shards, and each of these shards can exist independently on other nodes. In our example, the index is made of four shards: shards 1 to 4 exist on Nodes A to D, respectively.
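The selection step can be pictured as choosing one copy of every shard, so that each shard is searched exactly once. The sketch below is a simplified model, not Elasticsearch's actual selection logic: the replica placement in `shard_copies` is an assumption added for illustration (the example above only says where shards 1 to 4 live), and the random choice is a stand-in for whatever strategy the engine really applies.

```python
import random

# Hypothetical layout: shard id -> nodes holding a copy of that shard.
# The first entry matches the example (shards 1-4 on Nodes A-D);
# the second entry is an assumed replica location.
shard_copies = {
    1: ["node-a", "node-c"],
    2: ["node-b", "node-d"],
    3: ["node-c", "node-a"],
    4: ["node-d", "node-b"],
}

def select_copies(copies, rng):
    """Pick one copy of every shard so that each shard is searched exactly once."""
    return {shard: rng.choice(holders) for shard, holders in copies.items()}

# A seeded generator keeps the illustration repeatable.
targets = select_copies(shard_copies, random.Random(0))
print(targets)
```

Whichever copies are picked, the result always covers all four shards, and a node that holds no selected copy simply takes no part in this particular search.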
Node A then formulates the query request and sends it to the other nodes, asking them to carry out the search. Upon receiving the request, each node runs the search on its shard, extracts the top set of results, and returns them to the active coordinator. The active coordinator then merges and sorts the data before sending it to the client as the final result.
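The scatter-and-gather flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not the engine's implementation: the scores and document IDs in `shard_hits` are made up, and `shard_top_k` and `coordinate` are hypothetical helper names.

```python
import heapq
from itertools import islice

# Hypothetical per-shard hits as (score, doc_id) pairs; the values are invented.
shard_hits = {
    1: [(0.91, "a1"), (0.40, "a2"), (0.12, "a3")],
    2: [(0.88, "b1"), (0.75, "b2")],
    3: [(0.30, "c1")],
    4: [(0.95, "d1"), (0.60, "d2"), (0.05, "d3")],
}

def shard_top_k(hits, k):
    """Each shard returns only its k best hits, sorted by descending score."""
    return heapq.nlargest(k, hits)

def coordinate(shards, k):
    """Merge the per-shard top-k lists and keep the global top k."""
    partials = [shard_top_k(hits, k) for hits in shards.values()]
    # Every partial list is already sorted in descending order,
    # so a reverse merge yields one globally sorted stream.
    merged = heapq.merge(*partials, reverse=True)
    return list(islice(merged, k))

top = coordinate(shard_hits, k=3)
print(top)  # [(0.95, 'd1'), (0.91, 'a1'), (0.88, 'b1')]
```

Each shard only needs to return its own top k hits because any document in the global top k must also be in the top k of the shard that holds it. That keeps the data sent back to the coordinator small.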
If the coordinator also has the data node role, it digs into its own store to fetch results as well. Not every node that receives the request is necessarily a data node. Similarly, not every node is expected to be part of the replication group for this search query.
These short articles are condensed excerpts taken from my book Elasticsearch in Action, Second Edition. The code is available in my GitHub repository.