If you use Elasticsearch mainly for front-facing client search, there are three important metrics for monitoring search performance: query load, query latency, and fetch latency.
Your cluster may be handling any number of queries at a given time. The volume of queries over time roughly tracks the request load placed on the cluster. Unexpected peaks and valleys in a time series of query load can signal a problem or an opportunity for optimization.
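As a rough illustration of turning query volume into a time series, the sketch below converts two samples of a cumulative query counter into a queries-per-second rate. It assumes you are polling the `query_total` counter from the `search` section of Elasticsearch's stats output; the function name, the 60-second interval, and the sample values are made up for the example.

```python
def query_rate(prev_total, curr_total, interval_seconds):
    """Return queries per second between two samples of a cumulative counter.

    Returns None when the counter went backwards, which typically means the
    node restarted and the counter was reset between the two samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        return None  # counter reset; skip this data point
    return delta / interval_seconds


# Hypothetical samples of query_total taken 60 seconds apart.
rate = query_rate(prev_total=1000, curr_total=1600, interval_seconds=60)
print(rate)  # 600 new queries over 60 seconds
```

Plotting this rate at a fixed interval gives the query-load time series in which unexpected peaks and valleys stand out.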
Average query latency shows how your available resources are performing under your current conditions. Elasticsearch reports the cumulative number of queries and the cumulative time spent on them, so sample both at regular intervals and divide the change in total time by the change in query count. Establish a ceiling: if query latency rises above that maximum, it may point to resource strain or an opportunity for optimization.
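The calculation can be sketched as follows, assuming each sample is a dictionary holding the `query_total` and `query_time_in_millis` counters from the `search` section of the stats output. The helper names, the 50 ms ceiling, and the sample values are hypothetical and only meant to show the arithmetic.

```python
LATENCY_CEILING_MS = 50.0  # hypothetical ceiling; choose one that fits your workload


def avg_query_latency_ms(prev, curr):
    """Average latency (ms per query) between two samples of cumulative counters."""
    d_count = curr["query_total"] - prev["query_total"]
    d_time = curr["query_time_in_millis"] - prev["query_time_in_millis"]
    if d_count <= 0 or d_time < 0:
        return None  # no queries in the interval, or the counters were reset
    return d_time / d_count


def breaches_ceiling(latency_ms, ceiling_ms=LATENCY_CEILING_MS):
    """True when a latency value was computed and exceeds the ceiling."""
    return latency_ms is not None and latency_ms > ceiling_ms


# Hypothetical samples taken one interval apart.
prev = {"query_total": 12000, "query_time_in_millis": 240000}
curr = {"query_total": 12500, "query_time_in_millis": 270000}

latency = avg_query_latency_ms(prev, curr)  # 30000 ms spread over 500 queries
print(latency, breaches_ceiling(latency))
```

Working from deltas rather than lifetime totals keeps the average sensitive to recent behavior; dividing the raw totals would smooth a new slowdown into the whole history of the node.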
Fetch is the second phase of Elasticsearch’s search process: it follows the query phase and delivers the requested data. Fetch latency should be considerably lower than your query latency, and under normal conditions it stays roughly constant. If fetch latency begins to rise, there are likely issues developing within your resources.
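One simple way to notice fetch latency drifting upward is to compare the latest value against a baseline of recent values. The sketch below assumes samples containing the `fetch_total` and `fetch_time_in_millis` counters from the `search` section of the stats output; the 1.5x tolerance, the function names, and the sample numbers are illustrative assumptions, not recommended settings.

```python
from statistics import mean


def avg_fetch_latency_ms(prev, curr):
    """Average fetch latency (ms per fetch) between two counter samples."""
    d_count = curr["fetch_total"] - prev["fetch_total"]
    d_time = curr["fetch_time_in_millis"] - prev["fetch_time_in_millis"]
    if d_count <= 0 or d_time < 0:
        return None  # no fetches in the interval, or the counters were reset
    return d_time / d_count


def is_rising(history, latest, tolerance=1.5):
    """True when latest exceeds the mean of history by the tolerance factor.

    tolerance is a hypothetical knob: 1.5 means "50% above the recent baseline".
    """
    if not history or latest is None:
        return False
    return latest > tolerance * mean(history)


# Hypothetical baseline of recent fetch latencies, in milliseconds.
history = [2.0, 2.1, 1.9, 2.0]

prev = {"fetch_total": 8000, "fetch_time_in_millis": 16000}
curr = {"fetch_total": 8500, "fetch_time_in_millis": 18000}

latest = avg_fetch_latency_ms(prev, curr)  # 2000 ms spread over 500 fetches
print(latest, is_rising(history, latest))
```

A relative check like this suits fetch latency because the expected behavior is a flat line, so any sustained departure from the baseline is worth a look.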