Tag : elasticsearch-performance
The following points can help improve Elasticsearch performance and let a cluster scale better. Here is the ideal cluster infrastructure, based on my research.
Single URL for search and indexing:- It is good to keep a single URL for both searching and indexing. This can be done with a load balancer. Behind the load balancer, configure the master and non-data nodes as backends. The reason for not putting data nodes behind the load balancer is to avoid unwanted HTTP requests on them: data nodes then never have to serve the incoming HTTP requests for searching and indexing.
So a data node can concentrate on searching its shards or creating indexes, as each request requires.
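As a rough illustration of the single-URL idea, the Python sketch below hands out backend nodes in round-robin order, which is what a load balancer does in its simplest mode. The host names are hypothetical, and a real deployment would use a dedicated load balancer tool rather than this code.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Return an endless iterator over the backend URLs, one after another."""
    if not backends:
        raise ValueError("at least one backend node is required")
    return cycle(backends)


# Hypothetical non-data nodes behind the single search/index URL.
# Data nodes are deliberately left out of this list.
BACKENDS = ["http://es-client-1:9200", "http://es-client-2:9200"]

picker = round_robin(BACKENDS)
first_three = [next(picker) for _ in range(3)]
```

Each incoming request would be forwarded to the next URL from `picker`, so no single non-data node receives all the HTTP traffic.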
Master Node:- For the stability and best performance of an Elasticsearch cluster, and following the Elasticsearch recommendation, we should keep a separate node as the master node. This is done by setting “node.data: false” and “node.master: true” in the config file.
All other nodes should look for this master node by setting up the following config properties…
Keep a couple of non-master, non-data nodes for serving HTTP requests. That will also help if the master node goes down.
Data Nodes: – Data nodes are dedicated to serving search requests from their shards, sending the results back, and creating new indexes on the cluster. Compared with the master and non-data nodes, they therefore need more RAM and processing power.
Because data nodes hold the data, size their disks according to your data volume.
An important question is: how many data nodes should I keep?
Answer: – For now I can say this: if you have less than 500GB of data, 5 shards per index, and a good amount of search and indexing requests, you need 5 data nodes for balanced performance. If you keep 3 or 4 data nodes, then 2 or 1 of them respectively will be allocated 2 shards, so the shard distribution will not be even. It leads to ….
If the data size is less than 150GB, it will not matter much. Note: I have tested this on CentOS machines with 8 cores and 16 GB of RAM; I will come up with actual numbers and stats in Part-II.
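The shard-distribution argument above can be checked with a few lines of Python. This is only a sketch: it assumes shards are spread as evenly as possible across data nodes and it ignores replicas, and the function name is made up for illustration.

```python
from typing import List


def shards_per_node(num_shards: int, num_nodes: int) -> List[int]:
    """Spread shards over nodes as evenly as possible (replicas ignored)."""
    if num_nodes < 1:
        raise ValueError("need at least one data node")
    base, extra = divmod(num_shards, num_nodes)
    # The first `extra` nodes each carry one shard more than the rest.
    return [base + 1 if i < extra else base for i in range(num_nodes)]


for nodes in (3, 4, 5):
    print(nodes, "data nodes ->", shards_per_node(5, nodes))
```

With 5 shards, only the 5-node layout gives every data node the same number of shards; with 3 nodes two of them hold 2 shards, and with 4 nodes one of them does.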
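Memory management :- For now keep it simple: give 50% of the RAM to the Elasticsearch JVM heap and leave the other 50% to the operating system, whose file system cache Elasticsearch also relies on.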
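As a small worked example of the 50/50 rule, the sketch below splits a machine's RAM in two. It assumes the rule means half for the JVM heap and the rest left to the operating system; the function name and the `heap_fraction` parameter are illustrative only.

```python
from typing import Tuple


def split_memory(total_ram_gb: float, heap_fraction: float = 0.5) -> Tuple[float, float]:
    """Return (heap_gb, remaining_gb) for a given amount of RAM."""
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_gb = total_ram_gb * heap_fraction
    return heap_gb, total_ram_gb - heap_gb


# The 16 GB test machines mentioned in this post.
heap_gb, remaining_gb = split_memory(16)
```

For the 16 GB machines this gives an 8 GB heap and leaves 8 GB outside the JVM.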
Elasticsearch config properties :-
Monitoring tools :-
The following are good monitoring tools that I liked and found very helpful.
1) Elastic-hq – http://www.elastichq.org/
2) Elastic head – https://github.com/mobz/elasticsearch-head
Coming soon … Part – II
I have faced a major issue in Elasticsearch: after some time, my cluster automatically ended up with more than one node acting as master. Because of that, it showed two sets of nodes in one cluster. It affects the following …
You might face the same issue, and it is normal; maybe it is due to …
It is very easy to fix this issue. A master node maintains the cluster and passes indexing or search requests to data nodes, while a data node stores data: when it receives a request from a client, it searches its shards or creates an index. If we ask one node to do both jobs, it becomes difficult to manage: the master node has to maintain the cluster as well as search and index data, which causes performance issues. The best solution is to keep the roles separate. Keep only one master node in your cluster and configure all nodes to look to that same master for the cluster state. For failover, you might keep one extra master node as disaster recovery.
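One thing to watch when adding a failover master: the usual guideline for zen discovery is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes, that is (N / 2) + 1. The sketch below only computes that number; the function name is made up, and the guideline goes beyond the settings used in this post.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority of master-eligible nodes: (N / 2) + 1 with integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3):
    print(n, "master-eligible node(s) ->", minimum_master_nodes(n))
```

With a single master the result is 1. With one extra master it becomes 2, so under this guideline three master-eligible nodes is the smallest setup that can lose one of them and still elect a master.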
The following settings keep the master and data roles separate.
Master node:
node.master: true
node.data: false
transport.tcp.compress: true
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 15s
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["master node"]

Data nodes:
node.master: false
node.data: true
transport.tcp.compress: true
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 15s
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["master node"]
Change the config of all nodes accordingly and restart them.
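Since the per-role settings differ only in node.master and node.data, they could be generated instead of edited by hand. The Python sketch below is one possible way to do that; the function name and the "client" role label (for the non-master, non-data nodes that serve HTTP requests) are my own, and the remaining values are copied from the settings above.

```python
from typing import List

# role -> (node.master, node.data)
ROLES = {
    "master": (True, False),
    "data": (False, True),
    "client": (False, False),  # non-master, non-data node for HTTP requests
}


def render_node_config(role: str, master_hosts: List[str]) -> str:
    """Build the config lines for one node role as a single string."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    is_master, is_data = ROLES[role]
    hosts = ", ".join(f'"{h}"' for h in master_hosts)
    lines = [
        f"node.master: {str(is_master).lower()}",
        f"node.data: {str(is_data).lower()}",
        "transport.tcp.compress: true",
        "discovery.zen.minimum_master_nodes: 1",
        "discovery.zen.ping.timeout: 15s",
        "discovery.zen.ping.multicast.enabled: false",
        f"discovery.zen.ping.unicast.hosts: [{hosts}]",
    ]
    return "\n".join(lines)


print(render_node_config("data", ["master node"]))
```

The printed text can be pasted into the config file of each data node; calling the function with "master" or "client" gives the matching settings for the other roles.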