Elasticsearch 5.5 with Spotinst Elastigroup

aws spot
ec2
elasticsearch
elastigroup

Elasticsearch is an open-source, broadly distributable, readily scalable, enterprise-grade search engine. Accessible through an extensive and elaborate API, Elasticsearch can power extremely fast searches that support your data discovery applications. Given the amount of processing power and memory an Elasticsearch node requires, costs can escalate quickly. Let’s walk through how you can run your nodes safely on EC2 Spot instances using the Spotinst Elastigroup service.

In this tutorial I set up an Elasticsearch 5.5 cluster consisting of a Master node and 3 fault-tolerant Data nodes. The Data nodes are managed by our cluster management service, Elastigroup.


 

Master Node

  • I have selected us-east-1 as the region for this tutorial. In your EC2 console, spin up a new instance using the most recent Amazon Linux AMI and attach a role with the AmazonEC2ReadOnlyAccess policy so that the userdata script can call the EC2 API (describe-instances).
  • I have selected m3.xlarge as the instance type.
  • Click on Advanced Details and enter the Userdata script below:
#!/bin/bash
sudo yum remove java-1.7.0-openjdk -y
sudo yum install java-1.8.0-openjdk -y
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.rpm
sudo rpm -i elasticsearch-5.5.1.rpm
sudo chkconfig --add elasticsearch
PRIVATEIP="$(curl http://169.254.169.254/latest/meta-data/local-ipv4)"
INSTANCEID="$(curl http://169.254.169.254/latest/meta-data/instance-id)"
#ENTER the ELASTICSEARCH AWS REGION BELOW
export AWS_DEFAULT_REGION=us-east-1
if [ "$(aws ec2 describe-instances --instance-ids $INSTANCEID --query 'Reservations[*].Instances[*].[InstanceLifecycle]' --filters "Name=private-ip-address,Values=$PRIVATEIP" --output text)" = "spot" ]
then
 RACK="Spot"
else
 RACK="OD"
fi
sudo sh -c "echo 'ES_JAVA_OPTS=\"-Xms2g -Xmx2g\"' >> /etc/sysconfig/elasticsearch"
sudo sh -c "echo 'MAX_LOCKED_MEMORY=unlimited' >> /etc/sysconfig/elasticsearch"
#To create a dedicated Master-eligible Node
sudo sh -c 'echo "node.master: true
node.data: false
cluster.name : esonaws
bootstrap.memory_lock : true" >> /etc/elasticsearch/elasticsearch.yml'
sudo sh -c "echo 'discovery.zen.ping.unicast.hosts : [\""$PRIVATEIP"\"]' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'network.host : [\"127.0.0.1\",\""$PRIVATEIP"\"]' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'node.attr.rack_id: "$RACK"' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'cluster.routing.allocation.awareness.attributes: rack_id' >> /etc/elasticsearch/elasticsearch.yml"
sudo service elasticsearch start

Note: Make sure that the heap size is no more than half of the selected instance type’s memory, and never more than 32 GB.
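As a sketch of applying that rule automatically, the ES_JAVA_OPTS line in the userdata above could be derived from the instance’s memory instead of hard-coded to 2g (this reads MemTotal from /proc/meminfo and is an illustration, not part of the original script):

```shell
#!/bin/bash
# Take half of total RAM for the Elasticsearch heap, capped at 31 GB to
# stay safely under the 32 GB compressed-oops threshold.
TOTAL_MB=$(awk '/MemTotal/ {print int($2 / 1024)}' /proc/meminfo)
HEAP_MB=$(( TOTAL_MB / 2 ))
CAP_MB=$(( 31 * 1024 ))
if [ "$HEAP_MB" -gt "$CAP_MB" ]; then HEAP_MB=$CAP_MB; fi
echo "ES_JAVA_OPTS=\"-Xms${HEAP_MB}m -Xmx${HEAP_MB}m\""
```

On an m3.xlarge with 15 GiB of RAM this yields a heap of roughly 7.5 GB, matching the half-of-memory rule.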

The node above is a dedicated Master node. To make it additionally behave as a data node, set node.data: true.

  • Create a Security group that will allow TCP 9200 and TCP 9300 for internal traffic and SSH for external management.

Note: Include internal traffic from all the AZs that you may later select for Spot instances in Elastigroup.

  • Launch your instance and SSH into the instance once it is up and running.
  • Use curl to make an API request to check the status of your new master. You should see a status of “green” as you can see below.
[ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "esonaws",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 0,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
  • Make note of the Private IP of the Master you just created; this IP will be used later when creating the Elasticsearch Data nodes.
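To confirm that the rack_id attribute set by the userdata script was applied, you can also query the _cat/nodeattrs API (available in Elasticsearch 5.x); a master launched On-Demand should report rack_id as OD:

```shell
# List custom node attributes across the cluster
curl -s 'localhost:9200/_cat/nodeattrs?v'
```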

 

Data Node

  • Launch a new instance from the same Amazon Linux AMI (as the Master) and attach an EBS data volume (/dev/sdb).
  • The Userdata script for the Data node is shown below. This script formats and mounts your EBS data volume, installs Elasticsearch, and adds an fstab entry so the volume is remounted on reboot.

Note: Be sure to update your userdata script and add the mkfs, mount and the /etc/fstab lines based on the volume you attach to the Data node.

#!/bin/bash
#Update the below lines based on the volumes you attach to the Data node
sudo mkdir /media/elasticsearchvolume
sudo mkfs -t ext4 /dev/xvdb
sudo mount /dev/xvdb /media/elasticsearchvolume/
sudo sh -c "echo '/dev/xvdb /media/elasticsearchvolume ext4 defaults,nofail 0 0' >> /etc/fstab"
###################################################
sudo yum remove java-1.7.0-openjdk -y
sudo yum install java-1.8.0-openjdk -y
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.rpm
sudo rpm -i elasticsearch-5.5.1.rpm
sudo chown elasticsearch: /media/elasticsearchvolume
  • Create an Image of the Data node and note down the Image Id, as this image will be used in Elastigroup. You can terminate this instance, since we only need the Image of the Data node. The image has Elasticsearch 5.5 installed and data volumes mounted.
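If you prefer scripting this step, the AMI can also be created with the AWS CLI (the instance id and names below are placeholders):

```shell
# Create an AMI from the prepared Data node instance
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "elasticsearch-5.5-data-node" \
  --description "Amazon Linux with Elasticsearch 5.5 and data volume fstab entry"
```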

 

Creating the Elasticsearch Cluster in an Elastigroup

  • Open the Spotinst console and browse to Elastigroups. Click on the “Create” button to start the wizard.
  • Enter a name for your Elasticsearch cluster and choose the same region as the Master node you created earlier.
  • In this example, I change the Spot Instance Percentage to 75% and set the Capacity to 3 for Target, Minimum & Maximum. Thus the Elastigroup will launch 2 Spot instances and 1 On-Demand instance.
  • On the Compute page, select the same Region and VPC as your Master. In the Compute tab, under Additional Configurations, attach the same IAM role as the Master.

Note: Using multiple AZs makes the cluster fault tolerant, but latency and cost will be slightly higher since data has to travel across AZs.

  • Under the Compute tab of the Elastigroup, in the Launch Specification, I add the Image Id of the Data node image created above. Instances launched from this image will have Elasticsearch installed and the data volume mounted automatically.
  • Under the same Compute tab, in Additional Configurations, enter the Userdata script below:

Note: Be sure to update the Private IP of the master as noted above.

#!/bin/bash
PRIVATEIP="$(curl http://169.254.169.254/latest/meta-data/local-ipv4)"
INSTANCEID="$(curl http://169.254.169.254/latest/meta-data/instance-id)"
#ENTER THE ELASTICSEARCH AWS REGION BELOW
export AWS_DEFAULT_REGION=us-east-1
if [ "$(aws ec2 describe-instances --instance-ids $INSTANCEID --query 'Reservations[*].Instances[*].[InstanceLifecycle]' --filters "Name=private-ip-address,Values=$PRIVATEIP" --output text)" = "spot" ]
then
 RACK="Spot"
else
 RACK="OD"
fi
sudo sh -c "echo 'ES_JAVA_OPTS=\"-Xms2g -Xmx2g\"' >> /etc/sysconfig/elasticsearch"
sudo sh -c "echo 'MAX_LOCKED_MEMORY=unlimited' >> /etc/sysconfig/elasticsearch"
#To create a dedicated Data Node
sudo sh -c 'echo "node.master: false
node.data: true
cluster.name : esonaws
bootstrap.memory_lock : true" >> /etc/elasticsearch/elasticsearch.yml'
#You will need to type the Private IP of your Master node below
sudo sh -c "echo 'discovery.zen.ping.unicast.hosts : [\"Private IP of the Master node\"]' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'network.host : [\"127.0.0.1\",\""$PRIVATEIP"\"]' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'path.data : /media/elasticsearchvolume' >> /etc/elasticsearch/elasticsearch.yml"
sudo sh -c "echo 'node.attr.rack_id: "$RACK"' >> /etc/elasticsearch/elasticsearch.yml"
sudo chkconfig --add elasticsearch
sudo service elasticsearch start
  • In the Compute tab, under Stateful, select Persist Data Volumes. Under Keep each machine’s data volumes, select ReAttach the data volumes.


  • This automatically re-attaches the data volumes to the new instances spun up after a Spot termination or the failure of a Data node.

 

Running Elasticsearch

  • Now that we have installed and configured everything let’s make sure Elasticsearch is up and running and our new nodes are healthy. Run the same API request as we did earlier to check the status of our cluster. You should now see three Data nodes and a Master node.
    [ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
    {
      "cluster_name" : "esonaws",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 4,
      "number_of_data_nodes" : 3,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
  • As you can see, there is nothing in Elasticsearch yet, since we have not created any index. Let’s load a sample document:
curl -XPUT 'http://localhost:9200/example/test/1' -d '{ "user" : "test", "post_date" : "2017-08-08T14:12:12", "message" : "trying out Elastic Search"}'

 

After creating all the indices, issue the API call below to delay shard re-allocation by 10 minutes:

curl -XPUT 'localhost:9200/_all/_settings?pretty' -H 'Content-Type: application/json' -d'{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "10m"
  }
}'

This delays the Master’s re-allocation of replica shards belonging to a node that is about to be Spot terminated. The delay gives the replacement node launched by Elastigroup time to take the terminated node’s place, so the new node can keep using the EBS data of the Spot-terminated node.
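Once a replacement node has rejoined and recovery completes, the same settings API can restore the timeout to the 5.5 default of 1m:

```shell
curl -XPUT 'localhost:9200/_all/_settings?pretty' -H 'Content-Type: application/json' -d'{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "1m"
  }
}'
```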

 

  • To display the information about the index:
curl 'localhost:9200/_cat/indices?v'
  • To display information about shards:
curl 'localhost:9200/_cat/shards?v'
[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/indices?v'
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   example D-xZ42InSu6   5   1          1            0       10kb            5kb

[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/shards?v'
index   shard prirep    state    docs store    ip          node
example 1     p        STARTED    0  162b  172.31.42.185 U1o2DGT
example 1     r        STARTED    0  162b  172.31.50.160 sCKdcif
example 3     p        STARTED    1  4.4kb 172.31.48.164 YogVx6H
example 3     r        STARTED    1  4.4kb 172.31.50.160 sCKdcif
example 4     p        STARTED    0  162b  172.31.42.185 U1o2DGT
example 4     r        STARTED    0  162b  172.31.50.160 sCKdcif
example 2     r        STARTED    0  162b  172.31.48.164 YogVx6H
example 2     p        STARTED    0  162b  172.31.50.160 sCKdcif
example 0     p        STARTED    0  162b  172.31.48.164 YogVx6H
example 0     r        STARTED    0  162b  172.31.50.160 sCKdcif
  • Great, now that we have some documents, let’s check the number of shards again using the cluster health API call.
[ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "esonaws",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
  • We now have some data loaded, and we have 5 primary shards and 5 replicas, bringing the total count to 10 active shards, which is the default for an index.
    Since we have three Data nodes, let’s add an additional replica for our “example” index via the API. At least 2 replicas are recommended in such cases, so that shards have sufficient backup and node failures won’t bring down the cluster entirely.
curl -XPUT 'localhost:9200/example/_settings' -d'{  "number_of_replicas": 2}'
  • As you can see below, we now have 5 primary shards and 10 replica shards, bringing the total count to 15 active shards.
[ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "esonaws",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 15,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/shards?v'
index   shard prirep     state   docs store      ip       node
example 1     r        STARTED    0  162b  172.31.48.164 YogVx6H
example 1     p        STARTED    0  162b  172.31.42.185 U1o2DGT
example 1     r        STARTED    0  162b  172.31.50.160 sCKdcif
example 3     p        STARTED    1  4.4kb 172.31.48.164 YogVx6H
example 3     r        STARTED    1  4.4kb 172.31.42.185 U1o2DGT
example 3     r        STARTED    1  4.4kb 172.31.50.160 sCKdcif
example 4     r        STARTED    0  162b  172.31.48.164 YogVx6H
example 4     p        STARTED    0  162b  172.31.42.185 U1o2DGT
example 4     r        STARTED    0  162b  172.31.50.160 sCKdcif
example 2     r        STARTED    0  162b  172.31.48.164 YogVx6H
example 2     r        STARTED    0  162b  172.31.42.185 U1o2DGT
example 2     p        STARTED    0  162b  172.31.50.160 sCKdcif
example 0     p        STARTED    0  162b  172.31.48.164 YogVx6H
example 0     r        STARTED    0  162b  172.31.42.185 U1o2DGT
example 0     r        STARTED    0  162b  172.31.50.160 sCKdcif
  • We now have a fully redundant Elasticsearch cluster running. In case of any hardware failures or spot interruptions, the Elastigroup will automatically attach the existing EBS volume to the new instances.

 

Shard Allocation Awareness

[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/shards?v'
index   shard prirep state     docs store     ip         node
example 1     p      STARTED    0  162b  172.31.42.185 U1o2DGT
example 1     r      STARTED    0  162b  172.31.50.160 sCKdcif
example 3     p      STARTED    1  4.4kb 172.31.48.164 YogVx6H
example 3     r      STARTED    1  4.4kb 172.31.50.160 sCKdcif
example 4     p      STARTED    0  162b  172.31.42.185 U1o2DGT
example 4     r      STARTED    0  162b  172.31.50.160 sCKdcif
example 2     r      STARTED    0  162b  172.31.48.164 YogVx6H
example 2     p      STARTED    0  162b  172.31.50.160 sCKdcif
example 0     p      STARTED    0  162b  172.31.48.164 YogVx6H
example 0     r      STARTED    0  162b  172.31.50.160 sCKdcif
  • As shown above (when there was only a single replica), the node in the On-Demand rack (IP 172.31.50.160) holds a copy of every shard, and the nodes in the Spot rack (172.31.42.185 and 172.31.48.164) hold the other copies.
  • Thus, even if all of the nodes in the Spot rack are terminated or fail, there is always a shard copy in the On-Demand rack.
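If you want to guarantee that no rack ever holds every copy of a shard, even while the other rack is empty, forced awareness can be appended to elasticsearch.yml on each node, in the same style as the userdata settings above (a sketch; note the trade-off that replicas stay unassigned while the Spot rack is down, instead of all crowding onto the On-Demand rack):

```shell
# Force shard copies to be spread across the two known rack_id values
sudo sh -c "echo 'cluster.routing.allocation.awareness.force.rack_id.values: Spot,OD' >> /etc/elasticsearch/elasticsearch.yml"
```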

 

Failover Testing

  • Now let’s remove an instance from the cluster to simulate a spot interruption. Go into your Amazon Console and terminate a Spot instance.
  • If we run an API call to the cluster we can see that we have lost some of our replica shards due to the spot interruption.
[ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "esonaws",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards" : 5,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 66.66666666666666
}

[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/shards?v'
index   shard prirep     state      docs store    ip          node
example 1     p        STARTED       0  162b 172.31.42.185  U1o2DGT
example 1     r        STARTED       0  162b 172.31.50.160  sCKdcif
example 1     r        UNASSIGNED                          
example 3     p        STARTED       1  4.4kb 172.31.42.185 U1o2DGT
example 3     r        STARTED       1  4.4kb 172.31.50.160 sCKdcif
example 3     r        UNASSIGNED                          
example 4     p        STARTED       0  162b 172.31.42.185  U1o2DGT
example 4     r        STARTED       0  162b 172.31.50.160  sCKdcif
example 4     r        UNASSIGNED                          
example 2     r        STARTED       0  162b 172.31.42.185  U1o2DGT
example 2     p        STARTED       0  162b 172.31.50.160  sCKdcif
example 2     r        UNASSIGNED                           
example 0     p        STARTED       0  162b 172.31.42.185  U1o2DGT
example 0     r        STARTED       0  162b 172.31.50.160  sCKdcif
example 0     r        UNASSIGNED
  • Now let’s wait for the replacement Spot instance to come online. The startup script defined in the user data will configure the server automatically. Since we are using Persist Data Volumes, the data volume is automatically attached to the new instance. Once the replacement instance is up and running and the 10-minute allocation delay passes, we can query the API again to see the status of the cluster and shards.
[ec2-user@ip-172-31-49-105 ~]$ curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "esonaws",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 15,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

[ec2-user@ip-172-31-49-105 ~]$ curl 'localhost:9200/_cat/shards?v'
index   shard prirep      state   docs store    ip          node
example 1     r         STARTED    0  162b  172.31.49.235 YogVx6H
example 1     p         STARTED    0  162b  172.31.42.185 U1o2DGT
example 1     r         STARTED    0  162b  172.31.50.160 sCKdcif
example 3     r         STARTED    1  4.4kb 172.31.49.235 YogVx6H
example 3     p         STARTED    1  4.4kb 172.31.42.185 U1o2DGT
example 3     r         STARTED    1  4.4kb 172.31.50.160 sCKdcif
example 4     r         STARTED    0  162b  172.31.49.235 YogVx6H
example 4     p         STARTED    0  162b  172.31.42.185 U1o2DGT
example 4     r         STARTED    0  162b  172.31.50.160 sCKdcif
example 2     r         STARTED    0  162b  172.31.49.235 YogVx6H
example 2     r         STARTED    0  162b  172.31.42.185 U1o2DGT
example 2     p         STARTED    0  162b  172.31.50.160 sCKdcif
example 0     r         STARTED    0  162b  172.31.49.235 YogVx6H
example 0     p         STARTED    0  162b  172.31.42.185 U1o2DGT
example 0     r         STARTED    0  162b  172.31.50.160 sCKdcif
  • Now that the replacement Spot instance is up and running, we can see that the previously unassigned shards are assigned again, thanks to the Persist Data Volumes feature and the bootstrap configuration in the user data.

Finally, you have a fault-tolerant Elasticsearch cluster with considerable cost savings thanks to Spot instances.

  • Ranvijay Jamwal

    Hi,

    This seems to be a really nice post and just what we have been looking for.

    2 queries:

    In this setup I see there will be 1 master node out of the Elastigroup and the Data nodes will be in the Elastigroup right?

    1> How can I have multiple master nodes? Master Nodes failover?
    2> Can I add a new data node just by increasing the Elastigroup capacity? Will it automatically register with the master and increase the cluster size along with shard allocation?

    Thanks!

    • Karan Shetty

      Hi Ranvijay,

      Glad that you liked it.

      In the above tutorial, since there was only a single Master node, I didn’t include it in an Elastigroup.

      There is always a single active Master for a cluster, but you can have multiple Master-eligible nodes.
      In that case you can create the Data node Elastigroup with Master-eligible nodes included.

      You should run an odd number of Master-eligible nodes, keep them on On-Demand instances, and set minimum_master_nodes to (N / 2 + 1).
      minimum_master_nodes is used to avoid split brain; refer to the link below:
      https://www.elastic.co/guide/en/elasticsearch/reference/5.5/modules-node.html#split-brain

      Yes, you can add new Data nodes by increasing the Elastigroup capacity. The nodes are automatically added to the cluster based on the user data.