Elasticsearch on AWS User Guide

Product: Elasticsearch on AWS

Overview

This image runs Elasticsearch 8.x, the open source distributed search and analytics engine, fully installed and configured from the official Elastic 8.x APT repository. The Elasticsearch daemon runs as a single node and binds the REST API to TCP 9200 with HTTP basic authentication enforced by the xpack security subsystem. The cluster transport listens on TCP 9300 on the loopback interface only, because this image is single node by default. The OpenJDK runtime is bundled inside the Elasticsearch package, so the image carries no separate Java install and no system level JDK to upgrade.

The Elasticsearch data directory (path.data) is on a separate, independently resizable EBS volume mounted at /var/lib/elasticsearch, so index data is kept off the operating system disk and can be grown without disturbing the rest of the instance.

On the first boot of every deployed instance a one shot service generates a fresh password for the elastic superuser, unique to that instance, applies it to the Elasticsearch security index, and writes the plain text value to /root/elasticsearch-credentials.txt (mode 0600, readable only by root). No shared or default credentials ship in the image and the REST API is never exposed without authentication.

This is the standalone Elasticsearch listing. Kibana and Logstash are not included; the full ELK stack is available as a separate listing. Keeping Elasticsearch standalone leaves the full resources of the instance available to the search workload and matches the upstream documentation for production deployments.

Screenshots

Elasticsearch version and service status

Elasticsearch cluster health

Elasticsearch nodes and indices

Prerequisites

Before you deploy this image you need:

  • An Amazon Web Services account where you can launch EC2 instances
  • IAM permissions to launch instances, create security groups and subscribe to AWS Marketplace products
  • An EC2 key pair in the target Region for SSH access to the instance
  • A VPC and subnet in the target Region, with a security group allowing inbound TCP 22 from your management network and inbound TCP 9200 only from the application servers and management hosts that need it
  • The AWS CLI version 2 installed locally if you plan to deploy from the command line

Step 1: Launch the Instance from the AWS Marketplace

Sign in to the AWS Management Console, open the EC2 service, and select Launch instance. Under Application and OS Images choose AWS Marketplace AMIs and search for Elasticsearch. Select the cloudimg listing and choose Select, then Continue on the subscription summary.

Pick an instance type of m5.large or larger as a balanced default. Elasticsearch is a JVM workload that benefits from a heap of roughly 50 percent of host memory, but never above approximately 30 GiB, because beyond that the JVM stops using compressed object pointers and per object memory overhead increases. For light search workloads m5.large (8 GiB) is sufficient; for heavier ingest or analytics workloads use m5.xlarge, m5.2xlarge or larger.

Choose your EC2 key pair under Key pair (login). Under Network settings select your VPC and subnet, and either create or select a security group that allows inbound TCP 22 from your management network and inbound TCP 9200 only from the application servers that need it. Do not expose port 9200 to the public internet; once authenticated the REST API exposes administrative operations including index deletion and cluster reconfiguration.
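
The 9200 rule can also be added from the command line. The following is a minimal sketch, assuming a hypothetical helper name allow_es_from and your own security group ID and CIDR range. It wraps the standard aws ec2 authorize-security-group-ingress call and refuses the open IPv4 range 0.0.0.0/0, in line with the advice above.

```shell
# Hypothetical helper (not part of the image): open TCP 9200 to one CIDR only.
# Usage: allow_es_from <security-group-id> <cidr>
allow_es_from() {
    case "$2" in
        0.0.0.0/0)
            echo "refusing to expose 9200 to the public internet" >&2
            return 1 ;;
    esac
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" --protocol tcp --port 9200 --cidr "$2"
}
```

For example, allow_es_from <security-group-id> 10.0.1.0/24 would admit one application subnet, where 10.0.1.0/24 is only an illustrative range.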

Leave the root volume at the default size. The image attaches a dedicated 30 GiB data volume automatically, mounted at /var/lib/elasticsearch, for index storage. Select Launch instance. First boot initialisation takes up to a minute after the instance state becomes Running and the status checks pass. In that time Elasticsearch starts the JVM and initialises the security index, and the first boot service generates the elastic superuser password.

Step 2: Launch the Instance from the AWS CLI

The following block launches an instance from the cloudimg Elasticsearch Marketplace AMI into an existing subnet and security group. Replace <ami-id> with the AMI ID shown on the Marketplace listing, <key-name> with your EC2 key pair name, <subnet-id> with your subnet ID, and <security-group-id> with a security group that allows inbound TCP 22 from your management network and inbound TCP 9200 from your application servers.

aws ec2 run-instances \
  --image-id <ami-id> \
  --instance-type m5.large \
  --key-name <key-name> \
  --subnet-id <subnet-id> \
  --security-group-ids <security-group-id> \
  --block-device-mappings '{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":30,"VolumeType":"gp3"}}' \
  --metadata-options 'HttpTokens=required,HttpEndpoint=enabled' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=elasticsearch-node-1}]'
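
The run-instances call prints the new instance ID in its JSON output. As a sketch of a possible next step, the hypothetical helper below waits for the status checks to pass and then prints the public address to use in Step 3. It relies only on the standard aws ec2 wait instance-status-ok and aws ec2 describe-instances commands; the function name is an assumption made for this example.

```shell
# Hypothetical helper: block until the instance passes its status checks,
# then print its public IPv4 address (the CLI prints "None" if it has none).
# Usage: instance_public_ip <instance-id>
instance_public_ip() {
    aws ec2 wait instance-status-ok --instance-ids "$1" || return 1
    aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' \
        --output text
}
```

An instance launched into a private subnet has no public address; in that case connect through its private IP from inside the VPC.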

The image attaches the dedicated 30 GiB index storage volume automatically. To start with more index storage, enlarge that volume on the Storage step in the console, or add a second block device mapping on the CLI.

Step 3: Connect to the Instance over SSH

Connect to the instance with SSH as the default login user for the operating system variant you launched. The login user differs by variant:

Operating system variant    SSH login user
Ubuntu 24.04                ubuntu

Replace <key-file> with the path to your private key file and <instance-public-ip> with the public IP address or DNS name of the instance.

ssh -i <key-file> ubuntu@<instance-public-ip>

Step 4: Retrieve the elastic Superuser Password

The elastic superuser password is generated on the instance's first boot and written to /root/elasticsearch-credentials.txt, which is readable only by root. Print it from inside the SSH session:

sudo cat /root/elasticsearch-credentials.txt

The file records the URL the API is reachable on, the superuser name, and the plain text password. Save the password somewhere safe. This file is the only place it is stored in plain text, and rotating the password with the reset tool (Step 8) changes it in Elasticsearch without updating the file.

Step 5: Confirm the Elasticsearch Service

Check that the Elasticsearch service is active and listening on its REST API port. The first command reports active for both the daemon and the cloudimg first boot service; the second confirms a process is listening on TCP 9200.

systemctl is-active elasticsearch.service elasticsearch-firstboot.service
ss -tlnp | grep ':9200 '

The Elasticsearch process is run by the elasticsearch system user. The bundled OpenJDK runtime is found at /usr/share/elasticsearch/jdk and is the only Java on the image.
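
If you script against a freshly launched instance, it can help to wait for the REST port before going further. The sketch below is one way to do that, assuming a hypothetical helper name wait_for_es and illustrative defaults of 30 attempts two seconds apart. It only proves that the daemon answers HTTP, not that the first boot service has finished writing the credentials file.

```shell
# Hypothetical helper: poll the REST port until something answers HTTP.
# Any HTTP response counts, including 401, because an unauthenticated
# request is still proof that the daemon is up and listening.
# Usage: wait_for_es [url] [attempts] [delay-seconds]
wait_for_es() (
    url=${1:-http://127.0.0.1:9200/}
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # curl exits 0 on any HTTP response unless --fail is given.
        if curl -s -o /dev/null --max-time 5 "$url"; then
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no answer from $url after $attempts attempts" >&2
    exit 1
)
```

The function body runs in a subshell, so its variables do not leak into the calling shell and exit only ends the function.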

Step 6: Talk to the REST API

Read the password into a shell variable, then query the root endpoint. The elastic superuser is authenticated over HTTP basic auth. The root endpoint reports the node name, the cluster name, the version and the Lucene library version.

PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
curl -s -u "elastic:${PASS}" http://127.0.0.1:9200/
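
The same grep and cut pipeline recurs in most of the steps that follow. If you prefer to define it once per shell session, a small function is enough. This is a sketch; the name es_pass is an assumption for this example and is not shipped on the image.

```shell
# Hypothetical helper: read a credentials file on stdin and print the
# elastic password. cut -f2- keeps everything after the first "=", so a
# password that itself contains "=" is returned whole.
es_pass() {
    grep '^elasticsearch.elastic.pass=' | cut -d= -f2-
}

# On the instance:
#   PASS=$(sudo cat /root/elasticsearch-credentials.txt | es_pass)
```

Reading from stdin keeps sudo outside the function, which matters because sudo cannot invoke a shell function directly.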

Inspect cluster health. On a single node image the status is green or yellow. Green means every primary and replica shard is assigned. Yellow means every primary is assigned but one or more replicas are not, which is expected here because the cluster has only one node and a replica cannot be placed on the same node as its primary.

PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
curl -s -u "elastic:${PASS}" http://127.0.0.1:9200/_cluster/health | python3 -m json.tool
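
For monitoring scripts it is often enough to reduce the health response to its status field. The following sketch does that with sed so it needs nothing beyond the base system. The helper names health_status and health_ok are assumptions for this example, and the pattern match is a shortcut that suits this one field rather than a general JSON parser.

```shell
# Hypothetical helper: read a _cluster/health JSON body on stdin and print
# the value of its "status" field (green, yellow or red).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Exit 0 for green or yellow (both normal on a single node), 1 otherwise.
health_ok() {
    case "$(health_status)" in
        green|yellow) return 0 ;;
        *) return 1 ;;
    esac
}

# On the instance:
#   curl -s -u "elastic:${PASS}" http://127.0.0.1:9200/_cluster/health | health_ok
```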

Step 7: Create an Index, Index a Document, Search and Delete

Create a new index, index a document into it, search for it, then delete the index. This is the canonical end to end check that the cluster is accepting writes, refreshing the search index and answering queries.

# Read the password once; the variable is reused by every call below.
PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
# Create the index.
curl -s -u "elastic:${PASS}" -X PUT http://127.0.0.1:9200/books
# Index one document; refresh=true makes it searchable immediately.
curl -s -u "elastic:${PASS}" -X POST "http://127.0.0.1:9200/books/_doc?refresh=true" \
    -H 'Content-Type: application/json' \
    -d '{"title":"The Pragmatic Programmer","author":"Hunt and Thomas","year":1999}'
# Search for the document by author.
curl -s -u "elastic:${PASS}" "http://127.0.0.1:9200/books/_search?q=author:Hunt"
# Delete the index.
curl -s -u "elastic:${PASS}" -X DELETE http://127.0.0.1:9200/books

List every index on the cluster and review storage and document counts with the _cat indices endpoint. The trailing ?v adds a header row.

PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
curl -s -u "elastic:${PASS}" "http://127.0.0.1:9200/_cat/indices?v"
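
On a single node, any index created with a replica shows as yellow in this listing. The sketch below filters the _cat indices output down to the names of indices that are not green. The helper name non_green_indices is an assumption for this example; it finds the health and index columns by their header text, so it depends on the ?v header row being present.

```shell
# Hypothetical helper: read "_cat/indices?v" output on stdin and print the
# names of indices whose health is not green. Columns are located by header
# name rather than position, so extra columns do not break the parse.
non_green_indices() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "health") h = i
                if ($i == "index")  n = i
            }
            next
        }
        h && n && $h != "green" { print $n }
    '
}

# On the instance:
#   curl -s -u "elastic:${PASS}" "http://127.0.0.1:9200/_cat/indices?v" | non_green_indices
```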

Step 8: Rotate the elastic Password

Rotate the elastic superuser password with the supported reset tool. With -i the tool prompts for the new password interactively. Without -i it generates a strong random password and prints it on stdout; adding -b (batch mode) skips the confirmation prompt so the command can run unattended. After rotation, update /root/elasticsearch-credentials.txt so future shells pick up the new value.

sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -i

Use the prompt variant in interactive sessions only. The non interactive variant suits automation:

sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -b
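
Updating the credentials file after a rotation can be scripted. The following is a sketch under stated assumptions: the helper name set_stored_pass is hypothetical, and the only thing it relies on is the elasticsearch.elastic.pass= key that the earlier steps read. It rewrites that one line and leaves the rest of the file untouched.

```shell
# Hypothetical helper: replace the stored password line in a credentials file.
# awk receives the password through the environment, so characters such as
# "/", "&" or "\" are written literally instead of being treated as
# pattern or escape syntax.
# Usage: set_stored_pass <new-password> [file]
set_stored_pass() (
    file=${2:-/root/elasticsearch-credentials.txt}
    umask 077                      # keep the temporary file owner-only
    tmp=$(mktemp "${file}.XXXXXX") || exit 1
    if NEWPASS=$1 awk '
        /^elasticsearch\.elastic\.pass=/ {
            print "elasticsearch.elastic.pass=" ENVIRON["NEWPASS"]; next
        }
        { print }
    ' "$file" > "$tmp"; then
        mv "$tmp" "$file"          # swap the new file into place
    else
        rm -f "$tmp"; exit 1
    fi
)
```

Because the file belongs to root, define and call the function from a root shell (for example after sudo -i) rather than through sudo.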

Step 9: Tune the JVM Heap

The image ships with a conservative heap of -Xms512m -Xmx1g so the daemon starts cleanly on the smallest build instance type. On a production instance, raise the heap to roughly 50 percent of host memory but never above approximately 30 GiB. The heap is set in /etc/elasticsearch/jvm.options.d/cloudimg-heap.options.

cat /etc/elasticsearch/jvm.options.d/cloudimg-heap.options

To change it on a running instance, edit the file and restart Elasticsearch:

sudo sed -i 's/^-Xms.*/-Xms4g/' /etc/elasticsearch/jvm.options.d/cloudimg-heap.options
sudo sed -i 's/^-Xmx.*/-Xmx4g/' /etc/elasticsearch/jvm.options.d/cloudimg-heap.options
sudo systemctl restart elasticsearch.service
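
The sizing rule above is simple arithmetic: half of host memory, capped at 30 GiB. As a sketch, the two hypothetical helpers below compute the value in MiB and print matching -Xms and -Xmx lines, with both set to the same size as in the sed example. The function names are assumptions for this example.

```shell
# Hypothetical helper: suggest a heap size in MiB for a host with the given
# amount of RAM in MiB: half of RAM, capped at 30 GiB (30720 MiB).
suggest_heap_mib() (
    half=$(( $1 / 2 ))
    cap=$(( 30 * 1024 ))
    if [ "$half" -gt "$cap" ]; then half=$cap; fi
    echo "$half"
)

# Print the two JVM option lines for that heap, with -Xms equal to -Xmx.
heap_options() (
    h=$(suggest_heap_mib "$1")
    printf '%s\n' "-Xms${h}m" "-Xmx${h}m"
)

# On the instance, host memory in MiB can be read from /proc/meminfo:
#   heap_options "$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)"
```

The kernel reports slightly less than the nominal instance memory, so the result on an 8 GiB instance comes out a little under 4096 MiB.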

Step 10: Storage and Index Data Volume

The Elasticsearch data directory is its own filesystem on its own EBS volume, mounted at /var/lib/elasticsearch. Index data lives there. Confirm the mount and review free space with:

findmnt /var/lib/elasticsearch
df -h /var/lib/elasticsearch

To grow the index store, modify the EBS volume in the AWS console or with the CLI, then extend the filesystem on the instance with resize2fs against the volume's device. The data volume is independent from the operating system disk, so it can be grown without disturbing the rest of the instance.
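
To decide when to grow the volume, you can reduce the df output to a single percentage and compare it with a threshold. The sketch below assumes hypothetical helper names and a default threshold of 85, chosen because Elasticsearch by default stops allocating new shards to a node once its disk passes the 85 percent low watermark.

```shell
# Hypothetical helper: read "df -P <path>" output on stdin and print the
# used percentage of the filesystem as a bare number.
df_used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Exit 0 while usage is below the threshold (default 85), 1 otherwise.
disk_below() {
    [ "$(df_used_pct)" -lt "${1:-85}" ]
}

# On the instance:
#   df -P /var/lib/elasticsearch | disk_below 85 || echo "grow the data volume"
```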

Step 11: Snapshot Backups

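Elasticsearch snapshots write a logical backup to a registered repository. A common repository type is s3. From Elasticsearch 8.0 onward, S3 repository support ships inside the distribution as a module rather than as a separately installed plugin, so on this 8.x image the install command below should report that repository-s3 is already packaged and make no changes; it is only needed on older 7.x releases. Running it and the restart is harmless, after which you register the repository over the REST API.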

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install --batch repository-s3
sudo systemctl restart elasticsearch.service

Register a repository and take a snapshot. Replace <your-bucket> and <your-base-path> with the S3 bucket and key prefix you want the snapshots written to, and ensure the instance's IAM role can read and write that bucket.

PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
curl -s -u "elastic:${PASS}" -X PUT "http://127.0.0.1:9200/_snapshot/s3-backups" \
    -H 'Content-Type: application/json' \
    -d '{"type":"s3","settings":{"bucket":"<your-bucket>","base_path":"<your-base-path>"}}'

curl -s -u "elastic:${PASS}" -X PUT "http://127.0.0.1:9200/_snapshot/s3-backups/snap-1?wait_for_completion=true"
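
A fixed name such as snap-1 can be used only once per repository, so scheduled backups need a fresh name each run. One possible approach, sketched here with a hypothetical helper called snapshot_name, is to append the current UTC time to a prefix. Snapshot names must be lowercase, so keep the prefix lowercase too.

```shell
# Hypothetical helper: build a snapshot name from a prefix and the current
# UTC time, in the form <prefix>-YYYYMMDD-HHMMSS.
# Usage: snapshot_name [prefix]
snapshot_name() {
    printf '%s-%s\n' "${1:-snap}" "$(date -u +%Y%m%d-%H%M%S)"
}

# On the instance:
#   curl -s -u "elastic:${PASS}" -X PUT \
#       "http://127.0.0.1:9200/_snapshot/s3-backups/$(snapshot_name nightly)?wait_for_completion=true"
```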

Step 12: Enable HTTPS on the REST API

The image disables HTTP TLS by default so the daemon is reachable with plain HTTP and basic authentication out of the box. For production traffic, enable HTTPS with your own certificate authority. Place a PEM certificate and private key on the instance, then point Elasticsearch at them through elasticsearch.yml. The reference settings, with placeholders for your own files, are below; restart Elasticsearch after editing.

sudo tee -a /etc/elasticsearch/elasticsearch.yml > /dev/null <<EOF

# cloudimg: HTTPS on the REST layer with your own certificate authority.
# Replace the paths below with your PEM certificate and private key.
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/node.crt
xpack.security.http.ssl.key: /etc/elasticsearch/certs/node.key
xpack.security.http.ssl.certificate_authorities: ["/etc/elasticsearch/certs/ca.crt"]
EOF
sudo systemctl restart elasticsearch.service

After enabling HTTPS, every client must connect with https:// and trust the certificate authority that issued the node certificate. The user guide for cloudimg ELK Stack covers the full TLS bootstrap in more depth.
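
Elasticsearch refuses to start when the same setting appears twice in elasticsearch.yml, so before appending the block above it is worth checking whether the file already sets xpack.security.http.ssl.enabled. The sketch below counts uncommented occurrences of a key written in the flat dotted form; the helper name count_setting is an assumption for this example, and it does not detect keys written as nested YAML.

```shell
# Hypothetical helper: count lines in a YAML file that start with "<key>:".
# The match is literal, so the dots in the key are not treated as wildcards,
# and commented-out lines are ignored because they start with "#".
# Usage: count_setting <key> <file>
count_setting() {
    awk -v k="$1:" 'index($0, k) == 1 { c++ } END { print c + 0 }' "$2"
}

# On the instance:
#   sudo cat /etc/elasticsearch/elasticsearch.yml > /tmp/es.yml
#   count_setting xpack.security.http.ssl.enabled /tmp/es.yml
```

If the count is already 1, edit the existing line in place instead of appending a second copy.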

Step 13: Maintenance

Keep the operating system patched with the standard package manager. The official Elastic 8.x APT repository is already configured on the image, so upgrades to Elasticsearch itself arrive through the same package update process.

Inspect the JVM and node level statistics to spot saturated heaps, full disks or pending tasks:

PASS=$(sudo grep '^elasticsearch.elastic.pass=' /root/elasticsearch-credentials.txt | cut -d= -f2-)
curl -s -u "elastic:${PASS}" "http://127.0.0.1:9200/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,load_1m,node.role"

To free disk space on a busy cluster, delete unused indices with DELETE /<index-name>, or use index lifecycle management policies to expire indices automatically. The cluster log lives at /var/log/elasticsearch/elasticsearch.log and is rotated by the system logrotate.

Support

This Amazon Machine Image is provided by cloudimg with 24/7 technical support by email and chat. Contact cloudimg for help with Elasticsearch deployment, index design, ingest pipelines, performance tuning, snapshots and upgrades.

Elasticsearch and the Elasticsearch logo are trademarks or registered trademarks of Elasticsearch B.V. or its affiliates. All other product and company names are trademarks or registered trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.