etcd on AWS — User Guide

cloudimg's etcd Amazon Machine Image delivers etcd 3.5 fully installed as a single-node deployment, with Role-Based Access Control (RBAC) enabled at first boot. The image is headless: you manage it with the etcdctl CLI, which is preinstalled at /usr/local/bin/etcdctl.

This guide covers everything you need: connecting to the instance, retrieving per-instance credentials, common etcdctl operations, the security model, and how to expand a single node into a 3- or 5-node Raft cluster.

Connecting to your instance

OS variant      SSH login user
Ubuntu 24.04    ubuntu

Connect over SSH on port 22:

ssh -i <your-ssh-key.pem> ubuntu@<instance-public-ip>

Retrieving the per-instance credentials

On the first boot of every instance a one-shot service generates a fresh per-instance cloudimg user password and a separate emergency root password, writes them to a root-only file, and enables etcd RBAC. Retrieve them with:

sudo cat /root/etcd-credentials.txt

The file looks like:

# etcd 3.5 — Per-Instance Credentials
ETCD_VERSION=etcd Version: 3.5.30
ETCD_CLIENT_URL=http://<instance-public-ip>:2379
ETCD_HEALTH_URL=http://<instance-public-ip>:2379/health
ETCD_USER=cloudimg
ETCD_PASSWORD=<ETCD_PASSWORD>
ETCD_ROOT_PASSWORD=<ETCD_ROOT_PASSWORD>

The cloudimg user holds readwrite on the entire / key prefix and is the canonical user for application traffic. ETCD_ROOT_PASSWORD is documented as emergency-access only — do not use it for normal operations.
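Because the credentials file is a list of KEY=value lines, a single field can be extracted with a small helper. The sketch below is illustrative only: read_cred is a hypothetical function name, not something shipped on the image, and it reads the file contents from standard input so that sudo is only needed for the cat.

```shell
# read_cred KEY: print the value stored for KEY in KEY=value text on stdin.
# read_cred is a hypothetical helper name, not part of the image.
read_cred() {
  # Keep only lines that start with "KEY=", take the first match, and print
  # everything after the first "=" so values containing "=" survive intact.
  grep "^$1=" | head -n 1 | cut -d= -f2-
}

# Usage on the instance (sudo is needed because the file is root-only):
#   PASS=$(sudo cat /root/etcd-credentials.txt | read_cred ETCD_PASSWORD)
#   ROOT=$(sudo cat /root/etcd-credentials.txt | read_cred ETCD_ROOT_PASSWORD)
```

Anchoring the match at the start of the line keeps ETCD_PASSWORD from also matching ETCD_ROOT_PASSWORD.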

Confirming the service is healthy

sudo systemctl status etcd.service --no-pager --lines=5

endpoint health needs an authenticated etcdctl call because RBAC is enforced. Read the cloudimg password from /root/etcd-credentials.txt for each call:

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  endpoint health 2>/dev/null

endpoint status is a cluster-maintenance call that the cloudimg user is not granted — it needs the emergency root credential:

ROOT=$(sudo grep ^ETCD_ROOT_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=root:${ROOT} \
  endpoint status --write-out=table

The /health endpoint on port 2379 is anonymous by design — that is what Kubernetes liveness and readiness probes hit:

curl -s http://127.0.0.1:2379/health
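The anonymous health endpoint is also convenient for scripts that must wait until etcd is ready, for example after a restart. The following is a minimal sketch, assuming the response body is JSON containing "health":"true" when the node is healthy. The function names health_ok and wait_healthy, the default of 30 attempts, and the one-second interval are choices made for this example.

```shell
# health_ok: read a /health response body on stdin and succeed only if it
# reports "health":"true" (assumed response shape).
health_ok() {
  grep -q '"health" *: *"true"'
}

# wait_healthy [URL] [TRIES]: poll the health endpoint once per second until
# it reports healthy or the attempts run out. Hypothetical helper.
wait_healthy() {
  url="${1:-http://127.0.0.1:2379/health}"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # curl failures produce an empty body, which health_ok treats as unhealthy.
    if curl -s --max-time 2 "$url" | health_ok; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_healthy && echo "etcd is healthy"
```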

Authenticated put and get

Every key-value operation needs --user=cloudimg:<password>. Load it from the credentials file in each call:

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  put /apps/web/config '{"port":8080,"workers":4}'
PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  get /apps/web/config --print-value-only

List every key under a prefix:

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  get /apps --prefix --keys-only

Delete a key:

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  del /apps/web/config
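The put, get and del calls above all repeat the same endpoint and credential flags. One way to cut the repetition within a single shell session is a small wrapper function. This is a sketch, not part of the image: ectl, ETCD_PASS, ETCD_ENDPOINT and ETCDCTL_BIN are names invented for the example.

```shell
# ectl ARGS...: run etcdctl as the cloudimg user against the local endpoint.
# Hypothetical wrapper; ETCD_PASS must be set before the first call.
ectl() {
  ETCDCTL_API=3 "${ETCDCTL_BIN:-/usr/local/bin/etcdctl}" \
    --endpoints="${ETCD_ENDPOINT:-http://127.0.0.1:2379}" \
    --user="cloudimg:${ETCD_PASS}" \
    "$@"
}

# Usage on the instance: read the password once, then reuse it.
#   ETCD_PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-)
#   ectl put /apps/web/config '{"port":8080,"workers":4}'
#   ectl get /apps/web/config --print-value-only
#   ectl del /apps/web/config
```

The trade-off is that the password stays in a shell variable for the life of the session, whereas the one-liners above re-read it from the root-only file each time.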

Watching a key

etcdctl watch streams every change to a key or prefix. Run the command below, then open a second SSH session and put a key under /apps/... to see the update appear live in the first session:

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  watch /apps --prefix
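A script can consume the watch stream by reading it line by line. The sketch below assumes etcdctl's default output, in which each event is printed as three lines (event type, key, value), and that values contain no newlines. The function name format_events is hypothetical.

```shell
# format_events: turn three-line watch events (type, key, value) read from
# stdin into one "TYPE key=value" line per event.
# Assumes values never contain a newline; a multi-line value would
# desynchronise the reader.
format_events() {
  while IFS= read -r kind && IFS= read -r key && IFS= read -r value; do
    printf '%s %s=%s\n' "$kind" "$key" "$value"
  done
}

# Usage on the instance:
#   PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-)
#   ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=http://127.0.0.1:2379 \
#     --user=cloudimg:${PASS} watch /apps --prefix | format_events
```

A DELETE event carries an empty value line, so it is rendered with nothing after the equals sign.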

Verifying RBAC is enforced

An unauthenticated put against the local endpoint is REJECTED — proof the RBAC wall is up. The command below is expected to FAIL with Error: etcdserver: user name is empty:

ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  put /noauth-attempt should-fail
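For an automated check, the expected failure can be asserted rather than read by eye. The helper below is a sketch: expect_auth_error is a hypothetical name, and it treats the command as correctly rejected only when it fails and its output contains the "user name is empty" message quoted above.

```shell
# expect_auth_error CMD...: succeed only if CMD fails with the RBAC
# "user name is empty" error. Hypothetical helper for smoke tests.
expect_auth_error() {
  # If the command succeeds, RBAC did not block it: report and fail.
  out=$("$@" 2>&1) && { echo "UNEXPECTED: command succeeded" >&2; return 1; }
  case "$out" in
    *"user name is empty"*) echo "OK: rejected by RBAC" ;;
    *) echo "UNCLEAR: failed for another reason: $out" >&2; return 1 ;;
  esac
}

# Usage on the instance:
#   expect_auth_error env ETCDCTL_API=3 /usr/local/bin/etcdctl \
#     --endpoints=http://127.0.0.1:2379 put /noauth-attempt should-fail
```

Checking the message as well as the exit status avoids mistaking a stopped etcd service for a working RBAC wall.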

Listing users is a cluster-management call which requires the root credential — use the emergency root password from the credentials file:

ROOT=$(sudo grep ^ETCD_ROOT_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=root:${ROOT} \
  user list

Inspect the cloudimg-rw role (the cloudimg user has permission to read roles assigned to it):

PASS=$(sudo grep ^ETCD_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-) && \
ETCDCTL_API=3 /usr/local/bin/etcdctl \
  --endpoints=http://127.0.0.1:2379 \
  --user=cloudimg:${PASS} \
  role get cloudimg-rw

Connecting from a remote client

The instance listens on 0.0.0.0:2379, and the advertise URL is set at first boot to the instance's public IPv4 address, resolved from EC2 IMDSv2. From any client that the security group allows to reach port 2379:

ETCDCTL_API=3 etcdctl \
  --endpoints=http://<instance-public-ip>:2379 \
  --user=cloudimg:<ETCD_PASSWORD> \
  put mykey myvalue

For production traffic on a public client port we strongly recommend terminating TLS in front of etcd — see the TLS section below.

Security model

  • Peer port 2380 is loopback only. A single-node deployment never needs peer traffic on the network, and exposing 2380 unnecessarily is a serious security mistake. Customers adding cluster peers must change this deliberately; see the multi-node section.
  • Client port 2379 listens on 0.0.0.0. Security is enforced by RBAC at the etcd layer plus the AWS Security Group at the network layer. Keep your Security Group locked to the CIDR ranges that need to talk to etcd — do not open 2379 to 0.0.0.0/0 in production.
  • RBAC is enabled at first boot. A cloudimg user with readwrite on the / prefix, plus a separate emergency root user, are created with per-instance passwords. The cluster refuses every unauthenticated KV operation. The /health endpoint stays anonymous so Kubernetes probes work.
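The Security Group advice above can be applied with the AWS CLI. The following is a sketch under assumptions: the AWS CLI is installed and configured on the machine running it, and allow_etcd_client and AWS_BIN are hypothetical names introduced here. The helper refuses the open-to-everyone range as a guard rail.

```shell
# allow_etcd_client SG_ID CIDR: allow one CIDR range to reach client port 2379.
# Hypothetical helper; AWS_BIN exists only so the call can be stubbed in tests.
allow_etcd_client() {
  if [ "$2" = "0.0.0.0/0" ]; then
    echo "refusing to open 2379 to 0.0.0.0/0" >&2
    return 1
  fi
  "${AWS_BIN:-aws}" ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 2379 --cidr "$2"
}

# Usage (placeholders, replace with your own values):
#   allow_etcd_client <security-group-id> <client-cidr>
```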

Expanding to a multi-node cluster

A single-node etcd is fine for development, proof-of-concept and small production workloads where the underlying EC2 instance's durability is acceptable. For a quorum-replicated cluster, etcd recommends 3 or 5 nodes.

Cluster expansion is a runtime operation: launch additional instances of this AMI, then run etcdctl member add on the existing node to introduce each peer to the cluster. Before a new peer starts, rewrite its ETCD_LISTEN_PEER_URLS from 127.0.0.1 to the peer's network address, reuse the existing cloudimg-etcd-token, and set ETCD_INITIAL_CLUSTER_STATE=existing. The upstream etcd documentation at https://etcd.io/docs/ has the detailed multi-node guide, and the cloudimg recipe is fully compatible with it.
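The settings a joining peer needs can be generated rather than typed by hand. The sketch below is illustrative: peer_env is a hypothetical helper, the member name and addresses are placeholders, the token value is assumed to be the literal string cloudimg-etcd-token, and the output is assumed to go into the peer's /etc/etcd/etcd.conf. Member changes are cluster management, so the member add call is shown with the root credential.

```shell
# peer_env NAME PEER_IP INITIAL_CLUSTER: print the settings a joining peer
# needs before its first start. Hypothetical helper with placeholder inputs.
peer_env() {
  cat <<EOF
ETCD_NAME=$1
ETCD_LISTEN_PEER_URLS=http://$2:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://$2:2380
ETCD_INITIAL_CLUSTER=$3
ETCD_INITIAL_CLUSTER_TOKEN=cloudimg-etcd-token
ETCD_INITIAL_CLUSTER_STATE=existing
EOF
}

# Step 1, on the existing node: register the new member.
#   ROOT=$(sudo grep ^ETCD_ROOT_PASSWORD= /root/etcd-credentials.txt | cut -d= -f2-)
#   ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=http://127.0.0.1:2379 \
#     --user=root:${ROOT} member add <new-name> --peer-urls=http://<new-peer-ip>:2380
# Step 2, on the new peer, before etcd starts: review the generated settings.
#   peer_env <new-name> <new-peer-ip> '<name1>=http://<ip1>:2380,<new-name>=http://<new-peer-ip>:2380'
```

Compare the generated values with what etcdctl member add prints before applying them, and follow the upstream guide for the remaining steps.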

Enabling TLS

The image ships with http:// on the client port. To enable TLS:

  1. Provision a server certificate and private key (e.g. via your own internal CA or a public CA like Let's Encrypt with a DNS challenge).
  2. Drop them at /etc/etcd/server.crt and /etc/etcd/server.key (mode 0640, group etcd).
  3. Edit /etc/etcd/etcd.conf: add ETCD_CERT_FILE and ETCD_KEY_FILE lines pointing at those files, and change ETCD_LISTEN_CLIENT_URLS and ETCD_ADVERTISE_CLIENT_URLS to use https://.
  4. Restart: sudo systemctl restart etcd.service.

Customer clients now point at https://<your-public-host>:2379 and pass --cacert=<your-CA-bundle.pem> to etcdctl.
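The configuration edit in step 3 can be scripted. The following is a sketch that assumes /etc/etcd/etcd.conf holds one KEY=value setting per line. The function name enable_tls_in_conf is hypothetical, and the function keeps a .bak copy of the original file.

```shell
# enable_tls_in_conf FILE: switch the two client URL settings to https and
# append the certificate settings. Leaves the original in FILE.bak.
# Hypothetical helper; assumes one KEY=value setting per line.
enable_tls_in_conf() {
  sed -i.bak \
    -e '/^ETCD_LISTEN_CLIENT_URLS=/s#http://#https://#g' \
    -e '/^ETCD_ADVERTISE_CLIENT_URLS=/s#http://#https://#g' \
    "$1" &&
  printf '%s\n' \
    'ETCD_CERT_FILE=/etc/etcd/server.crt' \
    'ETCD_KEY_FILE=/etc/etcd/server.key' >> "$1"
}

# Usage, from a root shell on the instance:
#   enable_tls_in_conf /etc/etcd/etcd.conf
#   systemctl restart etcd.service
```

Only the client URL lines are rewritten, so the loopback peer URL on port 2380 is left unchanged.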

Updating

The image ships with etcd 3.5.30 from the official etcd-io GitHub release. Future versions in the etcd 3.5 line are drop-in compatible — download the new tarball from https://github.com/etcd-io/etcd/releases, verify the SHA-256, stop etcd.service, replace /usr/local/bin/etcd and /usr/local/bin/etcdctl, and restart.

The systemd unit, the firstboot scripts and the credentials file are all preserved across binary upgrades.
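The upgrade steps can be sketched as two small helpers plus a command sequence. The assumptions are stated here and in the comments: the instance is linux-amd64 unless another architecture is passed, the tarball follows the upstream etcd-v<version>-linux-<arch>.tar.gz naming, and the expected SHA-256 value is copied by hand from the release page. The function names are hypothetical.

```shell
# etcd_tarball_url VERSION [ARCH]: build the GitHub release download URL.
# ARCH defaults to amd64 (an assumption; pass arm64 if that matches the instance).
etcd_tarball_url() {
  printf 'https://github.com/etcd-io/etcd/releases/download/v%s/etcd-v%s-linux-%s.tar.gz\n' \
    "$1" "$1" "${2:-amd64}"
}

# sha256_matches FILE EXPECTED_HEX: succeed only if FILE hashes to EXPECTED_HEX.
sha256_matches() {
  actual=$(sha256sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Usage on the instance (placeholders in angle brackets):
#   VER=<new-3.5.x-version>
#   curl -fsSLO "$(etcd_tarball_url "$VER")"
#   sha256_matches "etcd-v$VER-linux-amd64.tar.gz" "<sha256-from-release-page>" || exit 1
#   tar xzf "etcd-v$VER-linux-amd64.tar.gz"
#   sudo systemctl stop etcd.service
#   sudo install -m 0755 "etcd-v$VER-linux-amd64/etcd" \
#     "etcd-v$VER-linux-amd64/etcdctl" /usr/local/bin/
#   sudo systemctl start etcd.service
```

Stopping on a checksum mismatch before touching /usr/local/bin keeps a corrupted download from replacing a working binary.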

Screenshots

[Screenshot: etcd version and service status]

[Screenshot: etcd per-instance credentials]

[Screenshot: etcdctl endpoint health and operations]

Support

24/7 technical support is included with this AMI. Email support@cloudimg.co.uk or use the chat widget on https://www.cloudimg.co.uk/ for help with deployment, RBAC design, cluster expansion, performance tuning, or anything else etcd-related.