ClickHouse on AWS User Guide

Product: ClickHouse on AWS

Overview

This image runs ClickHouse, the open source column oriented analytics database, on a single node. ClickHouse is built for very fast queries over very large amounts of data and is widely used for real time analytics, log and event analysis, business intelligence, observability and large scale reporting. The image installs ClickHouse from the official ClickHouse package repository and pins to the stable release line.

ClickHouse exposes three network services on the running node:

  • The HTTP interface on port 8123, which serves the SQL endpoint, the /ping health probe and the built in /play browser SQL playground
  • The native TCP protocol on port 9000, which the bundled clickhouse-client command line shell and the official drivers use
  • The inter server replication protocol on port 9009, used between ClickHouse nodes in a multi node cluster

The default user password is generated on the first boot of every deployed instance, so two instances launched from the same Amazon Machine Image never share a password. The image ships with a placeholder hash. A one shot first boot service replaces it with a fresh per instance value, applies that value inside ClickHouse with ALTER USER, and writes the plaintext to /root/clickhouse-credentials.txt with mode 0600 so that only the root user can read it.

ClickHouse's data, metadata and system tables live on a dedicated EBS data volume mounted at /var/lib/clickhouse, separate from the operating system disk, so the data tier can be resized independently of the root volume.

Prerequisites

Before you deploy this image you need:

  • An Amazon Web Services account where you can launch EC2 instances
  • IAM permissions to launch instances, create security groups, and subscribe to AWS Marketplace products
  • An EC2 key pair in the target Region for SSH access to the instance
  • A VPC and subnet in the target Region, with a security group allowing inbound port 22 from your management network and inbound ports 8123 and 9000 from the trusted networks that host the applications which will talk to ClickHouse
  • The AWS CLI (version 2) installed locally if you plan to deploy from the command line
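
The security group described in the list above can also be created from the AWS CLI. The following is a minimal sketch, not part of the image: the group name clickhouse-sg, the function name and the example VPC ID and CIDR ranges are all placeholders chosen for illustration.

```shell
# Sketch: create a security group for ClickHouse and open only the needed ports.
# Assumptions (not from the guide): the group name "clickhouse-sg" and the
# vpc_id, mgmt_cidr and app_cidr arguments are values you supply.
create_clickhouse_sg() {
  vpc_id="$1"; mgmt_cidr="$2"; app_cidr="$3"

  # Create the group and capture its ID
  sg_id="$(aws ec2 create-security-group \
    --group-name clickhouse-sg \
    --description "ClickHouse single node" \
    --vpc-id "$vpc_id" \
    --query GroupId --output text)" || return 1

  # SSH from the management network only
  aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 22 --cidr "$mgmt_cidr" >/dev/null || return 1

  # HTTP (8123) and native TCP (9000) from the application network only
  for port in 8123 9000; do
    aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
      --protocol tcp --port "$port" --cidr "$app_cidr" >/dev/null || return 1
  done

  echo "$sg_id"
}

# Example (placeholder values):
#   create_clickhouse_sg vpc-0123456789abcdef0 203.0.113.0/24 10.0.1.0/24
```

The function prints the new group ID, which can be passed to the launch step as the security group.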

Step 1: Launch the Instance from the AWS Marketplace

Sign in to the AWS Management Console, open the EC2 service, and select Launch instance. Under Application and OS Images choose AWS Marketplace AMIs and search for ClickHouse. Select the cloudimg listing and choose Select, then Continue on the subscription summary.

Pick an instance type of m5.large or larger. Choose your EC2 key pair under Key pair (login). Under Network settings select your VPC and subnet, and either create or select a security group that allows inbound port 22 from your management network and inbound ports 8123 and 9000 from the trusted networks that will reach ClickHouse. Do not open the ClickHouse ports to the public internet, because the /play SQL playground is served on the HTTP port and a public SQL endpoint is a serious security exposure. Leave the root volume at the default size or larger.

Select Launch instance. First boot initialisation takes approximately one minute after the instance state becomes Running and the status checks pass.

Step 2: Launch the Instance from the AWS CLI

As an alternative to the console flow in Step 1, the following block launches an instance from the cloudimg ClickHouse Marketplace AMI into an existing subnet and security group. Replace <ami-id> with the AMI ID shown on the Marketplace listing, <key-name> with your EC2 key pair name, <subnet-id> with your subnet ID, and <security-group-id> with a security group that opens ports 22, 8123 and 9000 as described above.

aws ec2 run-instances \
  --image-id <ami-id> \
  --instance-type m5.large \
  --key-name <key-name> \
  --subnet-id <subnet-id> \
  --security-group-ids <security-group-id> \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":30,"VolumeType":"gp3"}}]' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=clickhouse-01}]'

The command prints a JSON document on success. Note the instance ID, then retrieve its public address once it is running with aws ec2 describe-instances --instance-ids <instance-id> --query "Reservations[].Instances[].PublicIpAddress" --output text.
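
Because the address is only useful once the instance has finished booting, it can help to wait for the status checks before querying it. This sketch wraps the AWS CLI waiters in a hypothetical helper function; the instance ID in the example is a placeholder.

```shell
# Sketch: block until the instance is running and has passed its status
# checks, then print its public IP address.
wait_for_instance_ip() {
  instance_id="$1"
  aws ec2 wait instance-running --instance-ids "$instance_id" || return 1
  aws ec2 wait instance-status-ok --instance-ids "$instance_id" || return 1
  aws ec2 describe-instances --instance-ids "$instance_id" \
    --query "Reservations[].Instances[].PublicIpAddress" --output text
}

# Example (placeholder instance ID):
#   wait_for_instance_ip i-0123456789abcdef0
```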

Step 3: Connect and Retrieve the Default User Password

Connect over SSH with the key pair you selected and the instance's public IP address. The SSH login user depends on the operating system of the AMI variant you launched:

AMI variant                   SSH login user
ClickHouse on Ubuntu 24.04    ubuntu

The first boot service runs before the SSH daemon is ready, so the credentials file is always in place when you log in for the first time. Read it with sudo:

sudo cat /root/clickhouse-credentials.txt

You will see a plain text file containing the ClickHouse username (default), the generated default user password, and the HTTP and native ports:

# ClickHouse — Per-Instance Credentials
# Generated on first boot: 2026-05-22 23:04:48 UTC
#
# Open the shell with:
#   clickhouse-client --password '<password below>'
#
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=<CLICKHOUSE_PASSWORD>
CLICKHOUSE_HTTP_PORT=8123
CLICKHOUSE_NATIVE_PORT=9000

Copy these values somewhere secure such as a password manager or an encrypted vault, and do not commit them to source control. Each command block in this guide that talks to ClickHouse begins by reading the default user password from the credentials file into a PASSWORD shell variable, so every block is self contained:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
echo "default user password length: ${#PASSWORD}"
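
The awk command above splits each line on every = character, so it would truncate a value that itself contains one. Whether generated passwords can contain = is not stated here; if you set your own password later, a prefix-stripping variant is safer. The helper name below is hypothetical.

```shell
# Print the value of one KEY=value line read from standard input,
# keeping any "=" characters that appear inside the value.
read_credential() {
  # sed removes only the leading "KEY=" and prints the rest of the line
  sed -n "s/^$1=//p"
}

# Example:
#   PASSWORD="$(sudo cat /root/clickhouse-credentials.txt | read_credential CLICKHOUSE_PASSWORD)"
```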

Step 4: Verify the Server is Healthy

ClickHouse exposes a dedicated health probe on the HTTP port. /ping returns the literal string Ok. and does not require authentication, so it is the right endpoint for an external load balancer or a Kubernetes liveness probe:

curl -s http://127.0.0.1:8123/ping

The response is:

Ok.
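
In a deployment script the probe is usually polled until the server answers. The following is a minimal sketch of such a loop; the function name and the default attempt count and delay are assumptions chosen for illustration.

```shell
# Poll the /ping probe until it answers "Ok." or the attempts run out.
# Arguments: base URL (default http://127.0.0.1:8123), attempts (default 30),
# delay in seconds between attempts (default 2).
wait_for_ping() {
  base_url="${1:-http://127.0.0.1:8123}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --max-time keeps a hung connection from stalling the loop
    if [ "$(curl -s --max-time 2 "$base_url/ping")" = "Ok." ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example:
#   wait_for_ping http://127.0.0.1:8123 30 2 && echo "ClickHouse is up"
```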

Run a quick SELECT version() over HTTP with HTTP Basic authentication to confirm the data plane is healthy too:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
curl -s -u "default:$PASSWORD" "http://127.0.0.1:8123/?query=SELECT+version()"

The response is the installed ClickHouse version:

26.5.1.882

And confirm the systemd unit is active:

sudo systemctl is-active clickhouse-server

The response is active.

Step 5: Server Components

Component                        Version or value                                 Source or binding
ClickHouse server                26.5.1.882                                       packages.clickhouse.com/deb stable main
clickhouse-client (CLI)          bundled with the server                          packages.clickhouse.com/deb stable main
Default user authentication      SHA256 hashed password, rotated on first boot    cloudimg image
Data directory                   /var/lib/clickhouse (dedicated EBS volume)       cloudimg image
HTTP port                        8123                                             bound on 0.0.0.0
Native TCP port                  9000                                             bound on 0.0.0.0
Inter server replication port    9009                                             bound on 0.0.0.0

Step 6: Use the clickhouse-client Command Line Shell

clickhouse-client is the bundled command line shell. Open it interactively to run queries against the local node:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$PASSWORD"

Or pass a single query with --query and stay on the shell prompt:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$PASSWORD" --query "SELECT version()"

The response is the installed ClickHouse version:

26.5.1.882

Step 7: Use the /play Browser SQL Playground

ClickHouse ships with a minimal browser SQL playground at /play on the HTTP port. It is the fastest way to explore the server interactively. Restrict port 8123 to trusted networks before you use it: the playground page loads for anyone who can reach the HTTP port, with only the user name and password standing between a visitor and your data, so it is not suitable for the public internet.

Browse to http://<instance-public-ip>:8123/play and the playground loads.

[Screenshot: ClickHouse /play empty playground]

Type the default user name in the user field and the per instance password from /root/clickhouse-credentials.txt in the password field, type a query into the editor, and click Run (or press Ctrl+Enter). A SELECT version(), hostName(), uptime() shows the running ClickHouse build, the host name and uptime in seconds:

[Screenshot: ClickHouse /play server info query]

system.tables lists every table the server knows about. A query against system.tables filtered to the system database, ordered by total_rows, shows the size of the server's own internal tables:

[Screenshot: ClickHouse /play system tables analytics]

The result panel reports the number of rows returned and the query time on the right.
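
The system.tables query described in this step is not spelled out in the text, so the following is one plausible form of it, runnable from the shell instead of the browser. The column selection, the LIMIT and the helper name are assumptions; name, engine, total_rows and total_bytes are columns of system.tables.

```shell
# Sketch of a system.tables query: the largest tables in the system database.
SYSTEM_TABLES_QUERY="
SELECT name, engine, total_rows, formatReadableSize(total_bytes) AS size
FROM system.tables
WHERE database = 'system'
ORDER BY total_rows DESC
LIMIT 10"

largest_system_tables() {
  password="$1"
  clickhouse-client --password "$password" --query "$SYSTEM_TABLES_QUERY" </dev/null
}

# Example:
#   largest_system_tables "$PASSWORD"
```

The same query text can be pasted into the /play editor.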

Step 8: Create a Database, a MergeTree Table, and Run Analytics

ClickHouse's flagship table engine is MergeTree, the columnar engine that gives ClickHouse its speed. The following block creates a demo database, a small MergeTree table of (timestamp, user_id, action) event rows, inserts three rows, and runs two analytics queries:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$PASSWORD" --query "CREATE DATABASE IF NOT EXISTS demo" </dev/null
clickhouse-client --password "$PASSWORD" --query "CREATE TABLE IF NOT EXISTS demo.events (ts DateTime, user_id UInt32, action String) ENGINE=MergeTree ORDER BY ts" </dev/null
clickhouse-client --password "$PASSWORD" --query "INSERT INTO demo.events VALUES (now(), 1, 'login'), (now(), 2, 'click'), (now(), 3, 'purchase')" </dev/null
clickhouse-client --password "$PASSWORD" --query "SELECT count() AS total FROM demo.events" </dev/null
clickhouse-client --password "$PASSWORD" --query "SELECT action, count() AS n FROM demo.events GROUP BY action ORDER BY action" </dev/null

The total row count is:

3

The group by reports one row per distinct action:

click   1
login   1
purchase    1
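
For more than a handful of rows, clickhouse-client can read the data for an INSERT from standard input. The sketch below generates synthetic CSV rows with a hypothetical helper and shows, in the trailing comment, how they could be streamed into the demo.events table created above; the timestamp and action values are made up for illustration.

```shell
# Emit N synthetic event rows as CSV: ts, user_id, action.
generate_events_csv() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    echo "2026-01-01 00:00:00,$i,click"
    i=$((i + 1))
  done
}

# Example: stream 1000 rows into the table through standard input
#   generate_events_csv 1000 | clickhouse-client --password "$PASSWORD" \
#     --query "INSERT INTO demo.events FORMAT CSV"
```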

When you are finished with the demo database, drop it:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$PASSWORD" --query "DROP DATABASE demo" </dev/null
clickhouse-client --password "$PASSWORD" --query "SHOW DATABASES" </dev/null

SHOW DATABASES now reports only the built in databases:

INFORMATION_SCHEMA
default
information_schema
system

Step 9: HTTP Interface for Drivers and Integrations

Every ClickHouse driver and integration speaks one of two protocols: the native TCP protocol on port 9000 or the HTTP interface on port 8123. The HTTP interface is the simplest to use from a script or a non Linux client because it requires nothing more than an HTTP library. The same SELECT over HTTP looks like:

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
curl -s -u "default:$PASSWORD" "http://127.0.0.1:8123/?query=SELECT+count(*)+FROM+system.tables"

The response is the number of tables the server knows about. The HTTP interface supports every ClickHouse format (TabSeparated, JSON, CSV, Pretty, etc.) and is the right choice for tools, dashboards and language drivers that prefer HTTP.
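
Output formats can be selected per request. This sketch wraps curl in a hypothetical helper that URL-encodes the SQL and passes the format through the default_format URL parameter; the helper name and its TabSeparated default are assumptions.

```shell
# Run one read-only query over the HTTP interface and print the raw response.
# -G sends the --data-urlencode pairs as a URL query string, so spaces and
# parentheses in the SQL need no manual "+" escaping.
ch_http_query() {
  password="$1"; sql="$2"; format="${3:-TabSeparated}"
  curl -sS -G -u "default:$password" \
    --data-urlencode "query=$sql" \
    --data-urlencode "default_format=$format" \
    "http://127.0.0.1:8123/"
}

# Example:
#   ch_http_query "$PASSWORD" "SELECT count(*) FROM system.tables" JSON
```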

Step 10: Security Posture

The most important security action after launch is to restrict network access. Open ports 8123 and 9000 only to your application tier and your management network. The /play browser SQL playground and the HTTP SQL endpoint are both served on port 8123, so a public 8123 means anyone can run SQL against your data once they have credentials. The per instance default user password reduces but does not eliminate the risk.

To rotate the default user password yourself, generate a new password and apply it inside ClickHouse with ALTER USER:

NEW_PASSWORD="$(openssl rand -hex 16)"
OLD_PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$OLD_PASSWORD" --query "ALTER USER default IDENTIFIED WITH sha256_password BY '$NEW_PASSWORD'" </dev/null
echo "New password: $NEW_PASSWORD"

Then update /root/clickhouse-credentials.txt so your scripts pick up the new value.
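
One way to make that update is to rewrite only the CLICKHOUSE_PASSWORD line. The helper below is a sketch with a hypothetical name; it assumes the new password contains no backslashes, which holds for the hex output of openssl rand -hex 16.

```shell
# Rewrite the CLICKHOUSE_PASSWORD line of a credentials file in place.
# Assumes the new password has no backslashes (awk -v would interpret them).
update_credentials_password() {
  file="$1"; new_password="$2"
  tmp="$(mktemp)" || return 1
  awk -v pw="$new_password" '
    /^CLICKHOUSE_PASSWORD=/ { print "CLICKHOUSE_PASSWORD=" pw; next }
    { print }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Writing through the existing file keeps its owner and mode
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example, from a root shell (sudo -i):
#   update_credentials_password /root/clickhouse-credentials.txt "$NEW_PASSWORD"
```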

To create a non administrator user for an application, use SQL managed access control (access_management is enabled on the default user):

PASSWORD="$(sudo awk -F= '/^CLICKHOUSE_PASSWORD=/ {print $2}' /root/clickhouse-credentials.txt)"
clickhouse-client --password "$PASSWORD" --query "CREATE USER analytics IDENTIFIED WITH sha256_password BY '<APP_PASSWORD>'" </dev/null
clickhouse-client --password "$PASSWORD" --query "GRANT SELECT ON demo.* TO analytics" </dev/null

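Replace the <APP_PASSWORD> placeholder with the password you generate for the application user. The grant targets demo.* as an example; because Step 8 dropped the demo database, substitute the database your application reads from.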

Step 11: HTTPS with a Reverse Proxy

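The image serves the HTTP interface over plain HTTP on port 8123. For a production deployment, put nginx, Caddy or an Application Load Balancer in front of ClickHouse and terminate TLS at the proxy. A proxy running on the instance itself (nginx or Caddy) forwards to 127.0.0.1:8123, so port 8123 can be closed at the security group. An Application Load Balancer runs outside the instance, so it forwards to the instance's private address on port 8123, and the security group should then allow port 8123 only from the load balancer. nginx and Caddy can also enforce client certificate authentication for an extra layer of protection on the HTTP endpoint.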
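
For the nginx case, a minimal sketch of the server block is shown here as a shell function that prints the configuration. nginx is not stated to be part of the image, so installing it is assumed to be a separate step, and the server name, certificate paths and target file in the example are hypothetical.

```shell
# Print a minimal nginx server block that terminates TLS and proxies to the
# ClickHouse HTTP interface on the same host.
# Arguments you supply: server name, certificate path, private key path.
nginx_clickhouse_site() {
  server_name="$1"; cert="$2"; key="$3"
  cat <<EOF
server {
    listen 443 ssl;
    server_name ${server_name};

    ssl_certificate     ${cert};
    ssl_certificate_key ${key};

    location / {
        # Forward every request to the local ClickHouse HTTP port
        proxy_pass http://127.0.0.1:8123;
        proxy_set_header Host \$host;
    }
}
EOF
}

# Example (hypothetical names and paths):
#   nginx_clickhouse_site ch.example.com /etc/ssl/certs/ch.pem /etc/ssl/private/ch.key \
#     | sudo tee /etc/nginx/conf.d/clickhouse.conf >/dev/null
#   sudo nginx -t && sudo systemctl reload nginx
```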

Step 12: Backup and Maintenance

ClickHouse's data lives on the dedicated EBS volume mounted at /var/lib/clickhouse. The most cost effective backup posture is EBS snapshots of that volume, scheduled with Amazon Data Lifecycle Manager or AWS Backup. Check the volume and its usage first:

df -h /var/lib/clickhouse

The output reports the dedicated data volume and its used and available space:

/dev/nvme1n1     30G   17M   28G   1% /var/lib/clickhouse
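
A one-off snapshot can also be taken from the AWS CLI. This is a sketch with hypothetical helper names: the first function lists the volumes attached to an instance so you can pick out the data volume by size and device, and the second snapshots the volume ID you pass in. Run it from a machine with AWS credentials; the IDs in the example are placeholders.

```shell
# List the EBS volumes attached to an instance: volume ID, size in GiB, device.
list_instance_volumes() {
  instance_id="$1"
  aws ec2 describe-volumes \
    --filters "Name=attachment.instance-id,Values=$instance_id" \
    --query "Volumes[].[VolumeId,Size,Attachments[0].Device]" --output text
}

# Start a point-in-time snapshot of one volume and print the snapshot ID.
snapshot_volume() {
  volume_id="$1"
  aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "ClickHouse data volume $(date -u +%Y-%m-%dT%H:%MZ)" \
    --query SnapshotId --output text
}

# Example (placeholder IDs):
#   list_instance_volumes i-0123456789abcdef0
#   snapshot_volume vol-0123456789abcdef0
```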

For logical backups, use the BACKUP TABLE ... TO Disk('backup', '<path>') and BACKUP DATABASE family of statements (see the ClickHouse backups documentation for the full syntax and supported destinations including S3).

To upgrade ClickHouse to the latest release on the stable line, run:

sudo apt-get update
sudo apt-get install --only-upgrade -y clickhouse-server clickhouse-client
sudo systemctl restart clickhouse-server

Step 13: Scaling Out to a Multi Node Cluster

ClickHouse scales horizontally by sharding tables across multiple servers and using replication for fault tolerance. The single node image is the starting point; for a multi node deployment:

  1. Launch additional cloudimg ClickHouse instances in the same VPC
  2. Open inter server replication port 9009 between the nodes in the security group
  3. Configure remote_servers, macros (shard and replica) and ZooKeeper or ClickHouse Keeper in /etc/clickhouse-server/config.d/
  4. Create ReplicatedMergeTree tables that ClickHouse keeps in sync across the cluster

The ClickHouse architecture documentation covers the canonical sharded + replicated topology in detail.
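
To make steps 3 and 4 concrete, the sketch below prints a config.d fragment for a single shard with two replicas and holds an example replicated table definition. Everything specific in it is an assumption for illustration: the cluster name demo_cluster, the host addresses, the macro values and the Keeper path. The ZooKeeper or ClickHouse Keeper connection settings are deliberately left out.

```shell
# Print a config.d fragment for a one shard, two replica cluster.
# Arguments: this node's replica name, then the two replica hosts.
cluster_config_fragment() {
  replica_name="$1"; host_a="$2"; host_b="$3"
  cat <<EOF
<clickhouse>
    <remote_servers>
        <demo_cluster>
            <shard>
                <internal_replication>true</internal_replication>
                <replica><host>${host_a}</host><port>9000</port></replica>
                <replica><host>${host_b}</host><port>9000</port></replica>
            </shard>
        </demo_cluster>
    </remote_servers>
    <macros>
        <shard>01</shard>
        <replica>${replica_name}</replica>
    </macros>
</clickhouse>
EOF
}

# Example table definition to run on each replica; {shard} and {replica}
# are filled in from the macros section above.
REPLICATED_DDL="CREATE TABLE demo.events
(ts DateTime, user_id UInt32, action String)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY ts"

# Example (placeholder addresses and file name):
#   cluster_config_fragment node1 10.0.1.11 10.0.1.12 \
#     | sudo tee /etc/clickhouse-server/config.d/cluster.xml >/dev/null
```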


Screenshots

ClickHouse /play SQL playground

The built in /play browser SQL playground served on port 8123, running a SELECT against the running ClickHouse node.

clickhouse-client query session

A clickhouse-client session creating a database, inserting rows into a MergeTree table and reading them back on the running ClickHouse node.

clickhouse-server version

The clickhouse-server reporting its version and the systemd unit clickhouse-server.service in an active running state.


Support

cloudimg provides 24/7 technical support for this image. Contact us by email for:

  • Deployment and configuration help
  • Schema design and query tuning
  • Multi node cluster setup, sharding and replication
  • Performance tuning, hardware sizing and capacity planning
  • Backup and recovery design
  • Database administration and upgrades

License and Trademarks

This image redistributes ClickHouse under its own open source license (Apache License 2.0). cloudimg makes no claim to any ClickHouse trademark; the name and logo of ClickHouse remain the property of their respective owners. The cloudimg image bundles the upstream open source ClickHouse packages without modification and adds first boot credential generation, a dedicated EBS data volume, and a production ready default configuration.