TensorFlow Serving 2 on AWS User Guide

TensorFlow Serving 2 is Google's flexible, high-performance serving system for machine learning models. This image delivers TF Serving 2 running under Docker, with the canonical half_plus_two sample SavedModel preloaded on a dedicated 20 GiB model volume and an nginx basic-auth gateway protecting the REST API on port 80.

Connecting to your instance

OS variant     Default login user   Connect command
Ubuntu 24.04   ubuntu               ssh -i your-key.pem ubuntu@<instance-public-ip>

First boot

On first boot a one-shot systemd service (tfserving-firstboot.service) generates a fresh per-instance nginx password and writes it to /root/tensorflow-serving-credentials.txt. The TensorFlow Serving container and nginx then start automatically once firstboot completes.

Retrieve your credentials:

sudo cat /root/tensorflow-serving-credentials.txt

Example output:

# TensorFlow Serving 2 -- Per-VM Credentials
# Generated: Wed May 27 22:15:49 UTC 2026
TFSERVING_VERSION=2.19.1
NGINX_USER=cloudimg
password=5802da51dc7fd2e3616d5a8072c2134feeab80f0eb3137f0
TFSERVING_REST_URL=http://32.198.94.22:8501
TFSERVING_GRPC_URL=32.198.94.22:8500
TFSERVING_REST_GATED_URL=http://32.198.94.22/v1
TFSERVING_SAMPLE_MODEL=half_plus_two
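
The file is a list of KEY=value lines, so any single value can be extracted in a script. The following is a minimal sketch, not part of the image: get_cred is a hypothetical helper name, and it assumes the file keeps the layout shown in the example output above.

```shell
# Hypothetical helper (not shipped with the image): print the value of one
# KEY=value line. The credentials file is read from stdin so that the caller
# decides how to obtain root access.
get_cred() {
  # Match the key exactly, then strip everything up to the first '=' so that
  # values which themselves contain '=' are returned whole.
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}
```

For example, PASS=$(sudo cat /root/tensorflow-serving-credentials.txt | get_cred password) loads the nginx password into a shell variable.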

Verifying the service

Check that the systemd unit and Docker container are running:

systemctl status tfserving.service

Example output:

* tfserving.service - TensorFlow Serving 2 Model Server (cloudimg)
     Loaded: loaded (/etc/systemd/system/tfserving.service; enabled; preset: enabled)
     Active: active (exited) since Wed 2026-05-27 22:15:52 UTC; 18s ago
    Process: 33573 ExecStart=/usr/bin/docker compose -f /opt/tfserving/docker-compose.yml up -d (code=exited, status=0/SUCCESS)
   Main PID: 33573 (code=exited, status=0/SUCCESS)
        CPU: 99ms

May 27 22:15:52 ip-172-31-85-204 systemd[1]: Starting tfserving.service ...
May 27 22:15:52 ip-172-31-85-204 docker[33587]:  Container tfserving Started
May 27 22:15:52 ip-172-31-85-204 systemd[1]: Finished tfserving.service ...

Check the model server version:

docker exec tfserving tensorflow_model_server --version

Output:

TensorFlow ModelServer: 2.19.1-rc0
TensorFlow Library: 2.19.1

Querying the model server

Health check (basic-auth gated, port 80)

PASS=$(sudo awk -F= '/^password=/{print $2}' /root/tensorflow-serving-credentials.txt)
curl -s -u "cloudimg:${PASS}" http://127.0.0.1/v1/models/half_plus_two | python3 -m json.tool

Output:

{
    "model_version_status": [
        {
            "version": "1",
            "state": "AVAILABLE",
            "status": {
                "error_code": "OK",
                "error_message": ""
            }
        }
    ]
}
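
After a restart the model can take a moment to reach the AVAILABLE state, so a script may want to poll the status endpoint rather than query it once. The sketch below is one way to do that. The function names is_available and wait_for_model, the default of 30 attempts, and the 2-second delay are all assumptions chosen for illustration.

```shell
# Hypothetical helper: succeeds if the model-status JSON on stdin contains
# a version whose state is AVAILABLE.
is_available() {
  grep -q '"state"[[:space:]]*:[[:space:]]*"AVAILABLE"'
}

# Hypothetical helper: poll a status URL until the model is AVAILABLE or the
# attempts run out. Arguments: URL, basic-auth password, attempts, delay (s).
wait_for_model() {
  local url="$1" pass="$2" attempts="${3:-30}" delay="${4:-2}" i
  for ((i = 0; i < attempts; i++)); do
    if curl -s -u "cloudimg:${pass}" "$url" | is_available; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

A possible call, reusing the PASS variable from the health check: wait_for_model http://127.0.0.1/v1/models/half_plus_two "$PASS" && echo ready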

REST predict (half_plus_two: y = 0.5x + 2)

PASS=$(sudo awk -F= '/^password=/{print $2}' /root/tensorflow-serving-credentials.txt)
curl -s -u "cloudimg:${PASS}" \
  -X POST -H 'Content-Type: application/json' \
  -d '{"instances": [1.0, 2.0, 5.0]}' \
  http://127.0.0.1/v1/models/half_plus_two:predict | python3 -m json.tool

Output:

{
    "predictions": [
        2.5,
        3.0,
        4.5
    ]
}
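
The predictions can be checked by hand against the formula y = 0.5x + 2. As a small sketch (expected_half_plus_two is a hypothetical helper name), the same arithmetic in the shell:

```shell
# Hypothetical helper: print 0.5 * x + 2 for each argument, one value per line,
# to compare against the "predictions" array returned by the model server.
expected_half_plus_two() {
  local x
  for x in "$@"; do
    awk -v x="$x" 'BEGIN { printf "%.1f\n", 0.5 * x + 2 }'
  done
}

expected_half_plus_two 1.0 2.0 5.0
```

For the inputs 1.0, 2.0, and 5.0 this prints 2.5, 3.0, and 4.5, matching the response above.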

Raw REST endpoint (port 8501, unauthenticated)

Port 8501 is also published directly on the host for clients that need to bypass nginx:

curl -s http://127.0.0.1:8501/v1/models/half_plus_two | python3 -m json.tool

gRPC endpoint (port 8500, unauthenticated)

The gRPC endpoint is available on port 8500. TF Serving has no built-in gRPC authentication -- front it with an API gateway or service mesh for public-facing workloads.

Verify the port is open:

nc -zv 127.0.0.1 8500

Output:

Connection to 127.0.0.1 8500 port [tcp/*] succeeded!

Serving your own SavedModel

TF Serving expects models in the layout <model_base>/<model_name>/<version>/saved_model.pb.

  1. Copy the contents of your SavedModel directory (saved_model.pb plus any variables/ and assets/ subdirectories) to the data volume:
sudo mkdir -p /var/lib/tfserving/models/my_model/1/
sudo cp -r /path/to/saved_model_dir/. /var/lib/tfserving/models/my_model/1/
sudo chown -R root:root /var/lib/tfserving/models/my_model/
  2. Update the compose file to load your model name:
sudo sed -i 's/MODEL_NAME: half_plus_two/MODEL_NAME: my_model/' /opt/tfserving/docker-compose.yml
  3. Restart the stack:
sudo systemctl restart tfserving.service
  4. Query your model:
PASS=$(sudo awk -F= '/^password=/{print $2}' /root/tensorflow-serving-credentials.txt)
curl -s -u "cloudimg:${PASS}" http://127.0.0.1/v1/models/my_model | python3 -m json.tool
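
If the model does not load, a common cause is a directory layout that does not match the expected <model_name>/<version>/saved_model.pb structure. The following sketch checks a model directory before the stack is restarted. check_model_layout is a hypothetical helper, and it only looks for numeric version directories containing saved_model.pb; it does not validate the model itself.

```shell
# Hypothetical helper: report every numeric version directory under the given
# model directory that contains saved_model.pb. Returns non-zero if none do.
check_model_layout() {
  local model_dir="$1" rc=1 version_dir version
  for version_dir in "$model_dir"/*/; do
    version=$(basename "$version_dir")
    # Skip anything that is not a purely numeric version directory,
    # including the unexpanded glob when the directory is empty.
    case "$version" in
      '' | *[!0-9]*) continue ;;
    esac
    if [ -f "${version_dir}saved_model.pb" ]; then
      echo "version ${version}: saved_model.pb found"
      rc=0
    fi
  done
  return "$rc"
}
```

For the example above: check_model_layout /var/lib/tfserving/models/my_model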

Model storage volume

Model files live on a dedicated 20 GiB gp3 EBS volume mounted at /var/lib/tfserving. This volume can be resized independently of the OS disk.

df -h /var/lib/tfserving

Output:

Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme1n1     20G   64K   19G   1% /var/lib/tfserving

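To expand the volume, resize it in the EC2 console, then grow the filesystem with sudo resize2fs /dev/nvme1n1. Confirm the device name with df -h first, as it can differ between instance types.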
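
Because the device name is not guaranteed to be /dev/nvme1n1 on every instance, it can be looked up from the mount point instead of being typed by hand. This is a sketch only: mount_source is a hypothetical helper, and it assumes the filesystem sits directly on the device, as in the df output above.

```shell
# Hypothetical helper: print the device (first df column) that backs a mount point.
mount_source() {
  # -P forces the single-line POSIX output format so the device is always field 1 of row 2.
  df -P "$1" | awk 'NR == 2 { print $1 }'
}
```

With it, the resize step could be written as sudo resize2fs "$(mount_source /var/lib/tfserving)".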

Managing the service

Start, stop, or restart TensorFlow Serving:

sudo systemctl start tfserving.service
sudo systemctl stop tfserving.service
sudo systemctl restart tfserving.service

View container logs:

docker logs tfserving --tail 50

Enabling TLS

For production use, terminate TLS at nginx with a certificate from Let's Encrypt or your own CA:

  1. Install certbot: sudo apt install certbot python3-certbot-nginx
  2. Obtain a certificate: sudo certbot --nginx -d your-domain.example.com
  3. Certbot will update the nginx site automatically.

Security recommendations

  • Ports 8500 and 8501 are published unauthenticated on the host. Restrict the security group rules for those ports to trusted internal CIDRs, or remove them and route all traffic through nginx on port 80.
  • The nginx basic-auth password is stored in /root/tensorflow-serving-credentials.txt (readable by root only). For production deployments, replace basic auth with a more robust mechanism such as mTLS, an OAuth2 proxy, or an API gateway.

Disabling start on boot

TensorFlow Serving and nginx are enabled on boot via systemd. If you need to prevent the model server from starting on the next boot:

sudo systemctl disable tfserving.service

Support

This image is supported by cloudimg. For technical assistance, contact support@cloudimg.co.uk.