Label Studio on AWS User Guide

Product: Label Studio on AWS

Overview

This image runs Label Studio 1.23, the open source data labeling and annotation platform for machine learning, on Ubuntu 24.04 LTS. With Label Studio you label text, images, audio, video and time series, and export the results as training datasets. It is installed into a dedicated Python 3.12 virtual environment under /opt/label-studio and runs as the unprivileged labelstudio system account. A systemd service starts the server on boot and migrates its database automatically.

The server listens on port 8080; nginx fronts it on port 80 with HTTP Basic Authentication. The unauthenticated /health probe stays open; everything else requires the password. The default security group opens port 22 (SSH) and port 80 (HTTP) only, so 8080 is not reachable externally.
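The gate behaviour described above can be checked from any machine that can reach port 80. This is a minimal sketch: the function names are illustrative, and it assumes /health answers with status 200 and that nginx answers 401 when no credentials are sent, which is its standard Basic Authentication behaviour.

```shell
# Print the HTTP status code for a URL, sending no credentials.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Succeed only if /health is open and / is password protected.
# Argument: base URL without a trailing slash, e.g. http://<instance-public-ip>
check_gate() {
  base_url=$1
  health=$(http_status "$base_url/health")
  root=$(http_status "$base_url/")
  echo "health=$health root=$root"
  [ "$health" = "200" ] && [ "$root" = "401" ]
}
```

For example, check_gate "http://<instance-public-ip>" && echo "gate OK".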

On the first boot of every deployed instance a one-shot service generates a fresh password, unique to that instance, and writes it to /root/label-studio-credentials.txt (mode 0600, root only). The same password secures the HTTP gate and the Label Studio administrator account (admin@cloudimg.local). The SQLite database, uploads and exports live under /var/lib/label-studio on a dedicated, independently resizable EBS data volume.

Prerequisites

  • An AWS account subscribed to this product in AWS Marketplace.
  • An EC2 key pair in your target region for SSH access.
  • A security group allowing inbound TCP 22 (SSH) from your IP and TCP 80 (HTTP) from your users.
  • Recommended instance type: m5.large or larger.
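If you do not yet have a suitable security group, one can be created with the AWS CLI. The sketch below is one possible approach, not part of the product: the function name and the group name label-studio-sg are illustrative, and the VPC ID and CIDR ranges are values you supply.

```shell
# Create a security group that allows SSH from one CIDR and HTTP from another.
# Arguments: <vpc-id> <ssh-cidr> <http-cidr>; prints the new group ID.
create_label_studio_sg() {
  vpc_id=$1 ssh_cidr=$2 http_cidr=$3
  sg_id=$(aws ec2 create-security-group \
    --group-name label-studio-sg \
    --description "Label Studio: SSH and HTTP" \
    --vpc-id "$vpc_id" \
    --query GroupId --output text) || return 1
  # Inbound TCP 22 (SSH), ideally restricted to your own address.
  aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 22 --cidr "$ssh_cidr" >/dev/null || return 1
  # Inbound TCP 80 (HTTP) for the people who will use Label Studio.
  aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 80 --cidr "$http_cidr" >/dev/null || return 1
  echo "$sg_id"
}
```

For example, create_label_studio_sg vpc-xxxxxxxx <your-ip>/32 <users-cidr> prints an ID that can be passed to --security-group-ids when launching.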

Connecting to your instance

OS variant     Login user   Example
Ubuntu 24.04   ubuntu       ssh -i your-key.pem ubuntu@<instance-public-ip>

Step 1 - Launch from the AWS Marketplace console

  1. Open the product page in AWS Marketplace and choose Continue to Subscribe, then Continue to Configuration.
  2. Select the Label Studio 1.23 on Ubuntu 24.04 delivery option and your region, then Continue to Launch.
  3. Choose your instance type, VPC/subnet, key pair and the security group described above, and launch.

Step 2 - Launch from the AWS CLI (alternative to Step 1)

aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type m5.large \
  --key-name your-key \
  --security-group-ids sg-xxxxxxxx \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=label-studio}]'
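After a CLI launch you still need the public IP address for the SSH step. A minimal sketch, assuming the instance receives a public IPv4 address in its subnet; the function name is illustrative.

```shell
# Wait until the instance is running, then print its public IPv4 address.
# Argument: <instance-id>
instance_public_ip() {
  instance_id=$1
  aws ec2 wait instance-running --instance-ids "$instance_id" || return 1
  aws ec2 describe-instances --instance-ids "$instance_id" \
    --query 'Reservations[0].Instances[0].PublicIpAddress' --output text
}
```

The instance ID can be captured by adding --query 'Instances[0].InstanceId' --output text to the run-instances command above.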

Step 3 - Connect to your instance

ssh -i your-key.pem ubuntu@<instance-public-ip>

Step 4 - Confirm the services are running

systemctl is-active label-studio.service nginx.service
curl -s http://127.0.0.1/health

Expected output:

active
active
{"status": "UP"}
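Immediately after first boot the server may still be starting or migrating its database, so the health check can fail for a short while. A small polling loop, sketched here with illustrative defaults (30 attempts, 5 seconds apart), avoids guessing when it is ready.

```shell
# Poll the health endpoint until it reports UP or the attempts run out.
# Arguments: [attempts] [delay-seconds] [url]
wait_for_health() {
  attempts=${1:-30} delay=${2:-5} url=${3:-http://127.0.0.1/health}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; grep looks for the UP status value.
    if curl -fs "$url" | grep -q '"UP"'; then
      echo "healthy after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}
```

Run wait_for_health on the instance with no arguments, or pass a shorter interval such as wait_for_health 10 2.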

Step 5 - Retrieve your password

sudo cat /root/label-studio-credentials.txt

Example output (the URL and password differ per instance):

# Label Studio - generated on first boot by label-studio-firstboot.service
LABEL_STUDIO_URL=http://<instance-public-ip>/
LABEL_STUDIO_USERNAME=admin@cloudimg.local
LABEL_STUDIO_PASSWORD=<your-unique-password>
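When scripting against the instance it can help to pull a single value out of the credentials file. A minimal sketch, assuming the KEY=VALUE layout shown above; cred_value is an illustrative helper, not something shipped with the image.

```shell
# Read KEY=VALUE lines on stdin and print the value for one key.
# Only the "KEY=" prefix is stripped, so "=" inside a value is preserved.
cred_value() {
  sed -n "s/^$1=//p" | head -n 1
}
```

For example, LS_PASSWORD=$(sudo cat /root/label-studio-credentials.txt | cred_value LABEL_STUDIO_PASSWORD) stores the password in a shell variable without printing it.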

Step 6 - Sign in

Browse to http://<instance-public-ip>/. The browser first prompts for the HTTP gate - enter user admin and the password from Step 5. Then sign in to Label Studio with the email admin@cloudimg.local and the same password.

Figure: Label Studio sign-in, served through the nginx reverse proxy and protected by a per-instance password

Step 7 - Create a labeling project

  1. Choose Create Project, name it, and import data (drag and drop files, paste URLs, or connect cloud storage).
  2. In Labeling Setup, pick a template - text classification, named entity recognition, image bounding boxes, audio transcription, and many more - or build a custom config.
  3. Start labeling. Annotations are saved as you go and can be exported in JSON, CSV, COCO, YOLO and other formats from the Export menu.
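As an illustration of the custom config option, the sketch below writes a small single-choice text classification config and two matching sample tasks to local files. The file names, label values and sample sentences are made up for this example; paste the config into the Labeling Setup code editor and import the JSON file as data.

```shell
# Write a minimal custom labeling config for single-choice text classification.
cat > sentiment-config.xml <<'EOF'
<View>
  <Text name="text" value="$text"/>
  <Choices name="sentiment" toName="text" choice="single">
    <Choice value="Positive"/>
    <Choice value="Negative"/>
    <Choice value="Neutral"/>
  </Choices>
</View>
EOF

# Write two sample tasks; the "text" key under "data" feeds $text above.
cat > sentiment-tasks.json <<'EOF'
[
  {"data": {"text": "The onboarding flow was quick and clear."}},
  {"data": {"text": "The export failed twice before it worked."}}
]
EOF
```

The value="$text" attribute must match the key used in each task's data, otherwise the text does not render in the labeling interface.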

Connect a machine learning backend (an external model) under Settings > Model to pre-label data and accelerate annotation.

Step 8 - Confirm the runtime

/opt/label-studio/venv/bin/pip show label-studio | grep ^Version

Expected output:

Version: 1.23.0

Production scale - PostgreSQL and Amazon S3

The image defaults to SQLite and local storage on the data volume. For team-scale use, configure PostgreSQL and S3 in /etc/label-studio/label-studio.env (see the Label Studio docs for the DJANGO_DB and POSTGRE_* variables and for the cloud storage settings), then apply the change with sudo systemctl restart label-studio.service. For S3, attach an instance role that grants access to the bucket.
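A sketch of what the PostgreSQL part of that file might contain, using the DJANGO_DB and POSTGRE_* variable names from the Label Studio documentation. The helper name and all values are placeholders, and it is an assumption that the env file accepts plain KEY=VALUE lines; check the existing file for these keys before appending.

```shell
# Print a PostgreSQL fragment for /etc/label-studio/label-studio.env.
# Arguments: <db-name> <db-user> <db-password> <db-host> [db-port]
postgres_env_fragment() {
  cat <<EOF
DJANGO_DB=default
POSTGRE_NAME=$1
POSTGRE_USER=$2
POSTGRE_PASSWORD=$3
POSTGRE_HOST=$4
POSTGRE_PORT=${5:-5432}
EOF
}
```

For example, postgres_env_fragment labelstudio ls_user '<db-password>' <db-host> | sudo tee -a /etc/label-studio/label-studio.env, followed by the service restart.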

Enabling HTTPS

sudo apt-get update && sudo apt-get install -y certbot python3-certbot-nginx
sudo certbot --nginx -d your-domain.example.com

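Before running certbot, point a DNS record for your domain at the instance and allow inbound TCP 443 in the security group, which by default opens only ports 22 and 80. certbot edits the nginx site at /etc/nginx/sites-available/cloudimg-label-studio to add the TLS listener and arranges automatic renewal.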
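Because the security group described in the prerequisites allows only ports 22 and 80, HTTPS traffic also needs an inbound rule for port 443. A minimal sketch with the AWS CLI; the function name is illustrative and the default CIDR of 0.0.0.0/0 is an assumption you may want to narrow.

```shell
# Allow inbound HTTPS (TCP 443) on an existing security group.
# Arguments: <security-group-id> [cidr]
open_https() {
  aws ec2 authorize-security-group-ingress --group-id "$1" \
    --protocol tcp --port 443 --cidr "${2:-0.0.0.0/0}"
}
```

Once the certificate is issued, sudo certbot renew --dry-run exercises the renewal path without replacing the certificate.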

Backup and maintenance

  • All Label Studio state lives under /var/lib/label-studio (the SQLite database, uploads and exports) on its own EBS volume. Snapshot that volume to back up projects and annotations.
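  • The HTTP gate password is stored, hashed, in /etc/nginx/.label-studio.htpasswd; rotate it with sudo htpasswd /etc/nginx/.label-studio.htpasswd admin. This changes only the gate password: the Label Studio administrator account keeps its existing password, and /root/label-studio-credentials.txt is not updated.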
  • Restart with sudo systemctl restart label-studio.service; logs: sudo journalctl -u label-studio.service.
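The snapshot advice above can be scripted with the AWS CLI. This is a sketch under assumptions: the function names are illustrative, and the device name of the data volume is not stated in this guide, so look it up first (for example in the EC2 console) and pass it in.

```shell
# Print the ID of the EBS volume attached to an instance at a given device.
# Arguments: <instance-id> <device-name>
data_volume_id() {
  aws ec2 describe-volumes \
    --filters "Name=attachment.instance-id,Values=$1" \
              "Name=attachment.device,Values=$2" \
    --query 'Volumes[0].VolumeId' --output text
}

# Snapshot a volume with a timestamped description; prints the snapshot ID.
# Argument: <volume-id>
snapshot_volume() {
  aws ec2 create-snapshot --volume-id "$1" \
    --description "label-studio data $(date -u +%Y-%m-%dT%H:%MZ)" \
    --query SnapshotId --output text
}
```

For example, snapshot_volume "$(data_volume_id <instance-id> <device-name>)". Stopping label-studio.service for the moment the snapshot is started avoids capturing the SQLite database in the middle of a write.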

Support

cloudimg provides 24/7 technical support for this image by email and chat, covering Label Studio deployment, labeling configuration, machine learning backends, database and storage configuration, TLS termination and scaling. Contact details are on the AWS Marketplace listing.