
Trino on Ubuntu 24.04 on Azure User Guide

Product: Trino on Ubuntu 24.04 LTS on Azure

Overview

Trino (formerly PrestoSQL) is a fast, distributed SQL query engine for big-data analytics. It federates queries across data sources where the data lives, so you can run interactive SQL over object storage, relational databases, NoSQL stores and streaming systems without first copying everything into a warehouse. The cloudimg image installs Trino 476 at /opt/trino as a single-node deployment (coordinator and worker on one node) and runs it as a dedicated trino system user. The Trino HTTP server is bound to loopback behind an nginx reverse proxy on TCP 80 with HTTP Basic auth, Trino's data directory is persisted on a dedicated Azure data disk, and a unique web password is generated on the first boot of every VM. A built-in tpch catalog provides TPC-H sample data, so the appliance can answer queries out of the box. The image is backed by 24/7 cloudimg support.

What is included:

  • Trino 476 (single-node coordinator + worker) at /opt/trino, running on a Temurin 24 JRE
  • The Trino Web UI and REST API, fronted by nginx on :80
  • nginx HTTP Basic auth (Trino's Web UI/API is otherwise open by default) with a per-VM password in a root-only file
  • The Trino CLI at /opt/trino/bin/trino
  • A built-in tpch catalog (connector.name=tpch) with sample data so you can query immediately
  • A dedicated Azure data disk at /var/lib/trino holding Trino's data directory and spill, separate from the OS disk and re-provisioned with every VM
  • trino.service + nginx.service as systemd units, enabled and active
  • 24/7 cloudimg support

Prerequisites

You need an active Azure subscription, an SSH key pair, and a VNet + subnet in the target region. Standard_B4ms (4 vCPU / 16 GiB RAM) is a good starting point; Trino is a JVM query engine and benefits from RAM, so scale up for higher concurrency and larger datasets. NSG inbound: allow 22/tcp from your management network and 80/tcp for the Web UI and API. Before exposing port 80 publicly, put TLS in front of it (see Enabling HTTPS).

Step 1 — Deploy from the Azure Marketplace

Sign in to the Azure Portal, choose Create a resource, search the Marketplace for Trino by cloudimg, and select Create. On Basics pick your subscription, resource group, region and size; under Administrator account choose SSH public key and paste your key; under Inbound port rules allow SSH (22) and HTTP (80). Review the dedicated data disk on the Disks tab, then select Review + create, followed by Create.

Step 2 — Deploy from the Azure CLI

az vm create \
  --resource-group <your-rg> \
  --name trino \
  --image <marketplace-image-urn> \
  --size Standard_B4ms \
  --admin-username azureuser \
  --ssh-key-values ~/.ssh/id_ed25519.pub \
  --vnet-name <your-vnet> --subnet <your-subnet> \
  --public-ip-sku Standard

az vm open-port --resource-group <your-rg> --name trino --port 80 --priority 1010
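
Step 3 needs the VM's public IP address. As a minimal sketch, the helper below looks it up with az vm show; the function name vm_public_ip is hypothetical, and it assumes the VM was created with a public IP as in the command above.

```shell
# Print the public IP address of a VM.
# vm_public_ip is an illustrative helper name, not part of the image or the Azure CLI.
vm_public_ip() {
  rg="$1"      # resource group that holds the VM
  name="$2"    # VM name, "trino" in the example above
  az vm show --show-details \
    --resource-group "$rg" --name "$name" \
    --query publicIps --output tsv
}

# Example (run on your workstation, after az login):
#   ssh "azureuser@$(vm_public_ip <your-rg> trino)"
```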

Step 3 — Connect to your VM

ssh azureuser@<vm-public-ip>

Step 4 — Confirm the services are running

systemctl is-active trino.service nginx.service

Both services report active. Trino is a JVM query engine, so it takes around 30–60 seconds after boot to finish starting; once started it accepts SQL immediately.
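
Because startup takes a little while, a script that runs right after boot may want to wait until the engine reports that it has finished starting. The sketch below polls /v1/info until starting is false. It assumes the loopback listener at http://127.0.0.1:8080 that the CLI examples in this guide use; wait_for_trino and its defaults (30 attempts, 5 seconds apart) are illustrative choices.

```shell
# Poll Trino's /v1/info until it reports "starting": false.
# Returns 0 once Trino is ready, 1 if the attempts run out.
wait_for_trino() {
  url="${1:-http://127.0.0.1:8080/v1/info}"   # assumed loopback address
  attempts="${2:-30}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # grep succeeds only when the JSON body contains "starting":false
    if curl -fsS "$url" 2>/dev/null |
       grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_trino && echo "Trino is ready"
```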

Step 5 — Retrieve your web password

The admin password is generated uniquely on the first boot of your VM and written to a root-only file:

sudo cat /root/trino-credentials.txt

This file contains trino.user (admin) and trino.password, plus the URLs for the Web UI and API. Store the password somewhere safe.
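
For scripting, it can be handy to load the password into a shell variable rather than copying it by hand. The exact layout of the credentials file is not described here, so the sketch below assumes one key per line in the form trino.password=value (a colon separator is also accepted); check the file and adjust the pattern if it differs. extract_trino_password is a hypothetical helper name.

```shell
# Read credentials text on stdin and print the value of trino.password.
# Assumes "trino.password=<value>" or "trino.password: <value>" on its own line.
extract_trino_password() {
  sed -n 's/^trino\.password[[:space:]]*[=:][[:space:]]*//p' | head -n 1
}

# Example:
#   TRINO_PASSWORD="$(sudo cat /root/trino-credentials.txt | extract_trino_password)"
```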

Step 6 — Check the health endpoint

nginx serves an unauthenticated health endpoint for load balancers and probes:

curl -s http://localhost/health

It returns ok.
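
A quick way to confirm the proxy behaves as described is to compare HTTP status codes: the health endpoint should answer without credentials, while the Web UI should not. The sketch below assumes /health answers 200 and that nginx answers 401 for an unauthenticated request to /ui/; both function names are illustrative.

```shell
# Print only the HTTP status code for a URL; extra arguments are passed to curl.
http_status() {
  url="$1"
  shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# Check that /health is open and /ui/ asks for credentials.
check_proxy_surface() {
  base="${1:-http://localhost}"
  [ "$(http_status "$base/health")" = "200" ] ||
    { echo "health check failed"; return 1; }
  [ "$(http_status "$base/ui/")" = "401" ] ||
    { echo "Web UI is not asking for credentials"; return 1; }
  echo "proxy surface looks as expected"
}

# Example:
#   check_proxy_surface http://localhost
```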

Step 7 — Open the Trino Web UI

Browse to http://<vm-public-ip>/ui/ and sign in as admin with the password from Step 5. The Web UI shows the cluster overview — running, queued and finished queries, plus worker and resource status. Submit a query (see Step 8 or Step 9) and it appears in the query list; click any query to inspect its stages, splits and timeline.

[Screenshot: Trino Web UI cluster overview with the query list]

Click a finished query to drill into its detail — the execution stages, operator pipeline and timing:

[Screenshot: Trino Web UI finished query detail showing stages]

The cluster view reports the active worker, its node status and the running build:

[Screenshot: Trino Web UI cluster and worker status]

A completed tpch query appears in the query list with its state, elapsed time and rows processed:

[Screenshot: Trino Web UI showing a completed tpch query]

Step 8 — Query with the REST API

The REST API is available behind the same Basic auth. Confirm Trino is serving and report its version and readiness:

curl -s -u admin:<TRINO_ADMIN_PASSWORD> http://localhost/v1/info; echo

You get a JSON response whose nodeVersion.version is 476 and starting is false once the engine is fully up.
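
Queries can also be submitted over REST. In Trino's client protocol you POST the SQL text to /v1/statement and then follow each nextUri until none is returned. The sketch below does this with curl and prints every JSON page it receives. It is written under assumptions: it talks to the loopback listener at http://127.0.0.1:8080 and names the user with the X-Trino-User header, because how the nginx proxy forwards user identity is not described here. trino_rest_query is a hypothetical name, and the text-based nextUri extraction is a shortcut that a JSON parser would do more robustly.

```shell
# Submit a SQL statement over Trino's REST protocol and print each response page.
trino_rest_query() {
  sql="$1"
  base="${2:-http://127.0.0.1:8080}"   # assumed loopback address
  user="${3:-admin}"
  # The first request carries the SQL text as the POST body.
  resp="$(curl -fsS -H "X-Trino-User: $user" --data-raw "$sql" \
    "$base/v1/statement")" || return 1
  while :; do
    printf '%s\n' "$resp"
    # Pull the nextUri value out of the JSON; empty when the query is finished.
    next="$(printf '%s\n' "$resp" |
      grep -o '"nextUri" *: *"[^"]*"' | cut -d'"' -f4)"
    [ -n "$next" ] || break
    resp="$(curl -fsS -H "X-Trino-User: $user" "$next")" || return 1
  done
}

# Example:
#   trino_rest_query 'SELECT count(*) FROM tpch.tiny.nation'
```

Result rows arrive in the data field of one or more pages, and a failed query carries an error field instead.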

Step 9 — Run a SQL query with the Trino CLI

The Trino CLI ships at /opt/trino/bin/trino. The built-in tpch catalog exposes the standard TPC-H schemas (tiny, sf1, sf100, …), whose data the connector generates on the fly, so you can run queries with no external data source. Run a query against tpch.tiny:

/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny --execute 'SELECT count(*) FROM nation'

It returns 25. Try a richer query — total order value by market segment:

/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny --execute "SELECT c.mktsegment, round(sum(o.totalprice), 2) AS revenue FROM orders o JOIN customer c ON o.custkey = c.custkey GROUP BY c.mktsegment ORDER BY revenue DESC"

You can also start an interactive session by omitting --execute:

/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny
trino:tiny> SHOW SCHEMAS;
trino:tiny> SELECT name FROM region ORDER BY regionkey;
trino:tiny> quit;
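
To hand results to another tool, the CLI can write CSV through its --output-format option. The sketch below wraps the same connection settings used above and writes a query result, with a header row, to a file. The trino_export_csv name and the TRINO_CLI override variable are illustrative.

```shell
# Path to the Trino CLI; override TRINO_CLI if it lives elsewhere.
TRINO_CLI="${TRINO_CLI:-/opt/trino/bin/trino}"

# Run a query against tpch.tiny and save the result as CSV with a header row.
trino_export_csv() {
  sql="$1"   # SQL text to run
  out="$2"   # destination file
  "$TRINO_CLI" --server http://127.0.0.1:8080 --catalog tpch --schema tiny \
    --output-format CSV_HEADER --execute "$sql" > "$out"
}

# Example:
#   trino_export_csv 'SELECT name FROM region ORDER BY regionkey' regions.csv
```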

Step 10 — Connect BI tools and applications

Trino speaks the standard Trino/Presto protocol over HTTP, so most BI and analytics tools connect through the official JDBC/ODBC drivers or a native Trino connector. Point the client at the VM on port 80, authenticate with user admin and the password from Step 5, and select the tpch catalog (or any catalog you add). For example, the JDBC URL is:

jdbc:trino://<vm-public-ip>:80/tpch/tiny

with username admin and the per-VM password. Tools such as DBeaver, Superset, Tableau, Power BI and Metabase all connect this way. For production use, terminate TLS in front of nginx (see Enabling HTTPS) and use jdbc:trino://<host>:443/... with SSL enabled. To query your own data, add a catalog file under /opt/trino/etc/catalog/ (for example a PostgreSQL, MySQL, Hive or Iceberg connector) and restart Trino.
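
As an illustration of adding a catalog, the sketch below writes a PostgreSQL catalog file into a staging directory using the connector's connection-url, connection-user and connection-password properties. The function name, the staging approach, and the host, database and account in the example are all hypothetical. The commented install step also assumes the catalog files should be owned by the trino system user.

```shell
# Write a PostgreSQL catalog file to a staging directory and print its path.
write_postgres_catalog() {
  staging_dir="$1"; name="$2"; jdbc_url="$3"; db_user="$4"; db_pass="$5"
  file="$staging_dir/$name.properties"
  cat > "$file" <<EOF
connector.name=postgresql
connection-url=$jdbc_url
connection-user=$db_user
connection-password=$db_pass
EOF
  chmod 600 "$file"   # the file holds a database password
  echo "$file"
}

# Example (placeholder values):
#   f="$(write_postgres_catalog /tmp sales \
#        jdbc:postgresql://db.example.internal:5432/sales reader 'change-me')"
#   sudo install -o trino -g trino -m 600 "$f" /opt/trino/etc/catalog/sales.properties
#   sudo systemctl restart trino
```

After the restart, SHOW CATALOGS in the CLI should list the new catalog under the file's base name.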

Step 11 — Confirm data lives on the dedicated disk

Trino's data directory is stored on the dedicated Azure data disk so it survives OS changes and can be resized independently:

findmnt /var/lib/trino

The mount is backed by a separate Azure data disk captured into the image and re-provisioned on every VM.
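
For a scripted check, the sketch below reports the device behind the mount and its free space, and fails if the directory is not a mount point of its own. data_disk_report is an illustrative name.

```shell
# Report the device backing a directory and its disk usage.
# Returns 1 when the directory is not a separate mount point.
data_disk_report() {
  dir="${1:-/var/lib/trino}"
  # findmnt prints nothing and exits non-zero when dir is not a mount point
  if src="$(findmnt -n -o SOURCE "$dir")" && [ -n "$src" ]; then
    echo "$dir is mounted from $src"
    df -h "$dir"
  else
    echo "$dir is not a separate mount point" >&2
    return 1
  fi
}

# Example:
#   data_disk_report /var/lib/trino
```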

Enabling HTTPS

The nginx reverse proxy terminates plain HTTP on port 80. For public exposure, put a certificate in front of it. The simplest path is to add a DNS name for the VM and use the companion cloudimg nginx-ssl-certbot image as a TLS reverse proxy, or install certbot and extend the existing nginx site with a listen 443 ssl; server block and your certificate paths. Keep Trino itself bound to loopback so the only public surface is the authenticated, TLS-terminated proxy.
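
If you take the certbot route, the steps might look like the sketch below, which installs certbot with its nginx plugin from the Ubuntu repositories and requests a certificate. It assumes your DNS name already resolves to the VM, that ports 80 and 443 are allowed in the NSG, and that you accept the certificate authority's terms of service (--agree-tos). Certbot may also need the existing site's server_name set to your DNS name before it can update the configuration. enable_https is a hypothetical helper name.

```shell
# Install certbot and request a certificate for the nginx site.
enable_https() {
  domain="$1"   # DNS name that resolves to this VM
  email="$2"    # contact address for certificate expiry notices
  sudo apt-get update &&
    sudo apt-get install -y certbot python3-certbot-nginx &&
    sudo certbot --nginx -d "$domain" -m "$email" \
      --agree-tos --non-interactive --redirect
}

# Example:
#   enable_https trino.example.com ops@example.com
```

Check afterwards that the Basic auth settings still apply to the HTTPS server block, since the layout of the shipped nginx site is not described here.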

Maintenance

  • Configuration: Trino's config lives under /opt/trino/etc/ (config.properties, node.properties, jvm.config) and catalogs under /opt/trino/etc/catalog/. Edit a file and sudo systemctl restart trino to apply changes.
  • Adding data sources: drop a <name>.properties catalog file in /opt/trino/etc/catalog/ (e.g. connector.name=postgresql with the JDBC URL and credentials) and restart Trino; the catalog is visible to clients as soon as Trino is back up.
  • Backups: snapshot the /var/lib/trino data disk. Trino does not persist query results; the state worth keeping is the configuration under /opt/trino/etc, which is a separate path from the /var/lib/trino mount, so keep a copy of it as well.
  • Upgrades: replace the contents of /opt/trino with a newer release (matching the required JDK) and restart the service.
  • Security patches: unattended-upgrades remains enabled so the OS continues to receive security updates automatically.
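
The data disk snapshot mentioned under Backups can be taken from your workstation with the Azure CLI. The sketch below looks up the managed disk ID of the VM's first data disk and passes it to az snapshot create. It assumes the Trino disk is the first entry in the VM's data disk list; confirm this in the portal before relying on it. snapshot_trino_data_disk is a hypothetical helper name.

```shell
# Snapshot the first data disk attached to a VM.
snapshot_trino_data_disk() {
  rg="$1"     # resource group of the VM
  vm="$2"     # VM name
  snap="$3"   # name for the new snapshot
  disk_id="$(az vm show --resource-group "$rg" --name "$vm" \
    --query 'storageProfile.dataDisks[0].managedDisk.id' --output tsv)" || return 1
  [ -n "$disk_id" ] || { echo "no data disk found on $vm" >&2; return 1; }
  az snapshot create --resource-group "$rg" --name "$snap" --source "$disk_id"
}

# Example:
#   snapshot_trino_data_disk <your-rg> trino trino-data-snapshot
```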

Support

cloudimg provides 24/7 expert support for this image. Contact support@cloudimg.co.uk.

Trino is a trademark of the Trino Software Foundation. This image is produced by cloudimg and is not affiliated with or endorsed by the Trino project. Trino is distributed under the Apache License 2.0.