Trino on Ubuntu 24.04 on Azure User Guide
Overview
Trino (formerly PrestoSQL) is a fast, distributed SQL query engine for big-data analytics. It federates queries across data sources where the data lives, so you can run interactive SQL over object storage, relational databases, NoSQL stores and streaming systems without first copying everything into a warehouse. The cloudimg image installs Trino 476 at /opt/trino as a single-node deployment (coordinator and worker on one node), runs it as a dedicated trino system user, binds the HTTP server to loopback behind an nginx reverse proxy on TCP 80 with HTTP Basic auth, persists Trino's data directory on a dedicated Azure data disk, and generates a unique web password on the first boot of every VM. A built-in tpch catalog ships so the appliance queries real sample data out of the box. Backed by 24/7 cloudimg support.
What is included:
- Trino 476 (single-node coordinator + worker) at /opt/trino, running on a Temurin 24 JRE
- The Trino Web UI and REST API, fronted by nginx on :80
- nginx HTTP Basic auth (Trino's Web UI/API is otherwise open by default) with a per-VM password in a root-only file
- The Trino CLI at /opt/trino/bin/trino
- A built-in tpch catalog (connector.name=tpch) with sample data so you can query immediately
- A dedicated Azure data disk at /var/lib/trino holding Trino's data directory and spill, separate from the OS disk and re-provisioned with every VM
- trino.service + nginx.service as systemd units, enabled and active
- 24/7 cloudimg support
Prerequisites
An active Azure subscription, an SSH key pair, and a VNet + subnet in the target region. Standard_B4ms (4 vCPU / 16 GiB RAM) is a good starting point; Trino is a JVM query engine and benefits from RAM, so scale up for higher concurrency and larger datasets. NSG inbound: allow 22/tcp from your management network and 80/tcp for the Web UI and API (front with TLS for public exposure — see Enabling HTTPS).
Step 1 — Deploy from the Azure Marketplace
Sign in to the Azure Portal, choose Create a resource, search the Marketplace for Trino by cloudimg, and select Create. On Basics pick your subscription, resource group, region and size; under Administrator account choose SSH public key and paste your key; under Inbound port rules allow SSH (22) and HTTP (80). Review the dedicated data disk on the Disks tab, then Review + create → Create.
Step 2 — Deploy from the Azure CLI
az vm create \
--resource-group <your-rg> \
--name trino \
--image <marketplace-image-urn> \
--size Standard_B4ms \
--admin-username azureuser \
--ssh-key-values ~/.ssh/id_ed25519.pub \
--vnet-name <your-vnet> --subnet <your-subnet> \
--public-ip-sku Standard
az vm open-port --resource-group <your-rg> --name trino --port 80 --priority 1010
Step 3 — Connect to your VM
ssh azureuser@<vm-public-ip>
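If you deployed from the CLI and do not have the public address to hand, one way to look it up is sketched below. The function name vm_public_ip is hypothetical, and the default VM name trino is taken from the az vm create example above; it assumes a logged-in Azure CLI.

```shell
# Print the public IP address of a VM using the Azure CLI.
# vm_public_ip is an illustrative helper name, not part of the image.
vm_public_ip() {
  local rg name
  rg="$1"              # resource group, e.g. the <your-rg> used at deploy time
  name="${2:-trino}"   # VM name; defaults to the name used in Step 2
  az vm show --show-details \
    --resource-group "$rg" --name "$name" \
    --query publicIps --output tsv
}

# Usage (requires `az login` first):
#   ssh "azureuser@$(vm_public_ip <your-rg>)"
```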
Step 4 — Confirm the services are running
systemctl is-active trino.service nginx.service
Both services report active. Trino is a JVM query engine, so it takes around 30–60 seconds after boot to finish starting; once started it accepts SQL immediately.
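Because of that startup delay, automation may want to wait for readiness instead of sleeping for a fixed time. The following is a minimal sketch, not part of the image: it polls /v1/info until the starting field is false. It assumes the loopback listener at http://127.0.0.1:8080 (the address used by the CLI examples in Step 9) answers without credentials; the function name and its retry defaults are illustrative.

```shell
# Poll Trino's /v1/info until it reports "starting": false.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_trino() {
  local url tries delay i body
  url="${1:-http://127.0.0.1:8080/v1/info}"
  tries="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # A failed connection yields an empty body and simply triggers a retry.
    body="$(curl -s --max-time 5 "$url" || true)"
    if printf '%s' "$body" | grep -q '"starting"[[:space:]]*:[[:space:]]*false'; then
      echo "Trino is ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Trino did not become ready after $tries attempt(s)" >&2
  return 1
}

# Usage:
#   wait_for_trino && echo "safe to run queries"
```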
Step 5 — Retrieve your web password
The admin password is generated uniquely on the first boot of your VM and written to a root-only file:
sudo cat /root/trino-credentials.txt
This file contains trino.user (admin) and trino.password, plus the URLs for the Web UI and API. Store the password somewhere safe.
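For scripting, it can help to load the password into a shell variable rather than copying it by hand. The sketch below assumes the credentials file uses one key=value pair per line (for example trino.password=...); that layout is an assumption, so check the file and adjust if it differs. The helper name cred_value is hypothetical.

```shell
# Read key=value lines on stdin and print the value for the requested key.
# Assumes one "key=value" pair per line; everything after the first "=" is
# treated as the value, so passwords containing "=" are preserved.
cred_value() {
  awk -v k="$1" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }'
}

# Usage (the file is root-only, so it is read through sudo):
#   TRINO_ADMIN_PASSWORD="$(sudo cat /root/trino-credentials.txt | cred_value trino.password)"
```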
Step 6 — Check the health endpoint
nginx serves an unauthenticated health endpoint for load balancers and probes:
curl -s http://localhost/health
It returns ok.
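A monitoring script can turn that response into an exit code. This is a small sketch with a hypothetical function name; it only assumes the endpoint body is the word ok, as described above.

```shell
# Exit 0 if the health endpoint answers "ok", non-zero otherwise.
check_health() {
  local url body
  url="${1:-http://localhost/health}"
  body="$(curl -s --max-time 5 "$url")" || return 1
  # Strip whitespace (such as a trailing newline) before comparing.
  [ "$(printf '%s' "$body" | tr -d '[:space:]')" = "ok" ]
}

# Usage:
#   check_health || echo "proxy is not healthy" >&2
```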
Step 7 — Open the Trino Web UI
Browse to http://<vm-public-ip>/ui/ and sign in as admin with the password from Step 5. The Web UI shows the cluster overview — running, queued and finished queries, plus worker and resource status. Submit a query (see Step 8 or Step 9) and it appears in the query list; click any query to inspect its stages, splits and timeline.

Click a finished query to drill into its detail — the execution stages, operator pipeline and timing:

The cluster view reports the active worker, its node status and the running build:

A completed tpch query appears in the query list with its state, elapsed time and rows processed:

Step 8 — Query with the REST API
The REST API is available behind the same Basic auth. Confirm Trino is serving and report its version and readiness:
curl -s -u admin:<TRINO_ADMIN_PASSWORD> http://localhost/v1/info; echo
You get a JSON response whose nodeVersion.version is 476 and starting is false once the engine is fully up.
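To check the version from a script without a JSON tool installed, the response can be filtered with sed. This sketch assumes the response arrives as a single line with nodeVersion.version in the shape described above; trino_version is an illustrative name, and a proper JSON parser is more robust if one is available.

```shell
# Read a /v1/info JSON document on stdin and print nodeVersion.version.
# Works on single-line JSON; prints nothing if the field is not found.
trino_version() {
  sed -n 's/.*"nodeVersion"[^}]*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage:
#   curl -s -u "admin:$TRINO_ADMIN_PASSWORD" http://localhost/v1/info | trino_version
```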
Step 9 — Run a SQL query with the Trino CLI
The Trino CLI ships at /opt/trino/bin/trino. The built-in tpch catalog exposes the standard TPC-H schemas (tiny, sf1, sf100, …) so you can query real sample data with no external data source. Run a query against tpch.tiny:
/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny --execute 'SELECT count(*) FROM nation'
It returns 25. Try a richer query — total order value by market segment:
/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny --execute "SELECT c.mktsegment, round(sum(o.totalprice), 2) AS revenue FROM orders o JOIN customer c ON o.custkey = c.custkey GROUP BY c.mktsegment ORDER BY revenue DESC"
You can also start an interactive session by omitting --execute:
/opt/trino/bin/trino --server http://127.0.0.1:8080 --catalog tpch --schema tiny
trino:tiny> SHOW SCHEMAS;
trino:tiny> SELECT name FROM region ORDER BY regionkey;
trino:tiny> quit;
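The CLI can also feed scripts. The sketch below wraps the nation count from above as a smoke test, using the CLI's --output-format option to get a bare value. The function names and the TRINO_CLI and TRINO_SERVER variables are hypothetical conveniences, not settings the image defines.

```shell
# Run a single-value query against tpch.tiny and print the raw result.
trino_scalar() {
  "${TRINO_CLI:-/opt/trino/bin/trino}" \
    --server "${TRINO_SERVER:-http://127.0.0.1:8080}" \
    --catalog tpch --schema tiny \
    --output-format CSV_UNQUOTED \
    --execute "$1"
}

# Succeed only if the nation table has the 25 rows the guide expects.
smoke_test() {
  local rows
  rows="$(trino_scalar 'SELECT count(*) FROM nation')" || return 1
  [ "$rows" = "25" ]
}

# Usage:
#   smoke_test && echo "tpch catalog is answering queries"
```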
Step 10 — Connect BI tools and applications
Trino exposes its client protocol over HTTP, so most BI and analytics tools connect through the official JDBC/ODBC drivers or a native Trino connector. Point the client at the VM on port 80, authenticate with user admin and the password from Step 5, and select the tpch catalog (or any catalog you add). For example, the JDBC URL is:
jdbc:trino://<vm-public-ip>:80/tpch/tiny
with username admin and the per-VM password. Tools such as DBeaver, Superset, Tableau, Power BI and Metabase all connect this way. For production use, terminate TLS in front of nginx (see Enabling HTTPS) and use jdbc:trino://<host>:443/... with SSL enabled. To query your own data, add a catalog file under /opt/trino/etc/catalog/ (for example a PostgreSQL, MySQL, Hive or Iceberg connector) and restart Trino.
Step 11 — Confirm data lives on the dedicated disk
Trino's data directory is stored on the dedicated Azure data disk so it survives OS changes and can be resized independently:
findmnt /var/lib/trino
The mount is backed by a separate Azure data disk captured into the image and re-provisioned on every VM.
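To assert this in a script rather than by eye, you can compare the device behind /var/lib/trino with the device behind the root filesystem. This is a minimal sketch with a hypothetical function name; it relies only on findmnt's --target, -n and -o options.

```shell
# Succeed if the given path is served by a different device than "/".
on_separate_disk() {
  local path data root
  path="${1:-/var/lib/trino}"
  data="$(findmnt -n -o SOURCE --target "$path")" || return 1
  root="$(findmnt -n -o SOURCE --target /)" || return 1
  [ -n "$data" ] && [ "$data" != "$root" ]
}

# Usage:
#   on_separate_disk && echo "/var/lib/trino is on its own disk"
```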
Enabling HTTPS
The nginx reverse proxy terminates plain HTTP on port 80. For public exposure, put a certificate in front of it. The simplest path is to add a DNS name for the VM and use the companion cloudimg nginx-ssl-certbot image as a TLS reverse proxy, or install certbot and extend the existing nginx site with a listen 443 ssl; server block and your certificate paths. Keep Trino itself bound to loopback so the only public surface is the authenticated, TLS-terminated proxy.
Maintenance
- Configuration: Trino's config lives under /opt/trino/etc/ (config.properties, node.properties, jvm.config) and catalogs under /opt/trino/etc/catalog/. Edit a file and run sudo systemctl restart trino to apply changes.
- Adding data sources: drop a <name>.properties catalog file in /opt/trino/etc/catalog/ (e.g. connector.name=postgresql with the JDBC URL and credentials) and restart Trino; the catalog appears immediately to clients.
- Backups: snapshot the /var/lib/trino data disk. Trino itself is stateless for query results; persisted state is the catalog configuration under /opt/trino/etc.
- Upgrades: replace the contents of /opt/trino with a newer release (matching the required JDK) and restart the service.
- Security patches: unattended-upgrades remains enabled so the OS continues to receive security updates automatically.
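As an illustration of the "Adding data sources" item, the sketch below writes a PostgreSQL catalog file. The catalog name pg, the host, database, user and password defaults, and the PG_* variable names are all placeholders invented for this example; the property keys are the PostgreSQL connector's standard ones.

```shell
# Write a PostgreSQL catalog file. Arguments (optional): catalog directory
# and catalog name. Connection details come from PG_* variables, which are
# placeholders for this example and must be replaced with real values.
write_pg_catalog() {
  local dir name
  dir="${1:-/opt/trino/etc/catalog}"
  name="${2:-pg}"
  cat > "$dir/$name.properties" <<EOF
connector.name=postgresql
connection-url=jdbc:postgresql://${PG_HOST:-db.example.internal}:5432/${PG_DB:-appdb}
connection-user=${PG_USER:-trino_reader}
connection-password=${PG_PASSWORD:-change-me}
EOF
}

# Usage (in a root shell, since /opt/trino/etc is not user-writable):
#   write_pg_catalog && systemctl restart trino
# The file holds a password and must stay readable by the trino user.
```

After the restart, tables should be addressable as pg.<schema>.<table>, following the catalog file name.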
Support
cloudimg provides 24/7 expert support for this image. Contact support@cloudimg.co.uk.
Trino is a trademark of the Trino Software Foundation. This image is produced by cloudimg and is not affiliated with or endorsed by the Trino project. Trino is distributed under the Apache License 2.0.