DuckDB 1.5 with Web UI on Ubuntu 24.04 on Azure User Guide

Product: DuckDB 1.5 on Ubuntu 24.04 LTS on Azure

Overview

This guide covers the deployment and configuration of DuckDB 1.5.2 with its built-in Web UI on Ubuntu 24.04 LTS on Azure using cloudimg Azure Marketplace images. DuckDB is an in-process analytical database engine that reads Parquet, CSV, JSON, Iceberg, and Delta Lake tables directly without an extract-and-load step. The 1.5 series ships the integrated Web UI (a SQL editor, schema browser, query plan visualiser, and notebook-style cell layout), served by the DuckDB process itself.

The image installs the official DuckDB 1.5.2 CLI binary at /usr/local/bin/duckdb and runs the Web UI as duckdb-ui.service, a long-lived Python wrapper that holds the persistent database open and serves the SPA on 127.0.0.1:4213. nginx fronts the UI on port 80 with HTTP basic auth — the user is duckdb, the password is rotated on first boot to a 32-character random value and written to /stage/scripts/duckdb-credentials.log (mode 0600, root only).

The DuckDB UI extension is pre-fetched and warmed during build time, so the first customer boot has zero outbound dependencies on community.duckdb.org or the extension CDN. The persistent database lives at /var/lib/duckdb/main.duckdb owned by the duckdb system user.

What is included:

  • DuckDB 1.5.2 CLI binary at /usr/local/bin/duckdb — official linux amd64 release tarball, SHA256 verified at install time

  • DuckDB UI extension pre-warmed into /var/lib/duckdb/.duckdb/extensions/v1.5.2/linux_amd64/ui.duckdb_extension

  • duckdb-ui.service systemd unit running as the duckdb system user, holding the persistent database open and serving the UI on 127.0.0.1:4213

  • nginx 1.24 reverse proxy on port 80 with HTTP basic auth, WebSocket upgrade for live query results, and a no-auth /healthz endpoint for load balancer health checks

  • duckdb-firstboot.service systemd oneshot that rotates the UI basic-auth password on first customer boot and writes credentials to /stage/scripts/duckdb-credentials.log (mode 0600 root only)

  • Persistent database at /var/lib/duckdb/main.duckdb owned duckdb:duckdb

  • Python 3.12 venv at /opt/duckdb-venv with the duckdb Python module pinned to 1.5.2 (used by the UI server wrapper; available for any Python analytics work)

  • Bundled extensions on first launch: httpfs, parquet, json, icu, autocomplete, core_functions, jemalloc, shell, ui

  • Ubuntu 24.04 LTS (Noble Numbat) base with latest security patches applied at build time

  • Azure Linux Agent for seamless cloud integration and SSH key injection

  • 24/7 cloudimg support with guaranteed 24 hour response SLA

Prerequisites

  • An active Azure subscription

  • A subscription to the DuckDB 1.5 on Ubuntu 24.04 listing on Azure Marketplace

  • An SSH public key for VM authentication

  • A virtual network and subnet in the target region

Recommended virtual machine size: Standard_D2s_v5 (2 vCPU, 8 GB RAM) for production analytical workloads — DuckDB benefits from L2/L3 cache and consistent single-thread performance, and the D-series gives both. Standard_B2s (2 vCPU, 4 GB RAM) is fine for evaluation and dev. For datasets over a few GB, attach a Premium SSD data disk and move /var/lib/duckdb/main.duckdb onto it before populating the database.

Step 1: Deploy from the Azure Portal

Navigate to Marketplace in the Azure Portal, search for DuckDB, select the cloudimg publisher entry, and click Create.

On the Networking tab attach a network security group that allows inbound TCP 22 from your management IP range and TCP 80 from your application VPN, jump host, or an Azure Application Gateway / Front Door. Do not expose port 80 directly to the public internet without TLS — the UI uses HTTP basic auth which is fine over a private network or behind a TLS-terminating front door, but you should not send a basic-auth password over plain HTTP across the public internet.

Click Review + create, wait for validation, then Create. Deployment takes around two minutes — the duckdb-firstboot service rotates the UI password and writes the credentials file before the VM is reported as ready.

Step 2: Deploy from the Azure CLI

RG="duckdb-prod"
LOCATION="eastus"
VM_NAME="duckdb-01"
ADMIN_USER="azureuser"
GALLERY_IMAGE_ID="/subscriptions/<sub-id>/resourceGroups/azure-cloudimg/providers/Microsoft.Compute/galleries/cloudimgGallery/images/duckdb-1-5-ubuntu-24-04/versions/<version>"
SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"

az group create --name "$RG" --location "$LOCATION"

az network vnet create --resource-group "$RG" --name duckdb-vnet \
  --address-prefixes 10.74.0.0/16 --subnet-name duckdb-subnet --subnet-prefixes 10.74.1.0/24

az network nsg create --resource-group "$RG" --name duckdb-nsg

az network nsg rule create --resource-group "$RG" --nsg-name duckdb-nsg \
  --name allow-ssh --priority 100 \
  --source-address-prefixes "<your-mgmt-cidr>" \
  --destination-port-ranges 22 --access Allow --protocol Tcp

az network nsg rule create --resource-group "$RG" --nsg-name duckdb-nsg \
  --name allow-ui --priority 110 \
  --source-address-prefixes 10.74.0.0/16 \
  --destination-port-ranges 80 --access Allow --protocol Tcp

az vm create \
  --resource-group "$RG" --name "$VM_NAME" \
  --image "$GALLERY_IMAGE_ID" \
  --size Standard_D2s_v5 --storage-sku StandardSSD_LRS \
  --admin-username "$ADMIN_USER" --ssh-key-values "$SSH_KEY" \
  --vnet-name duckdb-vnet --subnet duckdb-subnet --nsg duckdb-nsg \
  --public-ip-sku Standard
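
The commands above contain placeholders such as <sub-id>, <version>, and <your-mgmt-cidr> that must be filled in before anything is created. A minimal sketch of a pre-flight check, assuming you keep those values in shell variables; check_placeholders and the MGMT_CIDR variable are hypothetical and not part of the Azure CLI:

```shell
# check_placeholders NAME=VALUE [NAME=VALUE...]
# Hypothetical helper: fails if any value still contains an unfilled <placeholder>.
check_placeholders() {
  status=0
  for pair in "$@"; do
    name=${pair%%=*}     # text before the first "="
    value=${pair#*=}     # text after the first "="
    case "$value" in
      *"<"*">"*)
        echo "unfilled placeholder in $name: $value" >&2
        status=1
        ;;
    esac
  done
  return "$status"
}

# Usage before running the az commands (MGMT_CIDR is an assumed variable name):
#   check_placeholders "GALLERY_IMAGE_ID=$GALLERY_IMAGE_ID" "MGMT_CIDR=$MGMT_CIDR" || exit 1
```

Running the check first avoids a partially created resource group when az vm create later rejects the image ID.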

Step 3: Connect via SSH

ssh azureuser@<vm-ip>

duckdb-ui.service and nginx.service will already be running, and duckdb-firstboot.service will already have rotated the UI basic-auth password and written the credentials file.

Step 4: Verify the DuckDB UI Service

sudo systemctl status duckdb-ui.service --no-pager

Expected: active (running). Confirm the firstboot sentinel:

sudo test -f /var/lib/cloudimg/duckdb-firstboot.done && echo FIRSTBOOT_DONE
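
The sentinel file reflects a common run-once pattern for first-boot tasks. The following is a minimal sketch of that pattern only, not the image's actual first-boot script; run_once and the command in the usage line are hypothetical:

```shell
# run_once SENTINEL COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if SENTINEL does not exist yet,
# and creates SENTINEL only after COMMAND succeeds.
run_once() {
  sentinel=$1
  shift
  if [ -f "$sentinel" ]; then
    return 0              # task already completed on an earlier boot
  fi
  "$@" || return 1        # leave no sentinel behind if the task failed
  mkdir -p "$(dirname "$sentinel")" && : > "$sentinel"
}

# Usage (rotate-ui-password is an assumed command name):
#   run_once /var/lib/cloudimg/duckdb-firstboot.done rotate-ui-password
```

Writing the sentinel only after success means a failed rotation is retried on the next boot instead of being silently skipped.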

Confirm the upstream UI listener on 127.0.0.1:4213 and the nginx front door on :80:

sudo ss -tln | grep -E ':(80|4213) '

Expected output:

LISTEN 0      511          0.0.0.0:80        0.0.0.0:*
LISTEN 0      5          127.0.0.1:4213      0.0.0.0:*
LISTEN 0      511             [::]:80           [::]:*
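
The important property in this output is that port 4213 is bound to 127.0.0.1 only, so the UI is reachable solely through nginx. A minimal sketch that checks this automatically, assuming the column layout shown above; ui_upstream_is_loopback_only is a hypothetical helper:

```shell
# Reads `ss -tln` output on stdin. Succeeds only if something is listening
# on port 4213 and every such listener is bound to 127.0.0.1.
# Column 4 is the local address:port in the layout shown above.
ui_upstream_is_loopback_only() {
  awk '
    $1 == "LISTEN" && $4 ~ /:4213$/ {
      found = 1
      if ($4 !~ /^127\.0\.0\.1:/) exposed = 1
    }
    END { exit (found && !exposed) ? 0 : 1 }
  '
}

# Usage:
#   sudo ss -tln | ui_upstream_is_loopback_only && echo UPSTREAM_PRIVATE
```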

The unauthenticated /healthz endpoint is for load balancer health checks:

curl -fsS http://127.0.0.1/healthz

Expected: ok.
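
When scripting against a freshly booted VM, the health check can be polled until it succeeds rather than run once. A minimal sketch; wait_until is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary choices:

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: retries COMMAND until it exits 0 or ATTEMPTS is exhausted.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the health endpoint for up to roughly one minute.
#   wait_until 30 2 curl -fsS http://127.0.0.1/healthz && echo READY
```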

Step 5: Retrieve the UI Basic-Auth Password

The UI password has been rotated on this specific virtual machine and written to a root-only file. Read it with:

sudo cat /stage/scripts/duckdb-credentials.log

You will see lines similar to:

# DuckDB 1.5 — Per-VM Credentials
# Generated: Fri May  1 08:07:07 UTC 2026
#
# Web UI:
#   URL:      http://<vm-ip>/
#   Username: duckdb
#   Password: <DUCKDB_UI_PASSWORD>
#
DUCKDB_UI_USER=duckdb
DUCKDB_UI_PASSWORD=<DUCKDB_UI_PASSWORD>
DUCKDB_UI_URL=http://<vm-ip>/
DUCKDB_DB_PATH=/var/lib/duckdb/main.duckdb
LISTEN_PORT=80
UI_UPSTREAM_PORT=4213

The DUCKDB_UI_URL line records the IP that Azure IMDS reported at first boot. For Standard SKU public IPs, IMDS does not expose the public address, so the file falls back to the private IP. When opening the UI from outside the VNet, use the VM's actual public IP (the one you SSH to) and make sure the NSG admits your source address on port 80; the allow-ui rule from Step 2 only admits traffic from inside the VNet. Store the password in your secret store.
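
To pull a single value out of the credentials file, for example to hand it to a secret store, the KEY=VALUE lines can be parsed without sourcing the file. A minimal sketch; read_cred is a hypothetical helper, and it reads only the uncommented KEY=VALUE lines:

```shell
# read_cred FILE KEY
# Hypothetical helper: prints the value of the first line that starts with KEY=.
# Pass "-" as FILE to read from stdin. Exits non-zero if KEY is absent.
read_cred() {
  awk -v k="$2" '
    index($0, k "=") == 1 {
      print substr($0, length(k) + 2)   # everything after "KEY="
      found = 1
      exit
    }
    END { exit found ? 0 : 1 }
  ' "$1"
}

# Usage (the file is root-only, so read it through sudo):
#   sudo cat /stage/scripts/duckdb-credentials.log | read_cred - DUCKDB_UI_PASSWORD
```

Parsing instead of sourcing keeps a value that contains shell metacharacters from being interpreted.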

Step 6: Open the DuckDB Web UI

Open http://<vm-ip>/ in your browser. nginx will challenge for HTTP basic auth — username duckdb, password from the credentials file. The DuckDB UI single-page app will load. The default layout shows:

  • A SQL editor taking up most of the left and centre of the screen with syntax highlighting and autocomplete
  • A schema browser on the right listing attached databases (main is the default catalog backed by /var/lib/duckdb/main.duckdb), schemas, tables, and columns
  • A query results pane below the editor that renders tables, plots, and query plans depending on which tab is active
  • A notebook-style cell layout if you prefer chained queries — each cell can hold its own SQL statement

The UI talks to the embedded HTTP server inside the running DuckDB process via WebSockets, so query results stream live as they execute. File uploads (Parquet, CSV, JSON, Excel, Arrow) become attached tables instantly when dropped into the upload area at the top of the schema browser.

Step 7: Run a SQL Probe in the Web UI

In the SQL editor, paste and execute:

SELECT version();
CREATE OR REPLACE TABLE hello AS SELECT range AS i, range * 2 AS d FROM range(1000);
SELECT count(*), sum(d) FROM hello;

The version cell will return v1.5.2, and the count/sum cell will return 1000 / 999000. The hello table is now persisted in /var/lib/duckdb/main.duckdb and survives reboots.
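
The expected numbers follow directly from the query: range(1000) yields i = 0 to 999, and d = 2 * i, so the sum is 2 * (999 * 1000 / 2) = 999000. The same arithmetic can be recomputed in the shell without DuckDB as a sanity check:

```shell
# Recompute the probe's expected results: i runs over 0..999 and d = 2 * i.
count=0
sum=0
i=0
while [ "$i" -lt 1000 ]; do
  count=$((count + 1))
  sum=$((sum + 2 * i))
  i=$((i + 1))
done
echo "$count $sum"   # prints: 1000 999000
```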

Step 8: CLI Access for Power Users

From SSH, you can drop into a DuckDB CLI session against any DuckDB file. Do not open /var/lib/duckdb/main.duckdb from the CLI while the UI service is running — DuckDB enforces a single-writer lock and the UI process holds it. Either stop the service first, or use a separate database file for CLI work:

# Use a scratch database — leaves the UI's main.duckdb alone
sudo -u duckdb /usr/local/bin/duckdb /tmp/scratch.duckdb \
  -c "INSTALL httpfs; LOAD httpfs;" \
  -c "SELECT count(*) AS rows FROM read_parquet('https://blobs.duckdb.org/data/taxi_2019_04.parquet') LIMIT 1;"

Expected output:

┌───────────┐
│   rows    │
│   int64   │
├───────────┤
│   7433139 │
└───────────┘

To open the same main.duckdb that the UI is using from the CLI, stop the service first:

sudo systemctl stop duckdb-ui.service
sudo -u duckdb /usr/local/bin/duckdb /var/lib/duckdb/main.duckdb -c "SHOW TABLES;"
sudo systemctl start duckdb-ui.service

Step 9: Read Parquet from Azure Blob Storage

The httpfs and azure extensions let DuckDB read Parquet, CSV, and JSON directly from Azure Blob Storage with no intermediate copy. From either the Web UI SQL editor or the CLI:

INSTALL azure;
LOAD azure;

-- For a public container no credentials are needed, but the storage account must be named in the URL:
SELECT count(*)
FROM read_parquet('azure://<acct>.blob.core.windows.net/<container>/<path>/*.parquet');

-- For a private container, set the connection string first:
CREATE SECRET azure_blob (
    TYPE AZURE,
    CONNECTION_STRING 'DefaultEndpointsProtocol=https;AccountName=<acct>;AccountKey=<key>;EndpointSuffix=core.windows.net'
);
SELECT count(*) FROM read_parquet('azure://<container>/<path>/*.parquet');

Or via Managed Identity (recommended for production):

CREATE SECRET azure_mi (
    TYPE AZURE,
    PROVIDER CREDENTIAL_CHAIN,
    ACCOUNT_NAME '<acct>'
);
SELECT count(*) FROM read_parquet('azure://<container>/<path>/*.parquet');

The credential chain provider walks Managed Identity, Azure CLI, environment variables, and Workload Identity; pick whichever fits your deployment. For Managed Identity, enable a system-assigned identity on the VM (the az vm create command in Step 2 does not do this) and grant that identity the Storage Blob Data Reader role on the target storage account.

Step 10: Back Up the Persistent Database

DuckDB's EXPORT DATABASE writes a portable directory of CSV or Parquet files plus schema.sql and load.sql scripts that recreate and reload the tables. Stop the UI service first to release the write lock, then export:

sudo systemctl stop duckdb-ui.service
sudo -u duckdb /usr/local/bin/duckdb /var/lib/duckdb/main.duckdb \
  -c "EXPORT DATABASE '/var/lib/duckdb/backup-$(date +%Y%m%d)' (FORMAT PARQUET);"
sudo systemctl start duckdb-ui.service
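
If the export fails partway through, the stop/export/start sequence above can leave the UI service stopped. A minimal sketch of a wrapper that always restarts the service and still reports the wrapped command's exit status; with_ui_stopped and the SYSTEMCTL override are hypothetical, the latter existing only so the helper can be dry-run:

```shell
# with_ui_stopped COMMAND [ARGS...]
# Hypothetical helper: stops duckdb-ui.service, runs COMMAND, then restarts
# the service whether or not COMMAND succeeded.
# SYSTEMCTL is an assumed override; it defaults to "sudo systemctl".
with_ui_stopped() {
  ctl=${SYSTEMCTL:-sudo systemctl}
  $ctl stop duckdb-ui.service || return 1
  rc=0
  "$@" || rc=$?                       # remember the command's exit status
  $ctl start duckdb-ui.service || return 1
  return "$rc"
}

# Usage:
#   with_ui_stopped sudo -u duckdb /usr/local/bin/duckdb /var/lib/duckdb/main.duckdb \
#     -c "EXPORT DATABASE '/var/lib/duckdb/backup-$(date +%Y%m%d)' (FORMAT PARQUET);"
```

The same wrapper fits the CLI access case in Step 8, where main.duckdb is opened while the service is stopped.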

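Restore from the export by running IMPORT DATABASE against a new database. It replays the generated schema.sql and then load.sql; running load.sql on its own fails because the tables do not exist yet:

sudo -u duckdb /usr/local/bin/duckdb /var/lib/duckdb/restored.duckdb \
  -c "IMPORT DATABASE '/var/lib/duckdb/backup-20260501';"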

For frequent backups, schedule the export via a systemd timer or push the backup directory to Azure Blob Storage with azcopy. DuckDB also supports ATTACH 'backup.duckdb' AS bk; COPY FROM DATABASE main TO bk; for whole-database file copies if you prefer a single artefact.

Step 11: Hardening — Put TLS in Front of the UI

The image ships nginx with HTTP basic auth on port 80. For internet-facing deployments, terminate TLS in front of the VM rather than reconfiguring nginx on the VM:

  • Azure Application Gateway with WAF v2: terminates TLS, serves a managed cert from Key Vault, and lets you add rate limiting and IP allow-listing without touching the VM.

  • Azure Front Door Standard / Premium: gives you a global anycast endpoint, managed cert, and WAF policy.

  • Your own caddy / nginx on a separate VM: reverse-proxy port 443 to the DuckDB VM's port 80 and handle the cert via Let's Encrypt.

If you must terminate TLS on the VM itself, install a real cert via certbot, change the listen 80; lines in /etc/nginx/sites-available/duckdb to listen 443 ssl;, add ssl_certificate and ssl_certificate_key directives, and add a 301-redirect server block on port 80.

Step 12: Rotate the UI Password

To rotate the UI basic-auth password without rebuilding the image:

NEW_PASS=$(openssl rand -hex 16)
sudo htpasswd -bc /etc/nginx/.duckdb-htpasswd duckdb "${NEW_PASS}"
sudo systemctl reload nginx
echo "New UI password: ${NEW_PASS}"

Update the password entries in /stage/scripts/duckdb-credentials.log to match. The htpasswd file is owned root:www-data with mode 0640. nginx reads it on each request, so the new password takes effect immediately and the reload is only a safeguard.
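
Keeping the credentials file in step with the new password can be scripted rather than edited by hand. A minimal sketch, assuming the KEY=VALUE layout shown in Step 5; set_cred is a hypothetical helper and must run as root because the file is mode 0600:

```shell
# set_cred FILE KEY VALUE
# Hypothetical helper: rewrites the line that starts with KEY= and leaves all
# other lines untouched. The result is written back through the existing file,
# so its owner and mode are preserved.
# Note: awk -v interprets backslash escapes in VALUE; hex passwords are unaffected.
set_cred() {
  tmp=$(mktemp) || return 1
  rc=0
  {
    awk -v k="$2" -v v="$3" '
      index($0, k "=") == 1 { print k "=" v; next }
      { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"
  } || rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Usage in a root shell, after the htpasswd step:
#   set_cred /stage/scripts/duckdb-credentials.log DUCKDB_UI_PASSWORD "$NEW_PASS"
```

The commented "#   Password:" line near the top of the file is informational only and is not changed by this helper.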

Support

cloudimg provides 24/7/365 expert technical support for this image. Reach out to support@cloudimg.co.uk for help with deployment, configuration, performance tuning, or any DuckDB-specific questions. We respond to non-critical tickets within 24 hours and to critical incidents within one hour on average.

For DuckDB-specific documentation, the upstream project's website at duckdb.org/docs is excellent — start with the SQL reference and the Python API. The DuckDB community Discord and GitHub Discussions are active and welcoming.