dbt Core on Ubuntu 24.04 on Azure User Guide

Product: dbt Core on Ubuntu 24.04 LTS on Azure

Overview

dbt Core is the open-source data transformation framework that lets analytics engineers build, test and document data pipelines using SQL and Jinja. You write select statements as models; dbt handles dependency ordering, materialisation, testing and documentation, turning your raw data into trusted, version-controlled datasets. The cloudimg image installs dbt Core 1.9.4 into a Python virtualenv and exposes it system-wide as the dbt command, together with the self-contained dbt-duckdb adapter and a worked example project that builds, tests and documents a small data mart against an embedded DuckDB database, so no external data warehouse is required. This is a command-line product with no web UI. The image is backed by 24/7 cloudimg support.

What is included:

  • dbt Core 1.9.4 installed in a virtualenv at /opt/dbt/venv, on PATH as /usr/local/bin/dbt
  • The dbt-duckdb 1.9.3 adapter so the appliance runs end to end with no external warehouse
  • A worked example project at /opt/dbt/example (seeds, staging views, a mart and schema tests)
  • A per-user writable copy at ~/dbt-example for every login user
  • A login banner pointing at the example and the user guide
  • Ubuntu 24.04 LTS base with latest security patches at build time
  • 24/7 cloudimg support

Prerequisites

You need an active Azure subscription, an SSH key pair, and a VNet with a subnet in the target region. Standard_B2s (2 vCPU / 4 GiB RAM) is plenty for a dbt workstation or CI runner. In the NSG, allow inbound 22/tcp from your management network. The bundled DuckDB example needs no outbound network access; to manage a real warehouse, dbt connects out to your Snowflake / BigQuery / Redshift / Postgres endpoint over its standard port.
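
The inbound rule described above can be sketched with the Azure CLI. This is an illustration only: allow_ssh_from is a hypothetical helper, and the rule name, the priority of 1000 and the example CIDR are assumptions you should replace with values that fit your own NSG.

```shell
# Sketch: allow SSH (22/tcp) inbound from one management network.
# allow_ssh_from is a hypothetical helper; the rule name and priority are
# assumptions and must not clash with rules already present in your NSG.
# Arguments: $1 = resource group, $2 = NSG name, $3 = source CIDR
allow_ssh_from() {
  az network nsg rule create \
    --resource-group "$1" \
    --nsg-name "$2" \
    --name allow-ssh-mgmt \
    --priority 1000 \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes "$3" \
    --destination-port-ranges 22
}

# Example call (placeholders): allow_ssh_from <your-rg> <your-nsg> 203.0.113.0/24
```

Restricting the source prefix to a known range keeps the VM's only open port off the public internet.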

Step 1 - Deploy from the Azure Marketplace

Sign in to the Azure Portal, choose Create a resource, search the Marketplace for dbt Core by cloudimg, and select Create. On the Basics tab, pick your subscription, resource group, region and size; under Administrator account, choose SSH public key and paste your key; under Inbound port rules, allow SSH (22). Then select Review + create, followed by Create.

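Step 2 - Deploy from the Azure CLI

As an alternative to the Portal steps above, you can deploy the same image from the Azure CLI. Replace the placeholders in angle brackets with your own values: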

az vm create \
  --resource-group <your-rg> \
  --name dbt \
  --image <marketplace-image-urn> \
  --size Standard_B2s \
  --admin-username azureuser \
  --ssh-key-values ~/.ssh/id_ed25519.pub \
  --vnet-name <your-vnet> --subnet <your-subnet> \
  --public-ip-sku Standard

Step 3 - Connect to your VM

ssh azureuser@<vm-public-ip>
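
If you deployed with the CLI, the public IP used in the ssh command above can be looked up rather than copied from the Portal. A minimal sketch, assuming the VM is named dbt as in Step 2; vm_public_ip is a hypothetical helper name.

```shell
# Sketch: print the public IP of a VM using the Azure CLI.
# vm_public_ip is a hypothetical helper, not part of the image.
# Arguments: $1 = resource group, $2 = VM name
vm_public_ip() {
  az vm show \
    --resource-group "$1" \
    --name "$2" \
    --show-details \
    --query publicIps \
    --output tsv
}

# Example call (placeholders): ssh "azureuser@$(vm_public_ip <your-rg> dbt)"
```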

Step 4 - Confirm dbt is installed

dbt --version

It reports installed: 1.9.4 for dbt Core and lists the duckdb: 1.9.3 plugin.
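
For a CI runner it can help to turn this check into a pass/fail test. The sketch below matches the installed: line described above; dbt_version_is is a hypothetical helper, and the exact layout of the dbt --version output is assumed rather than guaranteed.

```shell
# Sketch: succeed only if `dbt --version` reports the expected Core version.
# dbt_version_is is a hypothetical helper. grep -F treats the version as a
# literal string, so the dots are not interpreted as regex wildcards.
# Argument: $1 = expected dbt Core version, e.g. 1.9.4
dbt_version_is() {
  dbt --version 2>&1 | grep -qF "installed: $1"
}

# Example call: dbt_version_is 1.9.4 && echo "dbt Core version OK"
```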

Step 5 - Get your writable copy of the example

Every login user gets their own writable copy of the worked example at ~/dbt-example. The DuckDB adapter writes a local database file, so the project must be user-writable. If the copy is missing, the first command below recreates it from the read-only reference at /opt/dbt/example:

[ -d ~/dbt-example ] || cp -r /opt/dbt/example ~/dbt-example
cd ~/dbt-example && dbt --version

The example ships with a profiles.yml whose dev target points at a local jaffle.duckdb, two CSV seeds, two staging models, a customer_orders mart and not_null / unique schema tests.

Step 6 - Load the seeds

dbt seed loads the bundled CSVs (raw_customers, raw_orders) into DuckDB:

cd ~/dbt-example && dbt seed

dbt reports Completed successfully with PASS=2.

Step 7 - Build the models

dbt run builds the staging views and the customer_orders mart in dependency order:

cd ~/dbt-example && dbt run

dbt creates stg_customers, stg_orders and the customer_orders table, reporting PASS=3.
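
Dependency ordering also applies when you build only part of the project. In dbt's node selection syntax, a + prefix selects a model together with everything upstream of it. A minimal sketch, where run_with_parents is a hypothetical helper name:

```shell
# Sketch: build one model plus all of its upstream dependencies.
# run_with_parents is a hypothetical helper; run it from the project directory.
# Argument: $1 = model name
run_with_parents() {
  dbt run --select "+$1"
}

# Example call: cd ~/dbt-example && run_with_parents customer_orders
```

For the example project this should build the staging views before the customer_orders mart, the same order a full dbt run uses.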

Step 8 - Test the data

dbt test runs the schema tests declared in models/marts/schema.yml:

cd ~/dbt-example && dbt test

The not_null and unique tests pass with PASS=3 WARN=0 ERROR=0.

Step 9 - Build everything in one command

dbt build runs seeds, models and tests together in dependency order - the command you would wire into CI:

cd ~/dbt-example && dbt build
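
One way to wire this into CI is a small wrapper that passes dbt's exit status back to the pipeline. This is a sketch under assumptions: ci_build is a hypothetical helper, the default project path is the per-user copy from Step 5, and --fail-fast is an optional choice that stops at the first failure.

```shell
# Sketch of a CI entry point. ci_build is a hypothetical helper.
# The body runs in a subshell, so the cd does not affect the caller.
# Argument: $1 = project directory (defaults to ~/dbt-example)
ci_build() (
  cd "${1:-$HOME/dbt-example}" || exit 1
  # dbt exits non-zero when a model or test fails; that status is
  # returned to the caller because this is the last command.
  dbt build --fail-fast
)

# Example call: ci_build ~/dbt-example || echo "dbt build failed" >&2
```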

Step 10 - Generate documentation and lineage

dbt can generate a documentation site and a lineage graph describing every model, seed and test:

cd ~/dbt-example && dbt docs generate

dbt writes target/catalog.json and target/manifest.json. Serve the docs site locally with dbt docs serve (binds to 127.0.0.1:8080; tunnel over SSH to view it), or publish the target/ artifacts to your own static host.
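
The SSH tunnel mentioned above can be sketched as a local port forward, run from your workstation rather than on the VM. docs_tunnel is a hypothetical helper, and azureuser is the admin username from Step 2; adjust it if you chose a different one.

```shell
# Sketch: forward a local port to the docs server on the VM.
# docs_tunnel is a hypothetical helper; run it on your workstation.
# Arguments: $1 = VM public IP, $2 = local port (defaults to 8080)
docs_tunnel() {
  # -N opens no remote shell; -L maps local port -> 127.0.0.1:8080 on the VM
  ssh -N -L "${2:-8080}:127.0.0.1:8080" "azureuser@$1"
}

# Example call (placeholder): docs_tunnel <vm-public-ip>
```

With the tunnel open, run dbt docs serve --no-browser in a separate SSH session on the VM and browse to http://127.0.0.1:8080 locally.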

Step 11 - Inspect your DuckDB results

The pipeline materialised a real customer_orders table you can query. List the project's models and confirm the data was built:

cd ~/dbt-example && dbt ls --resource-type model && ls -la jaffle.duckdb

You can also query the DuckDB file directly from Python using the virtualenv's interpreter (/opt/dbt/venv/bin/python, then import duckdb; duckdb.connect("jaffle.duckdb")), or with the DuckDB CLI if you have it installed.
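
Another way to look at the rows without leaving dbt is the dbt show command, which runs a model's query and prints a preview. A minimal sketch; preview_model is a hypothetical helper and the default of five rows is an arbitrary choice.

```shell
# Sketch: preview the first rows of a model with `dbt show`.
# preview_model is a hypothetical helper; run it from the project directory.
# Arguments: $1 = model name, $2 = row limit (defaults to 5)
preview_model() {
  dbt show --select "$1" --limit "${2:-5}"
}

# Example call: cd ~/dbt-example && preview_model customer_orders
```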

Step 12 - Connect to your own data warehouse

The DuckDB target makes the appliance self-contained, but dbt's real power is over your production warehouse. Replace the outputs block in profiles.yml with your warehouse connection and install the matching adapter into the virtualenv. For example, for Snowflake:

jaffle_duckdb:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: <your-account>
      user: <your-user>
      password: <your-password>
      role: TRANSFORMER
      database: ANALYTICS
      warehouse: TRANSFORMING
      schema: dbt
      threads: 8

Install the adapter and point dbt at the new target:

sudo /opt/dbt/venv/bin/pip install dbt-snowflake
cd ~/dbt-example && dbt run --target prod

The same pattern applies to dbt-bigquery, dbt-redshift and dbt-postgres.
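
Before the first run against a real warehouse, it is worth checking the connection and keeping the password out of profiles.yml. The sketch below assumes you have changed the password line in the profile to use dbt's env_var function; SNOWFLAKE_PASSWORD is an illustrative variable name and check_prod_connection is a hypothetical helper, neither of which the image defines.

```shell
# Sketch: refuse to run without the secret, then test the prod connection.
# Assumes profiles.yml contains:  password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
# SNOWFLAKE_PASSWORD and check_prod_connection are illustrative names.
check_prod_connection() {
  if [ -z "${SNOWFLAKE_PASSWORD:-}" ]; then
    echo "SNOWFLAKE_PASSWORD is not set" >&2
    return 1
  fi
  # dbt debug validates the profile and attempts a connection to the target
  dbt debug --target prod
}

# Example call: cd ~/dbt-example && check_prod_connection && dbt run --target prod
```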

Maintenance

  • Reference template: the read-only example lives at /opt/dbt/example; your writable copy is ~/dbt-example.
  • Virtualenv: dbt and its adapters live in /opt/dbt/venv; add adapters with sudo /opt/dbt/venv/bin/pip install dbt-<adapter>.
  • Upgrades: upgrade in place with sudo /opt/dbt/venv/bin/pip install --upgrade dbt-core dbt-duckdb.
  • Profiles: dbt looks for profiles.yml in the current working directory first and then in ~/.dbt/; set DBT_PROFILES_DIR (or pass --profiles-dir) to point it elsewhere.
  • Security patches: unattended-upgrades remains enabled so the OS continues to receive security updates automatically.
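
The Profiles note above can be sketched as a small helper that copies the example profile out of the project and points dbt at the new location. relocate_profiles is a hypothetical name and the default paths are assumptions. If the profile refers to the DuckDB file by a relative path, check that it still resolves to the file you expect after the move.

```shell
# Sketch: copy profiles.yml out of the project and export DBT_PROFILES_DIR.
# relocate_profiles is a hypothetical helper; it refuses to overwrite an
# existing profile in the destination directory.
# Arguments: $1 = project dir (default ~/dbt-example), $2 = target dir (default ~/.dbt)
relocate_profiles() {
  src="${1:-$HOME/dbt-example}/profiles.yml"
  dest_dir="${2:-$HOME/.dbt}"
  if [ ! -f "$src" ]; then
    echo "no profiles.yml at $src" >&2
    return 1
  fi
  if [ -e "$dest_dir/profiles.yml" ]; then
    echo "$dest_dir/profiles.yml already exists" >&2
    return 1
  fi
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/profiles.yml" || return 1
  # Exported for the current shell only; add it to ~/.profile to persist it.
  export DBT_PROFILES_DIR="$dest_dir"
}

# Example call: relocate_profiles && cd ~/dbt-example && dbt debug
```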

Support

cloudimg provides 24/7 expert support for this image. Contact support@cloudimg.co.uk.