Qdrant 1.16 on Ubuntu 22.04 on Azure User Guide
Overview
This image runs Qdrant 1.16 as a single node vector similarity search engine on Ubuntu 22.04 LTS. Qdrant is an open source engine written in Rust, licensed under Apache 2.0, purpose built for AI and machine learning workloads where dense vector embeddings drive ranking. The storage engine uses HNSW graphs for approximate nearest neighbour search with metadata filtering applied during graph traversal, so selective queries stay fast even on collections with millions of points. The server is a single statically compiled Rust executable with no Java, no Python, and no other embedded runtime, so the image carries no CVE exposure from bundled language runtimes or interpreters.
Authentication is enforced with a single opaque api_key on every HTTP and gRPC call. On the very first boot of every deployed virtual machine, a unique 64 character hexadecimal key is generated, written to the Qdrant configuration file, and recorded in a single credentials file readable only by root. Two virtual machines launched from the same gallery image never share a key. Every API call must include either the api-key header or an Authorization: Bearer header with the same value; requests without it are rejected with HTTP 401.
The HTTP API listens on port 6333 and the gRPC API listens on port 6334. The same server process binds both ports. REST clients and the built in web dashboard go over 6333; production client SDKs in Python, Go, Rust, Java, and JavaScript typically use 6334 for throughput. The web dashboard is served from the HTTP port at /dashboard/ and bundles its own static assets — no separate web server to install. The static assets come from the upstream qdrant-web-ui project and are extracted into the VM's data directory at build time.
The image is intended for teams building retrieval augmented generation pipelines, semantic search, recommendation systems, or anomaly detection who want a production-ready single node vector store on day one, without spending hours on packaging, systemd plumbing, api_key bootstrap, or dashboard installation. It is not a replicated cluster, it is not TLS encrypted out of the box, and the dashboard assets bundled here are pinned at build time. Step 19 covers TLS termination and the security posture you want before putting customer traffic through the server.
The brand is lowercase cloudimg throughout this guide. All cloudimg URLs in this guide use the form https://www.cloudimg.co.uk.
Prerequisites
Before you deploy this image you need:
- A Microsoft Azure subscription where you can create resource groups, virtual networks, and virtual machines
- Azure role permissions equivalent to Contributor on the target resource group
- An SSH public key for first login to the admin user account
- A virtual network and subnet in the same region as the Azure Compute Gallery the image is published into, with an associated network security group
- The Azure CLI (az) version 2.50 or later installed locally if you intend to use the CLI deployment path in Step 2
- The cloudimg Qdrant offer enabled on your tenant in Azure Marketplace
Step 1: Deploy the Virtual Machine from the Azure Portal
Navigate to Marketplace in the Azure Portal, search for Qdrant, and select the cloudimg publisher entry. Click Create to begin the wizard.
On the Basics tab choose your subscription, target resource group, and region. The region must match the region your Azure Compute Gallery exposes the image in. Set the virtual machine name. Choose SSH public key as the authentication type, set the username to a name of your choice, and paste your SSH public key. Standard_D4s_v3 is the recommended starting size because the HNSW index lives in memory and vector search is memory intensive. A collection of one million 768 dimensional vectors needs roughly 3 GB of RAM just for the index, before payloads and query working set; the 16 GB RAM profile of D4s_v3 covers most production workloads comfortably. Scale up to Standard_D8s_v3 or Standard_E4s_v3 (memory optimised) for collections beyond a few million vectors or for high concurrent query loads.
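The sizing arithmetic above can be turned into a quick back-of-envelope estimator. This is a rule of thumb, not a vendor formula; the graph-overhead term assumes the default HNSW m of 16 and roughly two 8-byte links per edge:

```python
# Back-of-envelope RAM estimate for an in-memory HNSW collection.
# Assumptions (rules of thumb, not vendor figures): float32 vectors at
# 4 bytes per dimension, plus roughly m * 2 * 8 bytes of graph links
# per point with the default HNSW m of 16.
def estimate_ram_bytes(num_vectors: int, dim: int, m: int = 16) -> int:
    vector_bytes = num_vectors * dim * 4       # raw float32 vector storage
    graph_bytes = num_vectors * m * 2 * 8      # rough HNSW edge overhead
    return vector_bytes + graph_bytes

# One million 768-dimensional vectors, as in the sizing note above:
total = estimate_ram_bytes(1_000_000, 768)
print(f"{total / 1024**3:.2f} GiB")   # ~3.1 GiB before payloads and working set
```

Add headroom for payloads, the query working set, and the OS before picking a VM size.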
On the Disks tab the recommended OS disk type is Standard SSD. The Qdrant data directory lives at /var/lib/qdrant and stores vectors, HNSW graph segments, JSON payloads, and snapshots. If you expect to ingest more than a few gigabytes per day, attach a separate Premium SSD data disk now and follow Step 17 after the server is running to move the data directory across.
On the Networking tab select your existing virtual network and subnet. Attach a network security group that allows inbound TCP 22 from your management IP range, inbound TCP 6333 only from the virtual network CIDR or the specific application server subnets that need the REST API and dashboard, and optionally inbound TCP 6334 from the same subnets for the gRPC clients. Do not expose 6333 or 6334 to the public internet. The api_key is strong, but transport is plain HTTP unless you terminate TLS in front of the server as described in Step 19, and an exposed vector store with a scraped api_key is an indefinite compromise of whatever embeddings you have stored.
On the Management, Monitoring, and Advanced tabs the defaults are appropriate. Click Review + create, wait for validation to pass, then click Create. Deployment takes around two minutes.
Step 2: Deploy the Virtual Machine from the Azure CLI
If you prefer the command line, use the gallery image resource identifier as the source. The exact resource identifier is published on your Partner Center plan. A representative invocation:
RG="qdrant-prod"
LOCATION="eastus"
VM_NAME="qdrant-01"
ADMIN_USER="qdrantops"
GALLERY_IMAGE_ID="/subscriptions/<sub-id>/resourceGroups/azure-cloudimg/providers/Microsoft.Compute/galleries/cloudimgGallery/images/qdrant-1-16-ubuntu-22-04/versions/<version>"
SSH_KEY="$(cat ~/.ssh/id_rsa.pub)"
az group create --name "$RG" --location "$LOCATION"
az network vnet create \
--resource-group "$RG" \
--name qdrant-vnet \
--address-prefix 10.42.0.0/16 \
--subnet-name qdrant-subnet \
--subnet-prefix 10.42.1.0/24
az network nsg create --resource-group "$RG" --name qdrant-nsg
az network nsg rule create \
--resource-group "$RG" --nsg-name qdrant-nsg \
--name allow-ssh-mgmt --priority 100 \
--source-address-prefixes "<your-mgmt-cidr>" \
--destination-port-ranges 22 --access Allow --protocol Tcp
az network nsg rule create \
--resource-group "$RG" --nsg-name qdrant-nsg \
--name allow-qdrant-http --priority 110 \
--source-address-prefixes 10.42.0.0/16 \
--destination-port-ranges 6333 --access Allow --protocol Tcp
az network nsg rule create \
--resource-group "$RG" --nsg-name qdrant-nsg \
--name allow-qdrant-grpc --priority 120 \
--source-address-prefixes 10.42.0.0/16 \
--destination-port-ranges 6334 --access Allow --protocol Tcp
az vm create \
--resource-group "$RG" \
--name "$VM_NAME" \
--image "$GALLERY_IMAGE_ID" \
--size Standard_D4s_v3 \
--storage-sku StandardSSD_LRS \
--admin-username "$ADMIN_USER" \
--ssh-key-values "$SSH_KEY" \
--vnet-name qdrant-vnet --subnet qdrant-subnet \
--nsg qdrant-nsg \
--public-ip-address "" \
--os-disk-size-gb 32
The --public-ip-address "" flag keeps the server off the public internet. Use a bastion host or your existing private connectivity to reach it.
Step 3: Connect via SSH
After deployment, find the private IP of the new virtual machine. From a host inside the same virtual network:
ssh qdrantops@<vm-ip>
The first login may take a few seconds while cloud init finalises. Once you have a shell, the server has already been started by systemd and the first boot oneshot has already generated the per VM api_key, rewritten the Qdrant config, and restarted the server.
Step 4: Retrieve the API Key
The api_key is written by the qdrant-firstboot.service systemd oneshot the very first time the virtual machine boots. It lives in a single file, readable only by root:
sudo cat /stage/scripts/qdrant-credentials.log
You will see something like:
http_port=6333
grpc_port=6334
dashboard_path=/dashboard
api_key=62380337f58d2b9829ca533af44d31c6154d52182bea98ae98129c9b5c57ade1
sample_connect=curl -H "api-key: 62380337f58d..." http://localhost:6333/healthz
sample_dashboard=http://<vm-ip>:6333/dashboard
Every customer virtual machine deployed from this image has a different api_key. The key is an opaque 64 character hexadecimal credential, not a username and password pair. Treat it like a secret API token: paste it into your secrets manager, do not check it into version control, and rotate it before production as described in Step 14.
For the rest of this guide, commands use the environment variable $TOKEN in place of the raw api_key. Export it once per shell session:
TOKEN="$(sudo awk -F= '/^api_key=/ {print $2}' /stage/scripts/qdrant-credentials.log)"
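If you script against the credentials file from Python instead of the shell, a small parser plus a format sanity check catches truncated keys early. A sketch under the assumption that the log keeps its key=value layout; the sample string mirrors the output shown above:

```python
# Parse the key=value credentials log and sanity-check the api_key format.
# The sample string mirrors the log layout; a real script would read
# /stage/scripts/qdrant-credentials.log as root instead.
import re

sample = """http_port=6333
grpc_port=6334
dashboard_path=/dashboard
api_key=62380337f58d2b9829ca533af44d31c6154d52182bea98ae98129c9b5c57ade1
"""

creds = dict(line.split("=", 1) for line in sample.splitlines() if "=" in line)
api_key = creds["api_key"]
assert re.fullmatch(r"[0-9a-f]{64}", api_key), "expected 64 lowercase hex characters"
print("api_key OK:", api_key[:8] + "...")
```

Copying only part of the 64 character key is the most common cause of 401 errors, so the format check is worth keeping.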
Step 5: Server Components
| Component | Version | Purpose |
|---|---|---|
| Qdrant | 1.16.0 (printed by qdrant --version) | Rust vector similarity search engine: HNSW index, REST and gRPC APIs, built in web dashboard |
| qdrant-web-ui | Pinned at build time (static assets under /var/lib/qdrant/static) | Single page app for the web dashboard; browse collections, run searches, view metrics |
| Ubuntu | 22.04 LTS | Base operating system |
| systemd units | qdrant.service, qdrant-firstboot.service | Process supervision and first boot api_key generation |
The qdrant binary is at /usr/local/bin/qdrant, installed from the official musl release tarball at github.com/qdrant/qdrant/releases. Runtime configuration is a YAML file at /etc/qdrant/config.yaml. The firstboot oneshot runs with Type=oneshot and RemainAfterExit=yes, gated by the sentinel file /var/lib/qdrant/.firstboot-done, so it runs exactly once per virtual machine.
Step 6: Filesystem Layout
| Path | Owner | Purpose |
|---|---|---|
| /usr/local/bin/qdrant | root:root 0755 | Qdrant server binary (statically compiled Rust musl executable) |
| /etc/qdrant/config.yaml | root:qdrant 0640 | YAML config: ports, api_key, storage and snapshots paths, cluster flag |
| /etc/systemd/system/qdrant.service | root:root 0644 | Qdrant server unit, WorkingDirectory=/var/lib/qdrant |
| /etc/systemd/system/qdrant-firstboot.service | root:root 0644 | Oneshot that generates the api_key and rewrites config.yaml |
| /usr/local/sbin/qdrant-firstboot.sh | root:root 0750 | Firstboot script (invoked by the oneshot) |
| /usr/local/sbin/qdrant-start.sh, qdrant-stop.sh, setEnv.sh | root:root 0755 | Convenience wrappers around systemctl |
| /var/lib/qdrant/storage/ | qdrant:qdrant 0750 | Collection data: HNSW segments, vector blobs, payload storage |
| /var/lib/qdrant/snapshots/ | qdrant:qdrant 0750 | Snapshot files created via the /snapshots API |
| /var/lib/qdrant/static/ | qdrant:qdrant 0755 | qdrant-web-ui static assets; served at /dashboard/ |
| /var/lib/qdrant/.firstboot-done | qdrant:qdrant 0644 | Sentinel: presence means firstboot has already run |
| /stage/scripts/qdrant-credentials.log | root:root 0600 | Generated api_key, readable by root only |
Qdrant uses the local filesystem for all persistent state. Storage, snapshots, and the static assets for the web UI all live under /var/lib/qdrant. The systemd unit sets WorkingDirectory to /var/lib/qdrant so the server finds the static/ directory for the dashboard.
Step 7: Start, Stop, and Check Status
Use systemd directly:
sudo systemctl start qdrant
sudo systemctl stop qdrant
sudo systemctl restart qdrant
sudo systemctl status qdrant
Or the helper scripts:
sudo /usr/local/sbin/qdrant-start.sh
sudo /usr/local/sbin/qdrant-stop.sh
Tail the server logs via the journal:
sudo journalctl -u qdrant -n 50 --no-pager
The firstboot oneshot is separate. It runs exactly once on the customer's first boot and should show as active (exited) from then on:
sudo systemctl status qdrant-firstboot
Step 8: Verify the Server is Healthy
The HTTP /healthz endpoint returns a short text body when the server is ready. Authentication is required:
curl -H "api-key: $TOKEN" http://localhost:6333/healthz
Expected output:
healthz check passed
Or query the root endpoint for the running version:
curl -H "api-key: $TOKEN" http://localhost:6333/
Expected output (indented for readability):
{"title":"qdrant - vector search engine","version":"1.16.0","commit":"..."}
A successful response confirms the HTTP listener is bound, auth is enforced, and the query engine is initialised. The gRPC listener on 6334 is bound by the same process — if 6333 is healthy, 6334 is too.
Step 9: Open the Web Dashboard
The web dashboard is served from the HTTP API on port 6333 at the path /dashboard/ (note the trailing slash). It is a single page application that speaks the same REST API as any other client, and it prompts for the api_key on load.
From your workstation (once port 6333 is reachable from the subnet you are on):
http://<vm-ip>:6333/dashboard/
The dashboard surface includes a collections browser, an interactive search builder, a console for sending arbitrary REST requests, a cluster status page, and a metrics view. Paste the api_key into the key prompt; the browser caches it in session storage so you do not have to paste it every page load. For production use you should put the server behind a reverse proxy as described in Step 19 so the dashboard is served over TLS.
Step 10: Create a Collection
Before you can store vectors, create a collection and tell Qdrant the vector dimensionality and distance metric. The metric matches how your embedding model was trained: Cosine for most transformer based sentence embeddings, Euclidean for embeddings that preserve L2 distance, Dot for embeddings normalised such that dot product is the similarity you want.
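To see why the metric must match the embedding model, compare the three metrics on a toy pair of vectors pointing the same way but with different lengths. This is plain illustration code, not Qdrant internals:

```python
# Toy comparison of the three distance metrics on vectors that share a
# direction but differ in magnitude. Cosine ignores magnitude; Dot does not.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

v, w = [1.0, 0.0], [2.0, 0.0]   # same direction, different magnitude
print(cosine(v, w))     # 1.0  -> identical under Cosine
print(dot(v, w))        # 2.0  -> magnitude leaks into Dot similarity
print(euclidean(v, w))  # 1.0  -> a distance, so lower means closer
```

With the metric chosen, create the collection: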
curl -X PUT "http://localhost:6333/collections/articles" \
-H "api-key: $TOKEN" \
-H "Content-Type: application/json" \
-d '{"vectors":{"size":4,"distance":"Cosine"}}'
Expected output:
{"result":true,"status":"ok","time":0.27}
We use size 4 in this walkthrough so the examples fit on one screen. A realistic collection for sentence-transformers/all-MiniLM-L6-v2 uses size 384, OpenAI text-embedding-3-small uses 1536, Cohere embed-v3 uses 1024, and so on. Confirm the collection was created:
curl -H "api-key: $TOKEN" http://localhost:6333/collections
Expected output:
{"result":{"collections":[{"name":"articles"}]},"status":"ok","time":3.9e-06}
Inspect the full configuration — Qdrant exposes the HNSW parameters (m, ef_construct, full_scan_threshold), the optimiser thresholds, and the write-ahead log capacity — all tunable per collection:
curl -H "api-key: $TOKEN" http://localhost:6333/collections/articles | head -c 500
Step 11: Insert Vectors with Payloads
Points in Qdrant carry three things: an id, a vector, and an optional JSON payload. Payload fields can be filtered during search, and you can create payload indexes on frequently filtered fields to keep selective queries fast. A realistic insert mixes titles, categories, timestamps, and any other metadata you might want to condition ranking on:
curl -X PUT "http://localhost:6333/collections/articles/points?wait=true" \
-H "api-key: $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"points": [
{"id": 1, "vector": [0.10, 0.20, 0.30, 0.40], "payload": {"title": "Azure VM pricing", "category": "cloud"}},
{"id": 2, "vector": [0.15, 0.25, 0.35, 0.45], "payload": {"title": "GCP Compute Engine sizing", "category": "cloud"}},
{"id": 3, "vector": [0.80, 0.10, 0.05, 0.05], "payload": {"title": "Sourdough bread recipe", "category": "cooking"}}
]
}'
Expected output:
{"result":{"operation_id":1,"status":"completed"},"status":"ok","time":0.004}
The wait=true query parameter makes the request block until the write is durable and visible to subsequent queries. For high throughput batch loads, drop wait=true and batch points in groups of a few hundred to a few thousand per request.
In production, vectors come from your embedding pipeline rather than being typed by hand. The pattern for a 384 dimensional vector from a MiniLM model looks like:
{"id": 42, "vector": [0.023, -0.117, 0.058, -0.041, 0.091, ...380 more floats..., -0.006], "payload": {"title": "Kubernetes RBAC primer", "category": "infra"}}
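The batching advice above can be sketched as a simple chunking helper. Only the chunking logic is shown; sending each batch is left to whichever client you use, and the sizes here are illustrative:

```python
# Split a large list of points into fixed-size chunks; each chunk becomes
# one upsert request body. Sending the request is left to your client.
from typing import Iterator, List

def chunked(points: List[dict], size: int = 500) -> Iterator[List[dict]]:
    for start in range(0, len(points), size):
        yield points[start:start + size]

points = [{"id": i, "vector": [0.0, 0.0, 0.0, 0.0]} for i in range(1234)]
batches = list(chunked(points, 500))
print(len(batches), [len(b) for b in batches])   # 3 [500, 500, 234]
```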
Step 12: Search for Similar Vectors
Nearest-neighbour search ranks points in the collection by similarity to a query vector under the collection's distance metric. The query vector must be the same dimensionality as the collection:
curl -X POST "http://localhost:6333/collections/articles/points/search" \
-H "api-key: $TOKEN" \
-H "Content-Type: application/json" \
-d '{"vector":[0.10,0.20,0.30,0.40],"limit":3,"with_payload":true}'
Expected output:
{"result":[
{"id":1,"version":1,"score":1.0, "payload":{"title":"Azure VM pricing", "category":"cloud"}},
{"id":2,"version":1,"score":0.9979655, "payload":{"title":"GCP Compute Engine sizing", "category":"cloud"}},
{"id":3,"version":1,"score":0.3045457, "payload":{"title":"Sourdough bread recipe", "category":"cooking"}}
],"status":"ok","time":0.00089}
The query vector is identical to point 1, so point 1 scores 1.0 (a perfect Cosine match). Point 2 has a very similar vector and scores 0.998. Point 3's vector leans in a different direction and scores 0.30; the bread recipe lands at the bottom of the results for a cloud query, which is exactly what you want.
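Those scores are not magic numbers; you can reproduce them from the Cosine definition. Tiny last-digit differences can appear because the server stores vectors as float32:

```python
# Recompute the Cosine scores from the search response by hand.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = [0.10, 0.20, 0.30, 0.40]
print(round(cosine(query, [0.10, 0.20, 0.30, 0.40]), 5))  # 1.0
print(round(cosine(query, [0.15, 0.25, 0.35, 0.45]), 5))  # 0.99797
print(round(cosine(query, [0.80, 0.10, 0.05, 0.05]), 5))  # 0.30455
```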
Combine similarity with payload filters during graph traversal (not as a post filter), so selective queries stay fast. Restrict to the cloud category:
curl -X POST "http://localhost:6333/collections/articles/points/search" \
-H "api-key: $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"vector": [0.10, 0.20, 0.30, 0.40],
"limit": 5,
"with_payload": true,
"filter": {"must": [{"key": "category", "match": {"value": "cloud"}}]}
}'
Expected output:
{"result":[
{"id":1,"version":1,"score":1.0, "payload":{"title":"Azure VM pricing", "category":"cloud"}},
{"id":2,"version":1,"score":0.9979655, "payload":{"title":"GCP Compute Engine sizing", "category":"cloud"}}
],"status":"ok","time":0.00088}
Only the two cloud articles come back; the bread recipe is filtered out. Filter conditions compose with must, should, and must_not, and support range, match, geo, and nested-key conditions — see the Qdrant REST API reference for the full grammar.
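The composition rules can be seen in a single request body. A hedged sketch; the extra payload fields (published_year, archived) are illustrative and not part of the dataset above:

```python
# Compound filter sketch: must AND-combines, should boosts OR-matches,
# must_not excludes, and range filters numeric payload fields.
import json

body = {
    "vector": [0.10, 0.20, 0.30, 0.40],
    "limit": 5,
    "with_payload": True,
    "filter": {
        "must": [
            {"key": "category", "match": {"value": "cloud"}},
            {"key": "published_year", "range": {"gte": 2020}},
        ],
        "should": [{"key": "title", "match": {"value": "Azure VM pricing"}}],
        "must_not": [{"key": "archived", "match": {"value": True}}],
    },
}
print(json.dumps(body, indent=2))
```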
Step 13: Connect from Python with the qdrant-client SDK
Most AI and ML applications use Python. The official qdrant-client library works identically against the REST or gRPC endpoint; gRPC is preferred in production for throughput. Install once on your application host:
pip install qdrant-client
Then connect and search:
from qdrant_client import QdrantClient
from qdrant_client.http.models import PointStruct, Filter, FieldCondition, MatchValue
client = QdrantClient(
host="<vm-ip>",
port=6333, # HTTP; use grpc_port=6334 and prefer_grpc=True for gRPC
api_key="<your-token>",
)
# Upsert
client.upsert(
collection_name="articles",
points=[
PointStruct(id=1, vector=[0.10, 0.20, 0.30, 0.40], payload={"title": "Azure VM pricing", "category": "cloud"}),
PointStruct(id=2, vector=[0.15, 0.25, 0.35, 0.45], payload={"title": "GCP Compute Engine sizing", "category": "cloud"}),
PointStruct(id=3, vector=[0.80, 0.10, 0.05, 0.05], payload={"title": "Sourdough bread recipe", "category": "cooking"}),
],
wait=True,
)
# Search with filter
results = client.search(
collection_name="articles",
query_vector=[0.10, 0.20, 0.30, 0.40],
limit=5,
query_filter=Filter(must=[FieldCondition(key="category", match=MatchValue(value="cloud"))]),
with_payload=True,
)
for r in results:
print(r.id, r.score, r.payload["title"])
For retrieval augmented generation, pair this with your embedding provider of choice:
# OpenAI embedding + Qdrant retrieval
from openai import OpenAI
openai = OpenAI()
query_text = "How do I right-size an Azure VM for vector search?"
vec = openai.embeddings.create(model="text-embedding-3-small", input=query_text).data[0].embedding
context = client.search(collection_name="articles", query_vector=vec, limit=5, with_payload=True)
Client SDKs in Go, Rust, Java, .NET, and JavaScript follow the same shape. Transport defaults vary by client: the Go and Rust clients use gRPC on 6334, while the Python client defaults to HTTP on 6333 unless you pass prefer_grpc=True, so check your client's defaults before locking down ports.
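For retrieval augmented generation, the usual next step is folding the retrieved payloads into an LLM prompt. A minimal sketch with mocked hit objects; the Hit dataclass here is a stand-in for the objects returned by client.search(), not the SDK's result type:

```python
# Fold retrieved payloads into an LLM prompt. Hit is a mock stand-in for
# the search result objects, used so the sketch is self-contained.
from dataclasses import dataclass

@dataclass
class Hit:
    id: int
    score: float
    payload: dict

hits = [
    Hit(1, 0.998, {"title": "Azure VM pricing", "category": "cloud"}),
    Hit(2, 0.994, {"title": "GCP Compute Engine sizing", "category": "cloud"}),
]

context = "\n".join(f"- {h.payload['title']} (score {h.score:.3f})" for h in hits)
prompt = (
    "Answer using only these sources:\n"
    f"{context}\n\n"
    "Question: How do I right-size an Azure VM for vector search?"
)
print(prompt)
```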
Step 14: Rotate the API Key
The api_key does not expire on its own. Rotate the firstboot-generated key before production, and then on whatever schedule your security policy requires. Rotation is a single config edit plus a service restart:
# 1. Generate a new key (or bring your own from your secrets manager)
NEW_KEY="$(openssl rand -hex 32)"
echo "New key: $NEW_KEY"
# 2. Rewrite config.yaml
sudo sed -i "s/^\(\s*api_key:\s*\).*/\1$NEW_KEY/" /etc/qdrant/config.yaml
# 3. Restart the server so the new key takes effect
sudo systemctl restart qdrant
# 4. Update /stage/scripts/qdrant-credentials.log so future admin scripts see the new value
sudo sed -i "s/^api_key=.*/api_key=$NEW_KEY/" /stage/scripts/qdrant-credentials.log
# 5. Verify: the old key should now return 401 and the new key 200
curl -s -o /dev/null -w "%{http_code}\n" -H "api-key: $TOKEN" http://localhost:6333/healthz
curl -s -o /dev/null -w "%{http_code}\n" -H "api-key: $NEW_KEY" http://localhost:6333/healthz
Every client that cached the old key needs its configuration updated; do the rollout in a window where you can wait for all application servers to pick up the new value, or put a reverse proxy in front that injects the key from a secrets manager so clients never see it directly.
Step 15: Snapshots and Backup
Qdrant snapshots are point-in-time copies of a collection's state, written into the snapshots directory. Trigger one via the API:
curl -X POST "http://localhost:6333/collections/articles/snapshots" \
-H "api-key: $TOKEN"
Expected output (abbreviated):
{"result":{"name":"articles-abc123.snapshot","creation_time":"2026-04-19T18:00:00"},"status":"ok","time":0.1}
Snapshots live under /var/lib/qdrant/snapshots/. List them via the API:
curl -H "api-key: $TOKEN" http://localhost:6333/collections/articles/snapshots
Restore by pointing qdrant at a snapshot file at startup (the CLI exposes --snapshot <path>:<collection>); this is usually scripted into a DR runbook rather than run interactively.
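Snapshots accumulate on disk until you delete them. A hedged housekeeping sketch that keeps only the newest few snapshot files; the retention count is an assumption rather than a product default, and the demo runs against a temporary directory instead of /var/lib/qdrant/snapshots:

```python
# Keep only the newest `keep` *.snapshot files in a directory, sorted by
# modification time. Demoed against a temp directory for safety.
import os
import tempfile
from pathlib import Path

def prune_snapshots(directory: Path, keep: int = 5) -> list:
    snaps = sorted(directory.glob("*.snapshot"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    removed = [p.name for p in snaps[keep:]]
    for p in snaps[keep:]:
        p.unlink()
    return removed

with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    for i in range(7):
        p = root / f"articles-{i}.snapshot"
        p.write_text("stub")
        os.utime(p, (i, i))            # distinct mtimes, oldest first
    removed = prune_snapshots(root, keep=5)
    print(sorted(removed))             # the two oldest files are pruned
```

Run something like this from cron after copying fresh snapshots off the VM, so backups never fill the data disk.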
Step 16: Reconfigure at Runtime
Qdrant's runtime configuration is the YAML file at /etc/qdrant/config.yaml. Edit in place, then restart:
sudo vi /etc/qdrant/config.yaml
sudo systemctl restart qdrant
The most common reconfigurations are:
- service.api_key: rotate the api_key (see Step 14)
- service.host: change the bind address (leave as 0.0.0.0 unless you want to bind only to a specific IP)
- storage.on_disk_payload: keep payloads in RAM (false, faster, uses more memory) or on disk (true, the default on this image, lower memory)
- log_level: INFO by default; set to DEBUG for troubleshooting and back to INFO for production
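For orientation, a hedged sketch of what /etc/qdrant/config.yaml looks like on this image. Key names follow upstream Qdrant conventions; the exact file shipped may contain more settings:

```yaml
log_level: INFO

service:
  host: 0.0.0.0
  http_port: 6333
  grpc_port: 6334
  api_key: <64-hex-key-written-at-firstboot>

storage:
  storage_path: /var/lib/qdrant/storage
  snapshots_path: /var/lib/qdrant/snapshots
  on_disk_payload: true
```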
To change anything systemd itself controls (user, ulimit, resource constraints, additional env vars) use a drop in override rather than editing the shipped unit:
sudo systemctl edit qdrant
An editor opens on an empty drop in file at /etc/systemd/system/qdrant.service.d/override.conf. Add only the sections you want to change, for example:
[Service]
LimitNOFILE=131072
Environment="RUST_LOG=debug"
Save and exit; systemd reloads and applies the override on the next restart.
Step 17: Move Data to an Attached Premium Disk
For production workloads larger than the OS disk can comfortably hold, attach a Premium SSD data disk and move the Qdrant data directory across.
# 1. Identify the new block device
lsblk
# 2. Format and mount it
sudo mkfs.ext4 /dev/sdc
sudo mkdir -p /data
sudo mount /dev/sdc /data
sudo blkid /dev/sdc # copy the UUID into /etc/fstab for persistence
# 3. Stop Qdrant and move the data directory
sudo systemctl stop qdrant
sudo rsync -aAX /var/lib/qdrant/ /data/qdrant/
sudo chown -R qdrant:qdrant /data/qdrant
sudo mv /var/lib/qdrant /var/lib/qdrant.bak
# 4. Point config.yaml at the new location
sudo sed -i 's|storage_path: /var/lib/qdrant/storage|storage_path: /data/qdrant/storage|' /etc/qdrant/config.yaml
sudo sed -i 's|snapshots_path: /var/lib/qdrant/snapshots|snapshots_path: /data/qdrant/snapshots|' /etc/qdrant/config.yaml
# 5. Update WorkingDirectory so the dashboard keeps working
sudo systemctl edit qdrant --full # change WorkingDirectory= line to /data/qdrant
# 6. Restart and verify
sudo systemctl start qdrant
sudo journalctl -u qdrant -n 50 --no-pager
Only after the server has started cleanly, queries return correct results, and the dashboard loads should you delete /var/lib/qdrant.bak.
Step 18: Troubleshooting
Cannot connect on 6333 or 6334 from another host
Verify the server is bound to 0.0.0.0 rather than 127.0.0.1: sudo ss -tlnp | grep -E ':(6333|6334) '. Both ports should show 0.0.0.0. If only 127.0.0.1 is shown, check that service.host: 0.0.0.0 is set in /etc/qdrant/config.yaml and restart. If the server is bound correctly, the problem is your Azure NSG; confirm inbound TCP 6333 and 6334 are allowed from the caller's CIDR.
Request returns 401 "Must provide an API key or an Authorization bearer token"
The api_key is missing, malformed, or the wrong value. Re-read /stage/scripts/qdrant-credentials.log to confirm the current key, and be sure the api-key: <value> header is present on every request. The key is a 64 character hex string; copying only part of it is a common mistake.
Collection create returns {"status":{"error":"..."}}
The collection already exists, the vector size conflicts with a previous definition, or the JSON body is malformed. Delete the collection and recreate it with the new definition:
curl -X DELETE "http://localhost:6333/collections/articles" -H "api-key: $TOKEN"
Search returns zero results or lower scores than expected
Three usual causes. First, a dimensionality mismatch between the collection and the query vector; the two must match exactly. Second, a distance metric mismatch between how the vectors were embedded and how the collection was created; Cosine on raw unnormalised embeddings will not do what you expect. Third, the write has not yet been absorbed into the HNSW index; writes are durable immediately, but query performance degrades briefly until the indexer catches up. For small collections below the HNSW full_scan_threshold, searches run as a full scan anyway and freshness is immediate.
Memory grows faster than expected
The HNSW index lives in memory. The rough memory cost is num_vectors * vector_dim * 4 bytes + a small overhead for the graph edges. A collection with 1 million 768 dimensional vectors needs roughly 3 GB for vectors and another 300 MB for the HNSW graph. If memory is a concern, set on_disk: true on the HNSW config and memmap_threshold to a sensible value so cold segments spill to disk.
Dashboard shows blank page after correct api_key entry
Clear browser session storage for the site and reload. If this persists, check sudo journalctl -u qdrant -n 200 --no-pager for errors about static asset paths; the dashboard assets live at /var/lib/qdrant/static/ and must be readable by the qdrant user.
Qdrant refuses to start after a config edit
Run sudo systemctl status qdrant and look at the last journal lines. A common cause is YAML syntax in config.yaml that makes it unparseable; fix the file (beware tabs vs spaces) and systemctl restart qdrant.
Step 19: Security Recommendations
Never run this server without api_key authentication. The key is generated at firstboot and authentication is enforced from that moment; rotating the key before production is recommended but the auth guard is already active out of the box.
Restrict ports 6333 and 6334 in the NSG to your application subnets. The public internet has no business reaching an unproxied vector store.
Terminate TLS in front of the server by installing nginx or Caddy on the same VM listening on 443, forwarding to 127.0.0.1:6333 (and 127.0.0.1:6334 for gRPC via HTTP/2 if clients need it), and letting Certbot or Caddy's automatic TLS keep a valid certificate in place. The Qdrant HTTP listener is plain HTTP; do not expose it directly to anything outside the VM over a network you do not trust.
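A minimal nginx sketch of that arrangement. The server name and certificate paths are placeholders, and Certbot typically manages the ssl_* lines for you:

```nginx
server {
    listen 443 ssl;
    server_name qdrant.example.internal;

    ssl_certificate     /etc/letsencrypt/live/qdrant.example.internal/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/qdrant.example.internal/privkey.pem;

    location / {
        # Forward to the plain-HTTP Qdrant listener on loopback only.
        proxy_pass http://127.0.0.1:6333;
        proxy_set_header Host $host;
        # Optionally inject the api_key here so clients never hold it:
        # proxy_set_header api-key <value-from-your-secrets-manager>;
    }
}
```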
Use separate api_keys per application wherever possible. Qdrant 1.16 supports a single api_key on the server. For per-application credentials, place a reverse proxy in front that injects the shared key and authenticates each application with its own credential (basic auth, mTLS, OAuth2 introspection). This is the standard way to layer per-application access control on top of Qdrant Community edition.
Rotate the api_key periodically. There is no automatic rotation; put it on your security calendar.
Keep the system patched. The base Ubuntu 22.04 APT sources stay configured; sudo apt-get update && sudo apt-get upgrade regularly to pick up security updates. Upstream Qdrant and qdrant-web-ui versions are pinned at image build time; to upgrade, redeploy from a newer cloudimg gallery image version.
Step 20: Support and Licensing
cloudimg provides 24/7/365 expert technical support for this image. Guaranteed response within 24 hours; one hour average for critical issues. Contact support@cloudimg.co.uk.
Qdrant is licensed under the Apache 2.0 licence by Qdrant Solutions GmbH. qdrant-web-ui is licensed under the Apache 2.0 licence by the same publisher. This image is a repackaged upstream distribution provided by cloudimg; additional charges apply for the pre configured image, ongoing maintenance, and the 24/7 support contract.
Visit www.cloudimg.co.uk/guides/qdrant-1-16-on-ubuntu-22-04-azure for the published version of this guide.
Qdrant is a trademark of Qdrant Solutions GmbH. This image is a repackaged upstream distribution provided by cloudimg. Additional charges apply for build, maintenance, and 24/7 support.