LocalAI 4.4.3 (MIT) on Ubuntu 24.04 LTS by cloudimg - the free, self-hosted drop-in replacement for the OpenAI API that runs local inference on commodity CPU with no GPU required. Pre-installed as a systemd service with a ready-to-use model, the OpenAI-compatible REST API and the built-in web UI behind an authenticating nginx proxy on port 80. Per-VM admin password on first boot; models on a dedicated Azure data disk. 24/7 cloudimg support.
## LocalAI on Ubuntu 24.04 LTS by cloudimg
LocalAI is a free, open-source, self-hosted drop-in replacement for the OpenAI API. It exposes the same OpenAI-compatible REST API for chat completions, completions and embeddings, but runs entirely on your own infrastructure with no external API calls and no GPU required, using the llama.cpp GGUF backend for CPU inference. The cloudimg image installs LocalAI 4.4.3 as a systemd service behind an nginx reverse proxy on port 80 with HTTP Basic auth, pre-pulls a small CPU-friendly instruct model so the chat API and web UI work the moment the VM boots, stores models on a dedicated Azure data disk, and generates a unique admin password on the first boot of every VM. Backed by 24/7 expert support.
OpenAI-Compatible API
Point any OpenAI client at the VM and change only the base URL. LocalAI serves chat completions, completions, embeddings and models, so existing applications work unchanged against models on your own hardware. A small instruct model (SmolLM2 135M) is pre-pulled and ready; install hundreds more from the built-in gallery.
CPU Inference, No GPU
Local inference runs on commodity CPU via the llama.cpp GGUF backend - no GPU required. Recommended size Standard_B4ms (4 vCPU / 16 GiB); scale up for larger models.
Dedicated Data Disk
The models directory lives on a dedicated, independently resizable Azure data disk mounted at /var/lib/localai, separate from the OS disk and re-provisioned with every VM.
Secure First Boot
A unique admin password and API key are generated on first boot and written to a root-only file. The inference server binds to loopback only and is never exposed without authentication; no shared credentials ship in the image.
Why Choose cloudimg?
* 24/7 Expert Support with guaranteed 24 hour response. Contact support@cloudimg.co.uk
* Production Ready from Launch Pre configured, security patched, and validated before publication
* Azure Native Integration Built with Azure Linux Agent, cloud init, and Gen2 Hyper V
What is Included
* LocalAI 4.4.3 (binary /usr/local/bin/local-ai) run by a dedicated localai system user
* The OpenAI-compatible REST API and built-in web UI on port 80 behind nginx HTTP Basic auth
* A pre-pulled SmolLM2 135M instruct model + the cpu-llama-cpp backend
* A dedicated Azure data disk at /var/lib/localai for the models
* A per-VM admin password and API key generated on first boot in a root-only file
* localai.service and nginx.service as systemd units, enabled and active
Use Cases
A self-hosted OpenAI-compatible API, private LLM inference with no data egress, embeddings for RAG, and a drop-in backend for existing OpenAI client applications.
LocalAI serves plain HTTP on port 80 - front it with TLS and your own domain before production.
Visit www.cloudimg.co.uk/guides/localai-on-ubuntu-24-04-azure for the full user guide.
LocalAI is a trademark of its respective owner. This image is produced by cloudimg and is not affiliated with or endorsed by the LocalAI project. LocalAI is distributed under the MIT License. All trademarks are the property of their respective holders.