LocalAI, the free open source drop in replacement for the OpenAI API that runs local inference on your own hardware with no GPU required, preinstalled as a system service with a ready to use model, the OpenAI compatible REST API and the built in web UI published behind an authenticating nginx proxy. A unique API key and admin password are generated
Overview
LocalAI is a free, open source, self hosted alternative to the OpenAI API. It exposes the same OpenAI compatible REST API for chat completions, completions, embeddings, audio and image generation, but runs entirely on your own infrastructure with no external API calls and no GPU required. This image delivers LocalAI fully installed and configured as a system service, with a small instruct model pre pulled so the chat API and the web interface work the moment the instance boots.
Application Stack
The LocalAI single binary installed under /opt/localai and run by a dedicated unprivileged service account. The models directory stored on a dedicated data disk so your downloaded models are independently resizable and survive instance replacement. A systemd service that starts LocalAI on boot and restarts it on failure. An nginx reverse proxy that publishes the web UI and the OpenAI compatible API on port 80 behind HTTP Basic authentication.
OpenAI Compatible API
Point any OpenAI client library or tool at the instance and change only the base URL. LocalAI serves the chat completions, completions, embeddings, models and other OpenAI compatible endpoints, so existing applications work unchanged against models running on your own hardware. A small CPU friendly instruct model is pre pulled and ready, and you can install hundreds more from the built in gallery through the web UI or the model apply API.
Secure First Boot
On the first boot of your instance a one shot service generates a fresh LocalAI API key and a fresh admin password, both unique to that instance, wires the API key into the runtime and the password into the nginx credentials file, and writes them to a root only file. The inference server itself binds to loopback only and is never exposed without authentication. No shared or default credentials ship in the image.
Ready To Use
The web UI is served on port 80 through nginx. Sign in with the generated administrator credentials to chat with the pre pulled model, browse and install gallery models, and inspect the running backends. The OpenAI compatible API is served on the same port behind the same login plus the generated API key as a bearer token.
cloudimg Support
24/7 technical support by email and chat. Help with deployment, model installation and configuration, the OpenAI compatible API, prompt templates, embeddings, the model gallery, TLS and performance tuning.
Use Cases
A private, self hosted drop in for the OpenAI API. Running local language models with no data leaving your infrastructure. Powering chat, retrieval augmented generation and agent applications on your own hardware. Generating embeddings for search and similarity. Air gapped and compliance constrained inference.
All product and company names are trademarks or registered trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.