Ollama | Support by cloudimg

Machine Learning Free Trial Available

Overview

Ollama preinstalled for AWS with NVIDIA GPU acceleration. The easiest way to run open large language models (Llama, Mistral, Gemma, Qwen, DeepSeek) on Ubuntu 24.04 behind an nginx reverse proxy, gated by a unique password generated on first boot. Backed by 24/7 cloudimg support.

Description

## Ollama by cloudimg

Ollama is the easiest way to run open large language models locally. It downloads, quantizes and serves models such as Llama, Mistral, Gemma, Phi, Qwen and DeepSeek with a single command, exposing a REST API that is also OpenAI chat-completions compatible. This Amazon Machine Image delivers Ollama fully installed as a system service on an NVIDIA GPU instance, so a private, self-hosted LLM endpoint is running within minutes of launch. The release available is Ollama 0.30.

## GPU Accelerated

The NVIDIA datacenter driver is preinstalled and verified on real hardware, and Ollama auto-detects the GPU to offload model inference. Launch on a g4dn, g5 or g6 instance and your models run on the GPU out of the box.

## Secure First Boot

Ollama ships with no built-in authentication, so access is gated by HTTP Basic Authentication at an nginx reverse proxy, with a unique password generated for every instance on first boot and written to a root only file. No shared or default credentials ship in the image.

## Ready To Use

Pull a model, chat from the CLI, or call the REST and OpenAI-compatible endpoints from LangChain, LlamaIndex or any OpenAI SDK. A small starter model is pre-pulled and model weights live on a dedicated, resizable volume.

## cloudimg Support

cloudimg provides 24/7 technical support for this image, covering deployment, model selection, GPU sizing, quantization, the OpenAI-compatible API, TLS and scaling.

Key Features

  • Ollama, the easiest way to run open LLMs (Llama, Mistral, Gemma, Qwen, DeepSeek), preinstalled as a systemd service behind an nginx reverse proxy on Ubuntu 24.04 with an OpenAI-compatible REST API
  • GPU accelerated: NVIDIA datacenter driver preinstalled and verified on real hardware, with model inference offloaded to the GPU out of the box on g4dn, g5 and g6 instances
  • Secure by default: HTTP Basic Authentication with a unique password generated for every instance on first boot, plus 24/7 cloudimg support

Related Technologies

ollama llm local ai llama mistral gemma qwen deepseek openai compatible gpu cloudimg

Deploy on AWS

Launch this pre-configured AMI on AWS with 24/7 support from cloudimg.

View on AWS Marketplace

24/7 Support Included

Email: support@cloudimg.co.uk

Phone: (+44) 0333 006 4730

Product Details

Category
Machine Learning
Support
24/7, 365 days/year
Platform
AWS (Amazon Web Services)
Last Updated
2026-06-09