Run Your Own Local AI with Ollama + Open WebUI on Proxmox
Ever wanted to have your own local AI assistant running right from your homelab?
Instead of relying on cloud services, you can set up a lightweight yet powerful AI environment inside Proxmox.
In this guide, I’ll show you how I deployed Ollama (for the models) and Open WebUI (for the interface) in separate LXC containers. Keeping them apart gives you a clean, modular setup that works even on modest hardware.
My Setup
- Hypervisor: Proxmox VE
- Container 1 (Ollama): Debian 13, 2 cores, 4GB RAM, 8GB swap
- Container 2 (Open WebUI): Debian 11, 1 core, 1GB RAM, 2GB swap (optional)
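If you prefer creating these containers from the Proxmox host shell, it looks roughly like this with the pct CLI. The VMIDs, template filenames, storage name, rootfs sizes, and network bridge are placeholders for my assumptions, so swap in your own:

```bash
# Run on the Proxmox host. VMIDs (200/201), storage (local-lvm), bridge (vmbr0)
# and the template filenames are placeholders -- adjust to your environment.

# Container 1: Ollama backend (Debian 13, 2 cores, 4GB RAM, 8GB swap)
pct create 200 local:vztmpl/debian-13-standard_13.1-1_amd64.tar.zst \
  --hostname ollama --cores 2 --memory 4096 --swap 8192 \
  --rootfs local-lvm:16 --net0 name=eth0,bridge=vmbr0,ip=dhcp --unprivileged 1

# Container 2: Open WebUI frontend (Debian 11, 1 core, 1GB RAM, 2GB swap)
pct create 201 local:vztmpl/debian-11-standard_11.7-1_amd64.tar.zst \
  --hostname open-webui --cores 1 --memory 1024 --swap 2048 \
  --rootfs local-lvm:8 --net0 name=eth0,bridge=vmbr0,ip=dhcp --unprivileged 1

pct start 200 && pct start 201
```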
Step 1: Deploy Ollama
Inside the first LXC container (Debian 13, 4GB RAM, 2 cores, 8GB swap), install Ollama:
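The install itself is a one-liner; the official script also registers Ollama as a systemd service:

```bash
# Inside the Ollama container
apt update && apt install -y curl

# Official install script; sets up the ollama CLI and systemd service
curl -fsSL https://ollama.com/install.sh | sh

# Make sure the API is up (it listens on port 11434 by default)
systemctl enable --now ollama
systemctl status ollama --no-pager
```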
Pull some lightweight models to test:
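The exact tags move around, so treat these as my best guess and double-check against ollama.com/library if a pull fails:

```bash
# Small models first -- these fit comfortably in 4GB RAM
ollama pull tinyllama
ollama pull gemma3:270m

# Slightly heavier, still CPU-friendly
ollama pull phi3:mini
ollama pull llama3.2
```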
You’ll also see an “Arena” entry in the model list later on; it’s there by default for pitting models against each other, so it isn’t something you pull.
Models I Installed
Here’s what I currently have on my Ollama LXC:
- Phi-3 Mini (Q4) → good reasoning while still lightweight
- Llama 3.2 (quantized) → higher accuracy, heavier
- Gemma 3 270M → ultra-light, runs even on very small RAM
- TinyLlama (latest) → small but useful for experiments
- Arena → included by default for benchmarking/comparison chats
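To confirm what’s actually on disk and how big each model is:

```bash
# List installed models with their size and tag
ollama list

# Inspect a single model's parameters, context window and license
ollama show phi3:mini
```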
This mix gives me flexibility:
- Use Gemma 3 270M or TinyLlama on very low resources.
- Use Phi-3 Mini for balanced reasoning.
- Use Llama 3.2 for better accuracy (but slower).
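In practice, switching is just a matter of which model name you pass to ollama run (or pick in Open WebUI later), for example:

```bash
# Ultra-light: fine even when RAM is tight
ollama run gemma3:270m "Explain LXC containers in two sentences."

# Heavier but more accurate; expect it to be noticeably slower on 2 CPU cores
ollama run llama3.2 "Explain LXC containers in two sentences."
```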
Step 2: Deploy Open WebUI
Inside the second container (Debian 11, 1GB RAM, 1 core), install and start Open WebUI:
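One Docker-free way to run it is the Python package. Note that Open WebUI’s pip release targets Python 3.11, which is newer than Debian 11’s default 3.9, so you’ll likely need a newer interpreter first (or fall back to the official Docker image). A rough sketch:

```bash
# Inside the Open WebUI container.
# Assumes a Python 3.11 interpreter is available as python3.11
# (Debian 11 ships 3.9, so this may come from backports, pyenv, or a source build).

# Keep Open WebUI and its many dependencies in their own virtual environment
python3.11 -m venv /opt/open-webui
/opt/open-webui/bin/pip install --upgrade pip
/opt/open-webui/bin/pip install open-webui

# Start the web interface; by default it listens on port 8080
/opt/open-webui/bin/open-webui serve
```

The install pulls in a lot of dependencies, so on 1GB of RAM it helps to have the swap from the next step enabled before running pip.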
Optional: enable swap to give it extra breathing room:
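Because this is an LXC container, swap is assigned from the Proxmox host rather than with a swapfile inside the guest. On the host (VMID 201 matches the earlier sketch and is a placeholder):

```bash
# On the Proxmox host: give the Open WebUI container 2GB of swap
pct set 201 --swap 2048

# Back inside the container: confirm the swap shows up
free -h
```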
Step 3: Connect Open WebUI to Ollama
- On the Ollama container, check that the API is running and reachable over the network (first half of the snippet below).
- On the Open WebUI container, point the frontend at the Ollama backend’s IP (second half of the snippet below).
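Here’s a sketch of both sides; 192.168.1.50 stands in for your Ollama container’s IP:

```bash
# --- On the Ollama container ---
# Ollama binds to 127.0.0.1:11434 by default. To accept requests from the
# Open WebUI container, make it listen on all interfaces via a systemd override.
mkdir -p /etc/systemd/system/ollama.service.d
cat <<'EOF' > /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
EOF
systemctl daemon-reload && systemctl restart ollama

# Sanity check: should return a JSON list of the installed models
curl http://127.0.0.1:11434/api/tags

# --- On the Open WebUI container ---
# Restart Open WebUI with the backend URL pointing at the Ollama container
export OLLAMA_BASE_URL=http://192.168.1.50:11434
/opt/open-webui/bin/open-webui serve
```

The Ollama URL can also be changed later from Open WebUI’s admin settings under Connections.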
Now the frontend can talk to the backend 🎉
Architecture Diagram
Browser → Open WebUI container (Debian 11) → Ollama API container (Debian 13, port 11434), all inside Proxmox VE.
Conclusion
By running Ollama and Open WebUI on separate LXC containers, I now have a local AI setup that’s:
- Lightweight (optimized for low resources)
- Modular (frontend and backend separated)
- Private (runs entirely in my Proxmox homelab)
This setup lets me experiment with models like Phi-3 Mini, Llama 3.2, Gemma 3 270M, and TinyLlama while keeping my system stable.
It’s a simple way to get started with local AI on homelab hardware, without needing a GPU or a lot of resources.