Run Your Own Local AI with Ollama + Open WebUI on Proxmox

Ever wanted to have your own local AI assistant running right from your homelab?
Instead of relying on cloud services, you can set up a lightweight yet powerful AI environment inside Proxmox.

In this guide, I’ll show you how I deployed Ollama (the model backend) and Open WebUI (the web interface) in separate LXC containers. This way, you get a clean, modular setup that works even on modest hardware.





My Setup

  • Hypervisor: Proxmox VE

  • Container 1 (Ollama): Debian 13, 2 cores, 4GB RAM, 8GB swap

  • Container 2 (Open WebUI): Debian 11, 1 core, 1GB RAM, 2GB swap (optional)
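
If you prefer the command line over the Proxmox GUI, here is a minimal sketch of how two containers with these specs could be created with pct. The container IDs, template file names, storage name, and rootfs sizes are assumptions; substitute whatever templates and storage you actually have on your host.

# On the Proxmox host -- IDs, template names, storage and disk sizes are placeholders
pct create 201 local:vztmpl/debian-13-standard_13.0-1_amd64.tar.zst \
  --hostname ollama --cores 2 --memory 4096 --swap 8192 \
  --rootfs local-lvm:20 --net0 name=eth0,bridge=vmbr0,ip=dhcp --unprivileged 1

pct create 202 local:vztmpl/debian-11-standard_11.7-1_amd64.tar.zst \
  --hostname open-webui --cores 1 --memory 1024 --swap 2048 \
  --rootfs local-lvm:8 --net0 name=eth0,bridge=vmbr0,ip=dhcp --unprivileged 1

pct start 201 && pct start 202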


Step 1: Deploy Ollama

Inside the first LXC container (Debian 13, 4GB RAM, 2 cores, 8GB swap):

curl -fsSL https://ollama.com/install.sh | sh
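
The install script registers Ollama as a systemd service listening on port 11434, so a quick sanity check right after installation looks like this:

systemctl status ollama                     # should report active (running)
curl http://127.0.0.1:11434/api/version     # the API replies with the installed version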

Pull some lightweight models to test:

ollama pull phi3
ollama pull llama3.2
ollama pull gemma3:270m
ollama pull tinyllama
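
Once the pulls finish, confirm what landed on disk and give one of the small models a quick prompt straight from the shell:

ollama list                                  # every downloaded model and its size on disk
ollama run gemma3:270m "Explain what an LXC container is in one sentence."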

You’ll also notice an “Arena” entry in the model picker later on; that’s the comparison/evaluation feature Open WebUI ships enabled by default, not something installed through Ollama.


Models I Installed

Here’s what I currently have on my Ollama LXC:

  • Phi-3 Mini (Q4) → good reasoning while still lightweight

  • Llama 3.2 (quantized) → higher accuracy, heavier

  • Gemma 3 270M → ultra-light, runs even on very small RAM

  • TinyLlama (latest) → small but useful for experiments

  • Arena → the side-by-side comparison feature Open WebUI enables by default for benchmarking/chat (not an Ollama model)

This mix gives me flexibility:

  • Use Gemma 3 270M or TinyLlama on very low resources.

  • Use Phi-3 Mini for balanced reasoning.

  • Use Llama 3.2 for better accuracy (but slower).
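
To see what each of these actually costs in memory on this kind of hardware, load one and check ollama ps; by default a model stays resident for about five minutes after the last request:

ollama run phi3 "Give me one sentence about Proxmox."
ollama ps                                    # loaded models, their memory size and CPU/GPU placement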

For more models visit: https://ollama.com/

Step 2: Deploy Open WebUI

Inside the second container (Debian 11, 1GB RAM, 1 core):

curl -fsSL https://get.openwebui.com | bash
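
Whichever way the installer ends up running Open WebUI (native service or Docker), it’s worth confirming that something is actually listening before moving on. 8080 is Open WebUI’s usual port and 3000 a common Docker mapping, but treat both as assumptions that depend on the install method:

ss -tlnp | grep -E ':(8080|3000)'            # Open WebUI typically listens on 8080 (or 3000 via Docker)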

Optional: enable swap to give it extra breathing room:

fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
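
If swapon succeeded (in an unprivileged LXC it may be refused, since swap there is normally allocated from the Proxmox side, like the 2GB in the specs above), one more line keeps the swapfile across reboots:

echo '/swapfile none swap sw 0 0' >> /etc/fstab
swapon --show                                # confirm the swap space is active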

Step 3: Connect Open WebUI to Ollama

  1. On the Ollama container, check if the API is running:

    systemctl status ollama
  2. On the Open WebUI container, configure the backend to point to Ollama’s IP (if the connection is refused, see the note after these steps):

    http://<ollama-container-ip>:11434
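
One gotcha before the two containers can actually talk: the Ollama installer binds the API to 127.0.0.1 only, so requests from the Open WebUI container will be refused until Ollama is told to listen on the container’s network interface. A minimal sketch using a systemd override on the Ollama container:

# Make the Ollama API listen on all interfaces, not just localhost
mkdir -p /etc/systemd/system/ollama.service.d
cat <<'EOF' > /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
EOF
systemctl daemon-reload
systemctl restart ollama

# From the Open WebUI container, this should now return the pulled models
curl http://<ollama-container-ip>:11434/api/tags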

Now the frontend can talk to the backend 🎉


Architecture Diagram

(image in the original post)

Sample

(image in the original post)

Conclusion

By running Ollama and Open WebUI on separate LXC containers, I now have a local AI setup that’s:

  • Lightweight (optimized for low resources)

  • Modular (frontend and backend separated)

  • Private (runs entirely in my Proxmox homelab)

This setup lets me experiment with models like Phi-3 Mini, Llama 3.2, Gemma 3 270M, and TinyLlama while keeping my system stable.

It’s a simple way to get started with local AI on homelab hardware without needing a GPU or big resources.
