Both can be installed easily with a container.

A good NVIDIA GPU is strongly recommended.

Intel and AMD GPUs are supposed to be supported as well, but I have an NVIDIA GPU, so that is what I will describe on this page.
| + | |||
| ===== Installation ===== | ===== Installation ===== | ||
| Line 14: | Line 19: | ||
<code bash>
useradd -d /data/llm openwebui
usermod -G video openwebui
mkdir /data/llm
chown openwebui: /data/llm
</code>
Adding the user to the **video** group is required to access the GPU, whether you use a container or not.

Open WebUI can be installed on bare metal, without containers, using //pip//, but due to its strict Python requirement (3.11 at the time of writing), this is not recommended (Gentoo already ships Python 3.13), and maintenance would be a pain since updates land almost daily.

Let's go with the containers way, using of course //podman//.

From [[https://...]]:
    ports:
      - 3081:11434
    devices:
      - nvidia.com/gpu=all
    annotations:
      run.oci.keep_original_groups: 1
    volumes:
      - /...
      - /...
    container_name: ollama
    # ...
    tty: true
    environment:
      ...
</code>
This setup pulls both Ollama and Open WebUI into the same container, which allows for seamless integration and keeps the server itself neatly organized.

It also lets you access your Ollama instance from //outside// the container on port 3081, which should **NOT** be exposed to the public internet.

===== GPU acceleration support =====

=== Install NVIDIA drivers & tools ===

Enable the NVIDIA card by adding this line:
<file - /etc/portage/make.conf>
VIDEO_CARDS="intel nvidia"
</file>
(of course, put the cards you actually have; I have both an Intel and an NVIDIA card). This step is probably not needed on a headless server, but having it defined ensures the GPU can be used in the future.
| + | |||
| + | Then disable the NVIDIA GUI tools, since the server is headless, put this into **/ | ||
| + | <file - nvidia> | ||
| + | x11-drivers/ | ||
| + | </ | ||
| + | |||
| + | Now emerge the required packages: | ||
| + | <code bash> | ||
| + | emerge -vp x11-drivers/ | ||
| + | </ | ||
| + | |||
| + | the **nvidia-drivers** is the actual driver, and **nvidia-container-toolkit** contains all the required files and stuff to enable passing the GPU to the container. More info can be found [[https:// | ||
| + | |||
| + | Now, check that the GPU is detected: | ||
| + | <code bash> | ||
| + | nvidia-smi | ||
| + | Mon Mar 2 16:34:45 2026 | ||
| + | [ ... lots of output with your GPU info, VRAM, etc... ] | ||
| + | </ | ||
| + | |||
| + | === Configure NVIDIA tools === | ||
| + | |||
| + | Disable cgroups (won't work for rootless podman) by editing the file / | ||
| + | < | ||
| + | [nvidia-container-cli] | ||
| + | ... | ||
| + | no-cgroups = true | ||
| + | ... | ||
| + | </ | ||
| + | leave the rest of the file untouched. | ||
| + | |||
| + | You need to generate a Common Device Interface (CDI) file which Podman will use to talk to the GPU (see [[https:// | ||
| + | <code bash> | ||
| + | nvidia-ctk cdi generate --output=/ | ||
| + | </ | ||
| + | |||
| + | you will need to **run again** the above command every time the NVIDIA drivers are updated. | ||
| + | |||
| + | At this point you should check the CDI is in place and working: | ||
| + | <code bash> | ||
| + | > nvidia-ctk cdi list | ||
| + | INFO[0000] Found 3 CDI devices | ||
| + | nvidia.com/ | ||
| + | nvidia.com/ | ||
| + | nvidia.com/ | ||
| + | </ | ||
| + | |||
| + | |||
| + | === Configure podman passtrough === | ||
| + | |||
| + | To support GPU acceleration you need the two lines indicated in the compose file above. | ||
| + | |||
| + | This one: | ||
| + | < | ||
| + | devices: | ||
| + | - nvidia.com/ | ||
| + | </ | ||
| + | tells podman to pass all the GPUs to the container. You can actually select which one (if you have more than one) by selecting the appropriate one in the output of: | ||
| + | <code bash> | ||
| + | nvidia-ctk cdi list | ||
| + | INFO[0000] Found 3 CDI devices | ||
| + | nvidia.com/ | ||
| + | nvidia.com/ | ||
| + | nvidia.com/ | ||
| + | </ | ||
| + | |||
| + | This line instead: | ||
| + | < | ||
| + | annotations: | ||
| + | run.oci.keep_original_groups: | ||
| + | </ | ||
| + | is required because the container will forget the additional groups (of which **video** is required to access the GPU), and this annotation passes to the container the additional groups as well. | ||
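As a quick sanity check on the host, you can verify that a user actually carries the **video** supplementary group. This helper is purely illustrative (the user and group names are the ones used on this page, nothing here is required by Open WebUI):

<code python>
import grp
import pwd

def user_in_group(user: str, group: str) -> bool:
    """True if `user` has `group` as its primary or a supplementary group."""
    g = grp.getgrnam(group)
    return user in g.gr_mem or pwd.getpwnam(user).pw_gid == g.gr_gid

if __name__ == "__main__":
    # Should print True after the `usermod -G video openwebui` above.
    print(user_in_group("openwebui", "video"))
</code>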
| + | |||
| + | |||
| + | === Test GPU in container === | ||
| + | |||
| + | After restarting the container, this commans (as openwebui user) will tell you that all is well: | ||
| + | <code bash> | ||
| + | su - openwebui | ||
| + | podman exec -it ollama nvidia-smi | ||
| + | [ ... output similar to above ... ] | ||
| + | </ | ||
    proxy_set_header ...;
    proxy_set_header ...;
    proxy_http_version 1.1;
    proxy_buffering off;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Accel-Internal /...;
    access_log off;
}
| - | |||
| include com.mydomain/ | include com.mydomain/ | ||
| } | } | ||
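The ''Connection $connection_upgrade'' header above relies on the usual //map// block at the ''http'' level of the nginx configuration; if you don't already have one, this standard idiom defines it:
<code nginx>
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
</code>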