services:open-webui [2026/03/03 08:56] (current) – willy
Both can be installed easily with a container.

A good NVIDIA GPU is strongly recommended. Intel and AMD GPUs are supposed to be supported as well, but I have an NVIDIA GPU, so that is what I will describe on this page.
Adding the user to the **video** group is required to access the GPU, whether you use a container or not.
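For example, assuming a dedicated **openwebui** user runs the containers (the username used later on this page), adding it to the group looks like this (run as root). Remember that group changes only take effect after logging out and back in:

<code bash>
# add the service user to the "video" supplementary group
usermod -aG video openwebui
# verify that the group is now listed
id openwebui
</code>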
Open WebUI can be installed on bare metal, without containers, using //pip//, but due to its strict Python requirement (3.11 at the time of writing), this is not recommended.
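For reference, the bare-metal route would look roughly like this, assuming a Python 3.11 interpreter is available on the system; the install path is just an example (a sketch, not the recommended way):

<code bash>
# create a dedicated virtualenv with Python 3.11
python3.11 -m venv /opt/open-webui
/opt/open-webui/bin/pip install open-webui
# start the service
/opt/open-webui/bin/open-webui serve
</code>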
Let's go the container way, using of course podman.
From [[https://
    devices:
      - nvidia.com/
    annotations:
      run.oci.keep_original_groups:
    volumes:
      - /
</file>
This setup will pull both Ollama and Open WebUI into the same container stack, which allows for seamless integration and neat organization on the server itself.
This setup will also let you access your Ollama instance from //outside// the container, on port 3081, which should **NOT** be exposed to the rest of your network.
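If you only need Ollama reachable from the host itself (for example, for the reverse proxy), one option is to bind the published port to the loopback interface in the compose file. This is a sketch, assuming Ollama's default internal port 11434:

<code yaml>
# publish Ollama on the host's loopback interface only,
# so other machines on the LAN cannot reach it directly
ports:
  - "127.0.0.1:3081:11434"
</code>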
| - | |||
| - | ===== Reverse Proxy ===== | ||
| - | |||
| - | Open WebUI can be hosted on subdomain, let's assume you choose **ai.mydomain.com**. | ||
| - | |||
| - | As usual you want it protected by the Reverse Proxy, so create the **ai.conf** file: | ||
| - | <file - ai.conf> | ||
| - | server { | ||
| - | server_name ai.mydomain.com; | ||
| - | listen 443 ssl; | ||
| - | listen 8443 ssl; | ||
| - | http2 on; | ||
| - | |||
| - | access_log / | ||
| - | error_log / | ||
| - | |||
| - | location / { # The trailing / is important! | ||
| - | proxy_pass | ||
| - | proxy_set_header | ||
| - | proxy_set_header | ||
| - | proxy_http_version 1.1; | ||
| - | proxy_buffering off; | ||
| - | proxy_set_header Upgrade $http_upgrade; | ||
| - | proxy_set_header Connection $connection_upgrade; | ||
| - | proxy_set_header X-Real-IP $remote_addr; | ||
| - | proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; | ||
| - | proxy_set_header X-Accel-Internal / | ||
| - | access_log off; | ||
| - | } | ||
| - | include com.mydomain/ | ||
| - | } | ||
| - | </ | ||
| - | add this config file to NGINX (see [[selfhost: | ||
| - | |||
| - | Now go with browser to **https:// | ||
===== GPU acceleration support =====
=== Install NVIDIA drivers ===
Enable the NVIDIA card by adding this line to **/etc/portage/make.conf**:
<code>
VIDEO_CARDS="intel nvidia"
</code>
(Of course, list the cards you actually have; I have both an Intel and an NVIDIA card.) This step is probably not needed on a headless server, but having it defined ensures it can be used in the future.
Then disable the NVIDIA GUI tools; since the server is headless, put this into **/
<code bash>
emerge -vp x11-drivers/nvidia-drivers app-containers/nvidia-container-toolkit
</code>
| + | |||
The **nvidia-drivers** package is the actual driver, while **nvidia-container-toolkit** contains all the required files to enable passing the GPU to the container. More info can be found at [[https://
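Before moving on, you can quickly confirm that both the driver and the toolkit are in place (version strings will differ on your system):

<code bash>
# the driver should report your GPU model
nvidia-smi --query-gpu=name --format=csv,noheader
# the container toolkit ships the nvidia-ctk helper
nvidia-ctk --version
</code>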
Now, check that the GPU is detected:
<code bash>
nvidia-smi
Mon Mar  2 16:34:45 2026
[ ... lots of output with your GPU info, VRAM, etc... ]
</code>
=== Configure NVIDIA tools ===

Disable cgroups (they won't work with rootless podman) by editing the file /
| < | < | ||
| [nvidia-container-cli] | [nvidia-container-cli] | ||
| Line 155: | Line 118: | ||
| ... | ... | ||
| </ | </ | ||
Leave the rest of the file untouched.
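To double-check the edit, grep for the option; this assumes the toolkit config lives at its default location, **/etc/nvidia-container-runtime/config.toml**:

<code bash>
# should print the line disabling cgroup setup
grep -n "no-cgroups" /etc/nvidia-container-runtime/config.toml
</code>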
You need to generate a Container Device Interface (CDI) specification, which Podman will use to talk to the GPU (see [[https://
<code bash>
nvidia-ctk cdi generate --output=/
INFO[0000] Found 3 CDI devices
nvidia.com/
nvidia.com/
nvidia.com/
</code>
=== Configure podman passthrough ===

To support GPU acceleration you need the two extra lines indicated in the compose file above.
| + | |||
| + | This one: | ||
| + | < | ||
| + | devices: | ||
| + | - nvidia.com/ | ||
| + | </ | ||
tells podman which CDI device to pass through to the container. You can list the available devices with:
<code bash>
nvidia-ctk cdi list
INFO[0000] Found 3 CDI devices
nvidia.com/
nvidia.com/
nvidia.com/
</code>
This line instead:
| + | < | ||
| + | annotations: | ||
| + | run.oci.keep_original_groups: | ||
| + | </ | ||
is required because the container would otherwise drop the user's supplementary groups (including **video**, which is required to access the GPU); this annotation passes those additional groups through to the container as well.
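You can verify that the supplementary groups survived by checking the identity inside the container (assuming the container is named **ollama**, as in the compose file above):

<code bash>
# "video" should appear in the groups list
podman exec -it ollama id
</code>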
| + | |||
| + | |||
| + | === Test GPU in container === | ||
| + | |||
After restarting the container, this command should work:
<code bash>
su - openwebui
podman exec -it ollama nvidia-smi
[ ... output similar to above ... ]
</code>
| + | |||
| + | |||
| + | ===== Reverse Proxy ===== | ||
| + | |||
| + | Open WebUI can be hosted on subdomain, let's assume you choose **ai.mydomain.com**. | ||
| + | |||
| + | As usual you want it protected by the Reverse Proxy, so create the **ai.conf** file: | ||
<file - ai.conf>
server {
    server_name ai.mydomain.com;
    listen 443 ssl;
    listen 8443 ssl;
    http2 on;

    access_log /
    error_log /

    location / { # The trailing / is important!
        proxy_pass
        proxy_set_header
        proxy_set_header
        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Accel-Internal /
        access_log off;
    }
    include com.mydomain/
}
</file>
Add this config file to NGINX (see [[selfhost:

Now point your browser to **https://
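The same check can also be done from the command line; this assumes DNS and the TLS certificate for **ai.mydomain.com** are already in place:

<code bash>
# expect a 200 (or a redirect to the login page)
curl -I https://ai.mydomain.com/
</code>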
| + | |||
===== Configuration =====