Hacker Public Radio

HPR4337: Open Web UI


This show has been flagged as Explicit by the host.

OpenWebUI notes ...

Open WebUI installer: https://github.com/freeload101/SCRIPTS/blob/master/Bash/OpenWebUI_Fast.bash

Older Professor synapse prompt you can use: https://raw.githubusercontent.com/freeload101/SCRIPTS/refs/heads/master/Prof%20Synapse%20Old.txt

Fabric prompts you can import into Open WebUI: https://github.com/freeload101/SCRIPTS/blob/master/MISC/Fabric_Prompts_Open_WebUI_OpenWebUI_20241112.json
(source patterns: https://github.com/danielmiessler/fabric/tree/main/patterns)

Example Windows Task Scheduler task (XML) that starts Kokoro at boot and keeps it from dying: https://github.com/freeload101/SCRIPTS/blob/master/MISC/StartKokoro.xml

Open WebUI RAG fail sauce ... https://youtu.be/CfnLrTcnPtY

Open registration 


Model list / order

NAME                                                   ID              SIZE      MODIFIED
hf.co/mradermacher/L3-8B-Stheno-v3.2-i1-GGUF:Q4_K_S    017d7a278e7e    4.7 GB    2 days ago
qwen2.5:32b                                            9f13ba1299af    19 GB     3 days ago
deepsex:latest                                         c83a52741a8a    20 GB     3 days ago
HammerAI/openhermes-2.5-mistral:latest                 d98003b83e17    4.4 GB    2 weeks ago
Sweaterdog/Andy-3.5:latest                             d3d9dc04b65a    4.7 GB    2 weeks ago
nomic-embed-text:latest                                0a109f422b47    274 MB    2 weeks ago
deepseek-r1:32b                                        38056bbcbb2d    19 GB     4 weeks ago
psyfighter2:latest                                     c1b3d5e5be73    7.9 GB    2 months ago
CognitiveComputations/dolphin-llama3.1:latest          ed9503dedda9    4.7 GB    2 months ago



Disable Arena models

Documents: WIP. RAG is not good.


Discord notes:

https://discord.com/channels/1170866489302188073/1340112218808909875


  • Abhi Chaturvedi:  @(Operat0r) try this. To reduce latency
    and improve accuracy, modify the .env file:
      # Enable RAG
      ENABLE_RAG=true
      # Use hybrid mode (retrieval + reranking for better context)
      RAG_MODE=hybrid
      # Reduce the number of retrieved documents (default: 5)
      RETRIEVAL_TOP_K=3
      # Use a fast embedding model (instead of OpenAI's Ada-002)
      EMBEDDING_MODEL=all-MiniLM-L6-v2  # Faster and lightweight
      # Optimize the vector database
      VECTOR_DB_TYPE=chroma
      CHROMA_DB_IMPL=hnsw  # Faster search
      CHROMA_DB_PATH=/root/open-webui/backend/data/vector_db
      # Optimize backend performance:
      # increase Uvicorn worker count (improves concurrency)
      UVICORN_WORKERS=4
      # Increase FastAPI request timeout (prevents RAG failures)
      FASTAPI_TIMEOUT=60
      # Optimize database connection pool (for better query performance)
      SQLALCHEMY_POOL_SIZE=10
  • JamesK:  So probably the first thing to do is increase the top K
    value in admin -> settings -> documents, or you could try the
    new "full context mode" for rag documents. You may also need
    to increase the context size on the model, but it will make it
    slower, so you probably don't want to do that unless you start
    seeing the "truncating input" warnings.
  • JamesK:  Ah, I see. The rag didn't work great for you in
    this prompt. There are three hits and the first two are
    duplicates, so there isn't much data for the model to work
    with
  • [9:12 PM] JamesK:  context section
  • I see a message warning that you are using the default 2048
    context length, but not the message saying you've hit that
    limit. From my logs, the warning looks like:
      level=WARN source=runner.go:126 msg="truncating input prompt" limit=32768 prompt=33434 numKeep=5
  • [6:06 AM] JamesK:  If you set the env var OLLAMA_DEBUG=1
    before running ollama serve it will dump the full prompt being
    sent to the model, that should let you confirm what the rag
    has put in the prompt
  • JamesK: Watch the console output from ollama and check for
    warnings about overflowing the context. If you have the
    default 2k context you may need to increase it until the
    warnings go away
  • [8:58 PM] JamesK:  But also, if you're using the default
    rag, it chunks the input into small fragments, then matches
    the fragments against your prompt and only inserts a few
    fragments into the context, not the entire document. So it's
    easily possible for the information you want to not be
    present.
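
The truncation check JamesK describes can be sketched as a small log filter. This is a minimal sketch, assuming the Ollama warning has exactly the format quoted in the chat above (other Ollama versions may log it differently); the function name check_truncation is made up for illustration.

```shell
#!/usr/bin/env bash
# Report "truncating input prompt" warnings found in Ollama server output.
# Assumption: the log line format matches the sample quoted in the chat log.

# Reads log text on stdin and prints one summary line per warning,
# showing the context limit, the prompt size, and the overflow in tokens.
check_truncation() {
  grep 'msg="truncating input prompt"' |
    sed -n 's/.*limit=\([0-9][0-9]*\) prompt=\([0-9][0-9]*\).*/\1 \2/p' |
    while read -r limit prompt; do
      echo "limit=$limit prompt=$prompt over=$((prompt - limit))"
    done
}

# Sample line taken from the chat log above.
sample='level=WARN source=runner.go:126 msg="truncating input prompt" limit=32768 prompt=33434 numKeep=5'
printf '%s\n' "$sample" | check_truncation
```

One way to use it would be to capture the server output with `OLLAMA_DEBUG=1 ollama serve 2>&1 | tee ollama.log`, send a prompt, and then run `check_truncation < ollama.log`. If nothing is printed, the prompt fit in the context window.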


    Auto updates
    echo '0,12 */4 * * * root docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui' >> /etc/crontab
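
System crontab files such as /etc/crontab and those under /etc/cron.d expect a user name between the schedule and the command. A minimal sketch of generating the entry with that field, keeping the same schedule and Watchtower command as the notes; the function name and the /etc/cron.d file name in the comment are hypothetical.

```shell
#!/usr/bin/env bash
# Print a system-crontab entry that runs Watchtower once against a container.
# $1 = container name (open-webui in the notes above).
watchtower_cron_line() {
  # Fields: minute hour day-of-month month day-of-week user command
  printf '%s\n' "0,12 */4 * * * root docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once $1"
}

watchtower_cron_line open-webui
# To install it (as root), writing a dedicated file avoids duplicate
# entries when the command is run more than once:
#   watchtower_cron_line open-webui > /etc/cron.d/openwebui-watchtower
```

Writing to a separate file instead of appending with `>>` means the update job can be changed or removed without editing /etc/crontab by hand.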
    Search

    red note for API keys 


    1. Go to Google Developers, use Programmable Search Engine, and
       log on or create an account.
    2. Go to the control panel and click the Add button.
    3. Enter a search engine name, set the other properties to suit
       your needs, verify you're not a robot, and click the Create
       button.
    4. Generate an API key and get the Search engine ID (available
       after the engine is created).
    5. With the API key and Search engine ID, open the Open WebUI
       Admin panel, click the Settings tab, and then click Web Search.
    6. Enable Web search and set Web Search Engine to google_pse.
    7. Fill Google PSE API Key with the API key and Google PSE Engine
       Id with the Search engine ID from step 4.
    8. Click Save.

    Note: You have to enable Web search in the prompt field, using
    the plus (+) button. Search the web ;-)
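
Before pasting the API key and Search engine ID into Open WebUI, it can help to check them directly against Google's Custom Search JSON API. A minimal sketch that builds the request URL; the function name and placeholder values are made up, and only spaces in the query are encoded here.

```shell
#!/usr/bin/env bash
# Build a Custom Search JSON API request URL for a quick key/ID check.
# $1 = API key, $2 = Search engine ID, $3 = query.
# Assumption: the query contains only letters, digits, and spaces;
# spaces are turned into "+", nothing else is URL-encoded.
pse_url() {
  local query
  query=$(printf '%s' "$3" | sed 's/ /+/g')
  printf 'https://www.googleapis.com/customsearch/v1?key=%s&cx=%s&q=%s\n' "$1" "$2" "$query"
}

# Placeholder values; substitute your own key and engine ID.
pse_url 'YOUR_API_KEY' 'YOUR_ENGINE_ID' 'open webui'
# To actually send the request:
#   curl -s "$(pse_url "$KEY" "$CX" 'open webui')"
```

A working key and engine ID should return JSON with an "items" list; a bad key returns a JSON "error" object instead, which is easier to read here than in the Open WebUI logs.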


      Kokoro / Open WebUI

      https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

      https://github.com/remsky/Kokoro-FastAPI?tab=readme-ov-file

      sudo apt update
      sudo apt upgrade
      # Add the NVIDIA Container Toolkit signing key and repository
      curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey |
        sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg &&
        curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list |
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' |
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
      # Enable the experimental packages in that repository
      sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
      sudo apt-get update
      sudo apt-get install -y nvidia-container-toolkit
      # Install Docker and start Kokoro-FastAPI with GPU access on port 8880
      sudo apt install -y docker.io
      sudo docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2


      Open WebUI text-to-speech settings:

      API base URL: http://localhost:8880/v1

      Voice: af_bella
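
To confirm the Kokoro container answers before pointing Open WebUI at it, a request can be sent by hand. This is a minimal sketch that assumes Kokoro-FastAPI exposes an OpenAI-style /audio/speech endpoint under the base URL above and accepts "kokoro" as the model name; the function name and output file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Build the JSON body for an OpenAI-style /audio/speech request.
# $1 = text to speak, $2 = voice name.
# Assumptions: the model name "kokoro" and the "response_format" field
# follow the OpenAI-compatible API; the text must not contain double
# quotes or backslashes, since this sketch does no JSON escaping.
tts_payload() {
  printf '{"model":"kokoro","input":"%s","voice":"%s","response_format":"mp3"}\n' "$1" "$2"
}

tts_payload 'Hello from Open WebUI' 'af_bella'
# To send it to the container started above and save the audio:
#   curl -s http://localhost:8880/v1/audio/speech \
#     -H 'Content-Type: application/json' \
#     -d "$(tts_payload 'Hello from Open WebUI' 'af_bella')" -o hello.mp3
```

If hello.mp3 plays, the same base URL and voice should work in the Open WebUI audio settings.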

      Import fabric prompts

      https://raw.githubusercontent.com/freeload101/Python/46317dee34ebb83b01c800ce70b0506352ae2f3c/Fabric_Prompts_Open_WebUI_OpenWebUI.py


