This story was originally published on HackerNoon at: https://hackernoon.com/local-llms-need-more-than-openai-compatible-endpoints.
Respawn is a stateful OpenAI Responses API gateway for local LLMs that adds stored responses, tools, streaming, files, and observability to Ollama.
This story was written by: @robertomanfreda.
Local LLM servers are great at generating tokens, but modern clients expect more than inference: stored state, response lifecycle endpoints, the expected streaming event shape, a tool-calling protocol, file handling, and metrics. Respawn is an open-source gateway that sits in front of Ollama and other self-hosted backends and adds OpenAI Responses API semantics locally.
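
To make "state" concrete, here is a minimal sketch of a two-turn exchange in the style of the OpenAI Responses API, where the second request refers to the first by its ID instead of resending the conversation. The `model`, `input`, `store`, and `previous_response_id` fields come from the OpenAI Responses API. The gateway address, the model name, and the helper names are assumptions made for illustration, not details taken from Respawn's documentation.

```python
import json
import urllib.request

# Assumption: the gateway listens here. The summary gives no host or port.
BASE_URL = "http://localhost:8080/v1"


def build_request(model, user_input, previous_response_id=None):
    """Build a Responses API request body.

    store=True asks the server to keep the response so that a later
    request can reference it through previous_response_id.
    """
    body = {"model": model, "input": user_input, "store": True}
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    return body


def create_response(body, base_url=BASE_URL):
    """POST the body to <base_url>/responses and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def two_turn_demo(model="llama3"):
    """Hypothetical usage: needs a running gateway and a model it serves."""
    first = create_response(build_request(model, "Name three sorting algorithms."))
    # The follow-up carries only the new input plus the stored response ID.
    follow_up = build_request(model, "Which one is stable?", first["id"])
    return create_response(follow_up)
```

A plain inference endpoint would need the client to resend the full message history on every turn; with stored responses, the server resolves the earlier context from the ID.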