Show HN: GoModel – an open-source AI gateway in Go
Posted by santiago-pl 9 hours ago
Hi, I’m Jakub, a solo founder based in Warsaw.
I’ve been building GoModel since December with a couple of contributors. It's an open-source AI gateway that sits between your app and model providers like OpenAI, Anthropic or others.
I built it for my startup to solve a few problems:
- track AI usage and cost per client or team
- switch models without changing app code
- debug request flows more easily
- reduce AI spending with exact and semantic caching
How is it different?
- ~17 MB Docker image (LiteLLM's image is more than 44x bigger: "docker.litellm.ai/berriai/litellm:latest" is ~746 MB on amd64)
- the request workflow is visible and easy to inspect
- config is environment-variable-first by default
I'm posting now partly because of the recent LiteLLM supply-chain attack. Their team handled it impressively well, but some people are looking at alternatives anyway, and GoModel is one.

Website: https://gomodel.enterpilot.io
Any feedback is appreciated.
Comments
Comment by neilly 2 hours ago
Comment by santiago-pl 18 minutes ago
Comment by nzoschke 5 hours ago
I'm all in on Go and integrating AI up and down our systems for https://housecat.com/ and am currently familiar and happy with:
https://github.com/boldsoftware/shelley -- full Go-based coding agent with LLM gateway.
https://github.com/maragudk/gai -- provides Go interfaces around Anthropic / OpenAI / Google.
Adding this to the list as well as bifrost to look into.
Any other Go-based AI / LLM tools folks are happy with?
I'll second the request to add support for harnesses with subscriptions, specifically Claude Code, into the mix.
Comment by ewhauser421 4 hours ago
It's a just-bash like variant implemented in Go. Useful for giving a managed bash tool to your agents without a full sandboxing solution.
Comment by santiago-pl 5 hours ago
However, it might be challenging, considering that Claude Code with a subscription no longer officially works with OpenClaw.
Comment by nzoschke 3 hours ago
Perhaps fully automated use is where the line is drawn.
But I also suspect individuals using it for light automated dispatching would be ok too.
Comment by arcanemachiner 4 hours ago
Comment by lackoftactics 1 hour ago
Comment by rolls-reus 4 hours ago
Comment by verdverm 2 hours ago
I can throw my hat into the ring, built on ADK, CUE, and Dagger (all also in Go); CLI, TUI, and VSCode interfaces. It's my personal / custom stack, still need to write up docs. My favorite features are powered by Dagger, sandbox with time travel, forking, jump into shell at any turn, diff between any points.
Good entrypoint folder: https://github.com/hofstadter-io/hof/tree/_next/lib/agent
Comment by hgo 3 hours ago
One problem I have: yes, LiteLLM key creation is easier than creating keys directly at the providers and managing them there for team members and test environments, but if I had a way to generate keys via Vault, it would be perfect and a relief in many ways.
I see what I need on your roadmap, but I miss an integration with a service where I can inspect and debug completion traffic, and I can't tell whether I'd be able to track usage from individual end-users through a header.
Thank you and godspeed!
Comment by santiago-pl 2 hours ago
Currently we have a unified concept of User-Paths. Once you add a specific header OR assign a User-Path to an API key, you can track usage based on it. The User-Path might be your end-user, an internal user, or a service. Examples:
/client1/app1
/agents/agent1
/team2/john
/team2/adam
Would this work for you?

https://gomodel.enterpilot.io/docs/features/user-path
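The hierarchical paths above suggest that usage can be rolled up by prefix (per client, per team, or per individual). A hypothetical illustration of the idea, not GoModel's actual API:

```python
from collections import defaultdict

# Hypothetical usage records: (user_path, cost_in_usd).
records = [
    ("/client1/app1", 0.42),
    ("/agents/agent1", 1.10),
    ("/team2/john", 0.25),
    ("/team2/adam", 0.33),
]

def rollup(records: list[tuple[str, float]], depth: int) -> dict[str, float]:
    """Aggregate cost by the first `depth` path segments."""
    totals: dict[str, float] = defaultdict(float)
    for path, cost in records:
        segments = path.strip("/").split("/")
        prefix = "/" + "/".join(segments[:depth])
        totals[prefix] += cost
    return dict(totals)

print(rollup(records, 1))  # per client/team/agent group
print(rollup(records, 2))  # per individual path
```

Rolling up at depth 1 would group `/team2/john` and `/team2/adam` under `/team2`, which is the kind of per-team view the parent comment is asking about.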
PS Thanks for the feedback on the Vault integration. Noted.
Comment by pizzafeelsright 6 hours ago
Governance is the biggest concern at this point - with proper logging, and integration with 3rd-party services that provide inspection and DLP-type threat mitigation.
Comment by ijk 27 minutes ago
Comment by crawdog 6 hours ago
https://sbproxy.dev - engine is fully open source.
Another reason Go is interesting for a gateway is having clear control of the supply chain at compile time. With tools like LiteLLM, supply-chain attacks can have more impact at runtime, whereas a compiled binary helps.
Comment by lackoftactics 5 hours ago
Comment by crawdog 5 hours ago
Comment by mosselman 6 hours ago
What I'd like is for a proxy or library to provide a truly unified API where it will really let me integrate once and then never have to bother with provider quirks myself.
Also, are you planning an open-source rug pull like so many projects out there, including LiteLLM?
Comment by santiago-pl 6 hours ago
2. Regarding being open-source and the license, I've described our approach here transparently: https://gomodel.enterpilot.io/docs/about/license
Comment by sowbug 6 hours ago
(I'm asking only about the compatibility layer; the other tracking features would be useful even if there were only one cloud LLM API.)
Comment by simonw 5 hours ago
The best effort we have to defining a standard is OpenAI harmony/responses - https://developers.openai.com/cookbook/articles/openai-harmo... - but it's not seen much pickup. The older OpenAI Chat Completions thing is much more of an ad-hoc standard - almost every provider ends up serving up a clone of that, albeit with frustrating differences because there's no formal spec to work against.
The key problem is that providers are still inventing new stuff, so committing to a standard doesn't work for them because it may not cover the next set of features.
2025 was particularly turbulent because everyone was adding reasoning mechanisms to their APIs in subtly different shapes. Tool calls and response schemas (which are confusingly not always the same thing) have also had a lot of variance - some providers allow for multiple tool calls in the same response, for example.
My hunch is we'll need abstraction layers for quite a while longer, because the shape of these APIs is still too frothy to support a standard that everyone can get behind without restricting their options for future products too much.
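To make the variance concrete: OpenAI's Chat Completions format puts tool calls in `message.tool_calls` with JSON-encoded string `arguments`, while Anthropic's Messages API emits `tool_use` content blocks with an already-parsed `input` dict. A toy normalizer sketching the kind of shimming an abstraction layer does (not any particular gateway's code):

```python
import json

def normalize_tool_calls(provider: str, message: dict) -> list[dict]:
    """Return tool calls as [{'name': ..., 'arguments': dict}] regardless of shape."""
    calls = []
    if provider == "openai":
        # OpenAI: tool_calls = [{"function": {"name": ..., "arguments": "<json str>"}}]
        for tc in message.get("tool_calls") or []:
            fn = tc["function"]
            calls.append({"name": fn["name"], "arguments": json.loads(fn["arguments"])})
    elif provider == "anthropic":
        # Anthropic: content is a list of blocks; tool calls have type "tool_use"
        for block in message.get("content") or []:
            if block.get("type") == "tool_use":
                calls.append({"name": block["name"], "arguments": block["input"]})
    else:
        raise ValueError(f"unknown provider: {provider}")
    return calls

openai_msg = {"tool_calls": [{"id": "call_1", "type": "function",
              "function": {"name": "get_weather", "arguments": '{"city": "Warsaw"}'}}]}
anthropic_msg = {"content": [{"type": "text", "text": "Let me check."},
                 {"type": "tool_use", "id": "toolu_1", "name": "get_weather",
                  "input": {"city": "Warsaw"}}]}

assert normalize_tool_calls("openai", openai_msg) == normalize_tool_calls("anthropic", anthropic_msg)
```

Multiply this by streaming deltas, parallel tool calls, and reasoning blocks, and the maintenance burden the thread describes becomes clear.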
Comment by harikb 6 hours ago
For example `Claude code` used to set 2 specific beta headers with some version numbers for their Max subscription to be supported.
OAuth tokens for the Max plan are different from their API keys. They look similar, but have a specific prefix that these tools pre-validate.
It is barely working at this point, even within a single provider.
Comment by jedisct1 3 hours ago
It’s not just about incompatible APIs, but also about how messages are structured. Even getting reliable tool calling requires a significant amount of work and testing for each individual model.
Just look at LiteLLM’s commit history and open issues/PRs. They’re still struggling with reliable multi-turn tool calling for Gemini, Kimi requires hardcoded rules (so K2.6 is currently unsupported because it’s not on the list), and so on.
Implementing the basic, generic OpenAI/Anthropic protocols is trivial, and at that point it almost feels like building an AI gateway is done. But it isn’t — that’s just the beginning of a long journey of constantly dealing with bugs, changes, and the quirks of each provider and model.
Comment by glerk 5 hours ago
How do you plan on keeping up with upstream changes from the API providers? I have implemented something similar, and the biggest issue I have faced with Go is that providers don't usually have SDKs (compared to JavaScript and Python), and there is work involved in staying up to date with each release.
Comment by santiago-pl 3 hours ago
Therefore there's a good chance that if they make a minor API-level change, GoModel will handle it without any code changes.
Also, changes to providers' API formats might be less and less frequent. Keeping up typically means adding a few lines of code per month. I'm usually aware of those changes because I use LLMs daily and follow the news in a few places.
As a fallback, GoModel includes a passthrough API that forwards your request to the provider in its original format. That might be useful when an AI provider changes their contract significantly and we haven't caught up yet.
Also, official SDKs aren't bug-free either. Skipping that extra layer and hitting the API directly might actually be beneficial for GoModel.
Comment by vorticalbox 5 hours ago
Comment by lackoftactics 5 hours ago
Comment by swyx 4 hours ago
Comment by lackoftactics 4 hours ago
Comment by swyx 4 hours ago
Comment by lackoftactics 4 hours ago
Comment by santiago-pl 3 hours ago
Comment by pjmlp 8 hours ago
However kudos for the project, we need more alternatives in compiled languages.
Comment by smcleod 2 hours ago
Comment by santiago-pl 2 minutes ago
Comment by santiago-pl 7 hours ago
Comment by goodkiwi 6 hours ago
Comment by Talderigi 8 hours ago
Comment by giorgi_pro 7 hours ago
Comment by driese 6 hours ago
Comment by devmor 6 hours ago
Comment by santiago-pl 5 hours ago
docker run --rm -p 8080:8080 \
-e OPENAI_API_KEY="some-vllm-key-if-needed" \
-e OPENAI_BASE_URL="http://host.docker.internal:11434/v1" \
...
enterpilot/gomodel
I'll add a more convenient way to configure it in the coming days.

Comment by indigodaddy 7 hours ago
Comment by santiago-pl 7 hours ago
It looks like a useful feature to have. Therefore, I'll dig into this topic more broadly over the next few days and let you know here whether, and possibly when, we plan to add it.
Comment by tahosin 7 hours ago
One thing I'd love to see is built-in cost tracking per model/route. When you're mixing free and paid models, knowing exactly where your spend goes is critical. Do you have plans for that in the dashboard?
Comment by santiago-pl 7 hours ago
However, IIUC what you're asking for, it's already in the dashboard! Check the Usage page.
Comment by immanuwell 4 hours ago
Comment by rvz 7 hours ago
Are there even any benchmarks?
Comment by lackoftactics 5 hours ago
Other than that, the speed advantage is almost useless here, since this workload will be I/O-bound, not CPU-bound.
Comment by eikenberry 4 hours ago
Comment by lackoftactics 4 hours ago
What I noticed: the website shows a diagram of the LiteLLM SDK communicating with GoModel's gateway proxy, the benchmarks are poorly designed, and the scope of the project in the README doesn't match its depth.
I don't have professional experience in Go, so I won't comment on code quality.
There are some genuinely good things about this project and the effort here, but with Bifrost solidly positioned at a version above 1.0.0 and so many other initiatives in this space, it's a tough market.
Comment by santiago-pl 3 hours ago
You can use it like this:

    from litellm import completion

    print(completion(
        model="openai/gpt-4.1-nano",
        api_base="http://localhost:8080/v1",
        api_key="your-gomodel-key",
        messages=[{"role": "user", "content": "hi"}],
    ).choices[0].message.content)

Comment by lackoftactics 2 hours ago
Comment by phoenixranger 3 hours ago
Comment by anilgulecha 8 hours ago
Comment by santiago-pl 8 hours ago
It's more lightweight and simpler. The Bifrost docker image looks 4x larger, at least for now.
IMO GoModel is more convenient for debugging and for seeing how your request flows through different layers of AI Gateways in the Audit Logs.
Comment by anilgulecha 7 hours ago
Comment by santiago-pl 7 hours ago
Comment by antonvs 4 hours ago
Comment by santiago-pl 8 minutes ago
Comment by rpdaiml 4 hours ago
Comment by pukaworks 6 hours ago