Zeroserve: A zero-config web server you can script with eBPF
Posted by losfair 3 days ago
Comments
Comment by password4321 3 days ago
Edit: it seems I'm just falling behind and the new hotness is https://www.http-arena.com/leaderboard/. Good luck!
Comment by winter_blue 3 days ago
Comment by password4321 3 days ago
Sunsetting the Techempower Framework Benchmarks
Comment by dakolli 3 days ago
Comment by MDA2AV 2 days ago
Comment by dakolli 2 days ago
Optional pagination would be nice. I would use a zebra pattern on the row backgrounds and larger row heights, but that's just me, and I'll admit making tables with lots of data look nice is hard. I'm also not a designer :/
The problem is when I see this color palette I assume everything is vibe coded and then I have a hard time taking the rest of the project seriously / trusting it. I'm not sure why every LLM-generated UI uses those colors, but they do. I'm not sure how it ended up so concretely in their training data either because this wasn't ever the norm. I think the models just start with Red Blue Green Yellow and then expect the user to adjust from there, idk.
Comment by MDA2AV 1 day ago
Comment by password4321 2 days ago
Host a design competition and pay contestants with exposure! ;)
Comment by MDA2AV 1 day ago
Comment by password4321 2 days ago
Comment by j_clark 3 days ago
Comment by jarym 3 days ago
My takeaway from this though is that nginx is pretty impressive on its own. Also this stuck out:
It's meant to be an alternative to nginx and Caddy, and the design bet is about configuration. Those servers give you a declarative config language - location blocks, rewrite rules, map directives, try_files - and then, once the declarative language hits its limits, an optional scripting runtime bolted on the side (Lua, or Caddy's plugins). Behavior ends up split across two layers: directives that quietly grow their own control flow, plus scripts that run somewhere in the request lifecycle you have to keep in your head.
I think the bet is misplaced - people prefer configuration over code and long have. The built-ins meet enough people's needs entirely and they don't need to write C code.
Comment by BobbyTables2 3 days ago
Seems like every configuration file format starts off simple. Look at YAML - the basics started off pretty sensibly.
And then people decided they wanted to get fancier with anchors and aliases. Even GitLab has its own form of conditionals and variables, which is all a bit of a hack (only works in certain places).
Even Apache fell into this with its XML based config format.
So we end up with numerous “bespoke” programming languages for configuration management. Of course enterprise people don’t edit these directly - they script Ansible workflows to remotely perform the surgery.
Sadly, we could have skipped all that and just embedded a Lua/Python/etc. interpreter into servers to do the configuration management. That would be simpler than trying to programmatically edit bespoke config files.
Sure, one will say all the bespoke attempts are optimized for a specific use the way a general language isn’t. Except that only fits a narrow class of toy examples which wouldn’t have needed their machinery in the first place!
Remember Windows INI files? Back in the good ol’ days when code was code and data was data….
Comment by jasonjayr 3 days ago
Comment by NewJazz 3 days ago
Comment by lelandbatey 3 days ago
Comment by high_priest 3 days ago
Comment by ai_fry_ur_brain 3 days ago
Comment by simonw 3 days ago
Comment by zuzululu 3 days ago
LLMs enable a lot of good output if you know what you are doing
Comment by ai_fry_ur_brain 3 days ago
Comment by zuzululu 3 days ago
Comment by antonvs 3 days ago
Comment by bflesch 3 days ago
I can accept it if stuff is vibe coded and has an autogenerated README. But even the announcement blogpost is AI-generated, and I personally have zero data points to see if your understanding of software quality is the same as mine.
It's a weird world: if this had been announced without any AI disclaimers some years earlier I would've eaten it up without a doubt. But right now if I see a fancy README with several good-looking command line parameters I immediately wonder if the README is hallucinated and whether the command line parameters actually exist.
Comment by losfair 3 days ago
Comment by rpdillon 3 days ago
FWIW, I like the writeup and concept behind this. Very close to some passions of mine (like serving a website from a single-file archive).
Comment by bflesch 3 days ago
Comment by iririririr 3 days ago
Comment by gigatexal 3 days ago
Small static file (174 B) - the bread and butter of static sites:
server      req/s    p99
zeroserve   36,681   5.4 ms
nginx       31,226   7.8 ms
Caddy       12,830   22 ms
zeroserve serves small files about 17% faster than nginx on a single core, with a tighter tail. HTML pages, small JSON, CSS - this is the case zeroserve is tuned for.
Large static file (100 KB):
server      req/s   throughput   p99
zeroserve   8,000   782 MB/s     22 ms
nginx       7,600   773 MB/s     28 ms
Caddy       6,084   590 MB/s     44 ms
I'd go with a more storied project that's been audited, battle tested, hardened etc than this upstart. There's not enough improvement to justify the risk.
Comment by tadfisher 3 days ago
Comment by antonvs 3 days ago
I could totally see "Small static files are the bread and butter of static sites" appearing in some pointless deck on a Zoom call.
Comment by shevy-java 3 days ago
Yeah, that is unfortunate. Recently there was this ffmpeg-wasm project. I tested it. It worked. But it was vibe-coded AI. I can't stand AI. Even if things work.
I decided to stay in the oldschool era as much as possible. Clever people publish software. Clever people maintain software. They don't need AI. That's my niche.
We may die out but I still prefer that. (Oh, and only if these clever people write documentation. Many clever people hate writing documentation. I decided a long time ago that if software comes without documentation, it is not worth my time, no matter how great that software is. This refers mostly to the application side; I only rarely looked at the Linux documentation, but others stated that it is not too terrible either, so who knows.)
Comment by mmastrac 3 days ago
I think I'd feel more comfortable if I could drop an .rs file into the eBPF dir instead of a .c one. It's already a Rust project! :)
And for some reason I was expecting this to be a kernel-accelerated webserver - if that could be done safely using eBPF that would be amazing!
Also, single-threaded? Forking and sharing an incoming connection queue is basically trivial on Linux, that should be literally just a few lines, even with Rust. Use SO_REUSEPORT and the kernel will do the rest.
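For anyone who hasn't used it, here is a rough sketch of the SO_REUSEPORT pattern, written in Python rather than Rust only to keep it short. This is not zeroserve code: the names `reuseport_listener`, `serve_forever` and `run_workers` are made up for illustration, and the fixed response is a toy. Each forked worker opens its own listening socket on the same port and the kernel spreads incoming connections across them.

```python
import os
import socket


def reuseport_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a listening TCP socket that can share `port` with sibling sockets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEPORT has to be set on every socket before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def serve_forever(sock: socket.socket) -> None:
    """Toy accept loop: answer every connection with a fixed response."""
    body = f"hello from pid {os.getpid()}\n".encode()
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    ).encode()
    while True:
        conn, _ = sock.accept()
        with conn:
            conn.recv(65536)  # read and ignore the request
            conn.sendall(head + body)


def run_workers(port: int, workers: int) -> None:
    """Fork `workers` children; each binds its own SO_REUSEPORT socket."""
    for _ in range(workers):
        if os.fork() == 0:
            try:
                serve_forever(reuseport_listener(port))
            finally:
                os._exit(0)
    for _ in range(workers):
        os.wait()  # parent just waits for the children
```

Calling `run_workers(8080, 4)` on Linux would start four processes accepting on port 8080 (the port and worker count are arbitrary). The alternative is to bind once and let the forked children inherit the same socket; the per-worker socket version shown here gives each process its own accept queue.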
FWIW, if you're going to push for io_uring, you should also be pushing kTLS IMO, you'll drastically simplify your design if you can avoid pumping userspace SSL after the handshake.
Comment by losfair 3 days ago
Will implement forking + SO_REUSEPORT. I've been using nftables for things like this so haven't needed it for myself yet :)
Comment by Woodi 2 days ago
The code, as it is today, looks [according to benchmarks] better than nginx, except in one case!
There is fcgi in, right? So all those additional processes are already started in the backend. If the benchmarks are real, there is no need to complicate the code before some industry adoption. Of course there can be a branch to check possibilities :)
And forking is complicated and full of surprising traps. Even if they are somewhat "standard" historic Unix traps... Case study: Perl - better not to use fork there, even if "threads" are in.
Comment by opem 3 days ago
Comment by tekacs 3 days ago
https://github.com/tekacs/zeroserve/commit/b33f261615d20d55b...
It does leave me wondering about other runtimes that could be used as the go-between though, because at the point of compiling Rust, an approach like Cloudflare's Pingora (https://github.com/cloudflare/pingora) which I've tried using before... in _theory_ should be a 'nicer' solution - just historically awkward when I've tried using it the way that I'd have liked. Wish it were more library-shaped!
Comment by andrewstuart 3 days ago
The real question is developer commitment and community - the Caddy and Nginx people have worked constantly on supporting their products. It’s going to take a lot of focus and attention.
Comment by opem 3 days ago
Very interesting idea and thanks for the no bs benchmarks! I wonder if this architecture could be ported to webservers with dynamic content/logic, too.
Comment by razighter777 3 days ago
Comment by mmarian 3 days ago
Comment by dwedge 3 days ago
Comment by arcanemachiner 3 days ago
If this piques your interest, make sure to check out the portable C library used to create it, which is also fascinating:
Comment by rpdillon 3 days ago
Comment by mmarian 3 days ago
Comment by dwedge 3 days ago
Comment by lmc 3 days ago
Comment by mmarian 3 days ago
Comment by lmc 3 days ago
Comment by Fordec 3 days ago
Comment by romania1 2 days ago
Comment by rashkov 3 days ago
Comment by cwillu 3 days ago
Comment by rpdillon 3 days ago
Back in school, I worked on a project called Velox, with a partner - the idea was to take a bz2-compressed dump of the giant XML export of wikipedia, and write a program to serve that copy of wikipedia from disk (this was in 2008-2010? in my master's program, so before Kiwix and the amazing zim dumps they produce). My partner worked on the UI and indexing, and I was focusing on how to parse the bz2 compression format to locate article boundaries in the (giant) XML dump that Wikipedia provides. I ended up putting a lot of time into it because it was a bunch of fun.
Writing this just sent me back to the presentation I made. The slide I wrote back then said:
> Significant original work went into creation of archive access. The Apache BZip2 library that is part of Ant was used as a basis for archive access.
> Modified to support random access to a given byte/bit offset pair within the compressed data stream (BZip2 is not a byte-aligned format)
> Extended to index all BZip2 block positions, allowing Java-based pseudo-random access to BZip2 compressed data
> Extended to map article IDs to block numbers for constant-time article retrieval, even in BZip2 archives exceeding 5GB in size
> Current article retrieval times are ~2 seconds.
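To give a flavor of the "index all block positions" step, here is a minimal Python sketch (not the original Java code, and `find_block_bit_offsets` is a name invented here). It only locates block starts by scanning for the 48-bit block marker at every bit offset, since blocks are not byte-aligned; actually decoding from one of those offsets is the harder part and is not shown.

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # 48-bit marker at the start of every bzip2 block
MASK48 = (1 << 48) - 1


def find_block_bit_offsets(data: bytes) -> list:
    """Return the bit offset of every block-start marker in a bzip2 stream."""
    offsets = []
    window = 0  # sliding window holding the last 48 bits read
    for i, byte in enumerate(data):
        for bit in range(7, -1, -1):  # bzip2 packs bits most-significant first
            window = ((window << 1) | ((byte >> bit) & 1)) & MASK48
            bits_seen = i * 8 + (7 - bit) + 1
            if bits_seen >= 48 and window == BLOCK_MAGIC:
                offsets.append(bits_seen - 48)
    return offsets
```

A real index would also record which article IDs fall in which block, as the slide describes. A plain scan like this can in principle hit a false positive inside compressed data, so treat the offsets as candidates.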
This is back when the archive was ~7GB IIRC. My Kiwix dumps today are ~120GB, but that includes images.
This is the link to the presentation in Google Slides that we wrote back in 2008 or so. The version history shows 2013, but I think some kind of import/conversion happened around that time.
https://docs.google.com/presentation/d/e/2PACX-1vTfrxEqvHbd0...
Comment by dgl 3 days ago
The more interesting trick you can do with zip files for HTTP serving is to serve the compressed deflate stream as gzip, or use Zstd inside zip. Then you have a valid zip file from which bytes can be served directly.
I have some code which does this at https://git.sr.ht/~dgl/deserve/
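A minimal Python sketch of the first trick, for illustration only (this is not the code from the linked repo, and `zip_member_as_gzip` is a hypothetical helper): the raw deflate bytes of a zip member are copied out untouched and wrapped in a gzip header and trailer, reusing the CRC-32 and size that the zip already stores.

```python
import gzip
import io
import struct
import zipfile


def zip_member_as_gzip(archive: bytes, name: str) -> bytes:
    """Wrap a deflate-compressed zip member in gzip framing, no recompression."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_DEFLATED:
        raise ValueError("member is not deflate-compressed")

    # Local file header: 30 fixed bytes, then the file name and extra field.
    # Their lengths are little-endian u16 values at offsets 26 and 28.
    header = archive[info.header_offset:info.header_offset + 30]
    if header[:4] != b"PK\x03\x04":
        raise ValueError("bad local file header")
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    start = info.header_offset + 30 + name_len + extra_len
    raw_deflate = archive[start:start + info.compress_size]

    # gzip header: magic, CM=8 (deflate), no flags, mtime=0, XFL=0, OS=255.
    gzip_header = b"\x1f\x8b\x08\x00" + b"\x00\x00\x00\x00" + b"\x00\xff"
    # gzip trailer: CRC-32 and uncompressed size mod 2**32, little-endian.
    trailer = struct.pack("<II", info.CRC, info.file_size & 0xFFFFFFFF)
    return gzip_header + raw_deflate + trailer
```

A server doing this would send the result with `Content-Encoding: gzip`. In practice the header and trailer are 18 bytes in total, so the deflate payload can be streamed straight from the archive between them.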
Comment by hackrmn 3 days ago
Comment by Terretta 3 days ago
The whole site is a single tar file. zeroserve indexes it on load - building a path -> byte-range map - and then serves files by issuing byte-range reads against the tarball itself. Nothing is ever unpacked to disk. The site lives entirely in that one file, so there's no document root for a stray location rule to expose, and a deploy is a single atomic file swap.
OTOH, that could be an LLM justification, since the copy is littered with -isms like "the right shape" or "the surface is broad".
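For what it's worth, the indexing step in the quoted paragraph is easy to sketch. This is a guess at the shape of it in Python, not zeroserve's actual Rust implementation; `build_index` and `read_file` are names made up for the example, and it assumes an uncompressed tar.

```python
import io
import tarfile


def build_index(fileobj) -> dict:
    """Map each regular file's path to (data offset, size) inside the tarball."""
    index = {}
    # Mode "r:" accepts only uncompressed tar, which is what makes
    # byte offsets into the archive meaningful.
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index


def read_file(fileobj, index: dict, path: str) -> bytes:
    """Serve one file with a single byte-range read; nothing is unpacked."""
    offset, size = index[path]
    fileobj.seek(offset)
    return fileobj.read(size)
```

The index is built once at load time; after that each request is a dictionary lookup plus one ranged read against the archive.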
Comment by rashkov 3 days ago
Comment by ksec 3 days ago
Unfortunately, Caddy seems less concerned about this.
Zeroserve already beats Nginx in performance. Hopefully it will someday catch up to Caddy's features.
Comment by b112 3 days ago
You don't serve up a bazillion js files and care about latency. You also don't serve up files from all over the web (fonts from google, jquery or whatever from their site) unless you don't care about having control over your own latency.
A static HTML page renders in under 20ms for me these days, if the site is near. Some of these pages with immense blather of js take > 10 seconds to fully download and render. So... in that world, who cares if it's 5 seconds or 6 seconds?
Comment by lost9 3 days ago
Comment by z3ratul163071 3 days ago
Comment by Lapsa 3 days ago
Comment by MagicMoonlight 3 days ago
Comment by rpdillon 3 days ago
Is this founded? I know it's popular to trash every new project that's built with AI in the pipeline somewhere, but there's a big difference between a project built by someone with years of experience writing software and someone vibing a repo with Claude. Isn't it worth distinguishing between the two?
In the case of this project: a non-technical person couldn't even conceive of this product. So I think some kind of evidence should be presented before we say things like "Is it a logical design? Who knows."
EDIT: Taking a look at the source, it looks very good. https://github.com/losfair/zeroserve/blob/main/src/server.rs is an important part of the functionality, and it looks like it had a fair amount of attention paid to the implementation, including multiple refactors.
Comment by jhack 3 days ago