I dug into the Postgres sources to write my own WAL receiver
Posted by alzhi7 3 days ago
Comments
Comment by smj-edison 2 days ago
I eventually did end up getting Jimtcl to be multithread-safe, but it ended up being slower than the naive approach of serializing and deserializing between threads. I've been seriously nerd-sniped since, and have slowly been building my own thread safe interpreter, but I still have to cross check with Jimtcl constantly.
Comment by alzhi7 1 day ago
Comment by JSR_FDED 2 days ago
Given all the corner cases he describes, it seems like a good example of something you would never ever want to vibe code.
Comment by pierrekin 2 days ago
I would absolutely still bring a coding agent with me for a project like this, but I would be in the mindset of “I need to understand and be familiar with every line” rather than say, every function signature or every service behaviour.
So it is almost like vibe coding but the abstraction level is lower?
The question I’ve been asking myself recently is whether the act of thinking through the code from scratch is somehow more valuable than the potential benefit of letting that mechanical part be handled by something else, be it another human or an agent.
To be specific, I’m referring to prompts like “next, add a for loop which iterates over the elements in the array with an enumerated index, then call our function $func by reference for each element”, or “Is there a more idiomatic way of doing this in $lang?”, etc.
This has the advantage to me of letting me code in languages whose syntax I don’t know or have forgotten, but I’m not sure yet whether this trades a short-term gain for a long-term cost.
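For what it's worth, a prompt like the one above maps to only a couple of lines in most languages. A sketch in Python (function and variable names are illustrative, not from the thread):

```python
def apply_with_index(items, func):
    """Call func(index, element) for each element of items.

    In Python, "passing $func by reference" is just passing the
    function object itself; no pointer syntax is needed.
    """
    for i, item in enumerate(items):
        func(i, item)


def apply_collect(items, func):
    """The more idiomatic variant when you want the results collected:
    a list comprehension over enumerate()."""
    return [func(i, item) for i, item in enumerate(items)]
```

The “more idiomatic way” answer differs per language, which is arguably where the prompt earns its keep: in Go you'd use `for i, item := range items`, in Rust `.iter().enumerate()`.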
Comment by globular-toast 2 days ago
When you're fluent in a programming language it's quicker to just type that directly in said programming language...
Instead you're training yourself to be able to say this stuff in English which will never be as powerful.
Comment by cyberpunk 2 days ago
Comment by globular-toast 2 days ago
The example I replied to was more the nuts and bolts of programming. It's the thing you're doing 90% of the time. Changes here have a big impact. That's why for almost my entire career I've had expandable "snippets" in my editor to automatically expand, say, "for" to a for loop skeleton where I fill in the variable names. It's like using an electric screwdriver. You don't lose touch with the screws, it just saves you time and effort.
Typing the entire for loop into an LLM in pseudocode actually seems like a regression compared to that. You don't save any typing. But you lose the ability to work independently. You become dependent on a paid subscription and/or powerful hardware just to do what I've been able to do with a keyboard and hardware you can find for free.
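To make the comparison concrete, an expandable snippet of the kind described might look like this, using VS Code's snippet format as one example (the commenter's actual editor and snippet definitions are unknown; the body shown is a generic JavaScript-style skeleton):

```json
{
  "For loop": {
    "prefix": "for",
    "body": [
      "for (let ${1:i} = 0; ${1:i} < ${2:array}.length; ${1:i}++) {",
      "\t$0",
      "}"
    ],
    "description": "Expand 'for' into a loop skeleton with tab stops"
  }
}
```

Typing `for` and hitting Tab drops you into the `${1:i}` placeholder, then `${2:array}`, then the body — deterministic, offline, and free.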
It's similar to writing a letter to someone and having it translated into French when the reader understands English. Why would you do that?
This changes if you go higher level, of course. This is the temptation that LLMs give us. First it's a for loop, then it's an entire class, then entire modules, then it's only a small step to "vibe coding". What we're still figuring out is where this is actually a benefit. Where can we save effort without compromises? I don't think it's typing out code in English, and I don't think it's vibe coding either. Is there something in between? It's too soon to tell.
Comment by zhainya 2 days ago
Comment by victorbjorklund 2 days ago
Comment by j45 2 days ago
Comment by samokhvalov 2 days ago
I wonder if you considered WAL-G, which is also written in Go and has this: https://github.com/wal-g/wal-g/blob/master/docs/PostgreSQL.m...
Comment by alzhi7 1 day ago
Yes, I know about this tool, it's great. I watched videos about how it was developed, what difficulties there were in achieving delta backups, and how the developers also spent a ton of time studying the PostgreSQL source code. And I studied the WAL-G source code myself. I just never had to use it at work, since I was used to pgBackRest and, a bit later, to Barman. WAL-G focuses on the cloud and on universality (i.e., it's not only used for PG, but has a unified interface for many different storage systems).
Initially, I didn't even have the idea of making a complete, reliable tool. Over time, I started striving toward exactly that. When there was an available hypervisor at work, I set up k8s there and ran my receiver for several dev databases, just to test its operation 24/7, setting aggressive config parameters (frequent compression, unloading, cleanup, frequent backups, etc.). At the same time, I was choosing not small databases, but quite real production ones, with various nightly integrations for data population (external APIs, Airflow, and all that), blobs/tablespaces.
And of course I've read your articles and watched a lot of videos.