Your job is to deliver code you have proven to work
Posted by simonw 12 hours ago
Comments
Comment by endorphine 11 hours ago
It's even worse than that: non-junior devs are doing it as well.
Comment by snarf21 6 hours ago
Comment by colechristensen 4 minutes ago
Comment by abustamam 4 hours ago
Comment by dejj 3 hours ago
Comment by tremon 3 hours ago
Comment by rTX5CMRXIfFG 1 hour ago
Comment by teaearlgraycold 4 hours ago
Comment by snarf21 4 hours ago
Comment by abustamam 4 hours ago
Comment by xnx 7 hours ago
Comment by theshrike79 7 hours ago
There's a difference between 10 years of experience and 1 year of experience 10 times.
YOE isn't always a measurement of quality, you can work the same dead-end coding job for 10 years and never get more than "1 year" of actual experience.
Comment by stephen_cagle 6 hours ago
I'm not really arguing anything here, but it is interesting that we value breadth over (hopefully) depth/mastery of a specific thing in regards to what we view as "Senior" in software.
Comment by lokar 6 hours ago
This is what that saying is about
Comment by stephen_cagle 6 hours ago
Comment by 3acctforcom 4 hours ago
Situational Leadership gets into this. You want a really efficient McDonalds worker who follows the established procedure to make a Big Mac. You also want a really creative designer to build your Big Mac marketing campaign. Your job as a manager is figuring out which you need, and fitting the right person into the right job.
Comment by abustamam 4 hours ago
I think the concept of Full-stack dev is fine, but expecting them to know each part of the stack deeply isn't feasible imo.
Comment by theshrike79 3 hours ago
BUT they're completely wasted if you just use them to turn JIRA tickets into end to end features =)
Comment by theshrike79 3 hours ago
There is the one doctor who learned one way to do the operation at school, with specific instruments, sutures etc. and uses that for 1000 surgeries.
And then there's the curious one who actively goes to conferences, reads publications and learns new better ways to do the same operation with invisible sutures that don't leave a scar or tools that are allow for more efficient operations, cutting down the time required for the patient to be under anaesthesia.
Which one would you hire for your hospital for the next 25 years?
Comment by WalterBright 3 hours ago
Comment by gopher_space 5 hours ago
Didn’t Bruce Lee famously say he fears the man who’s authored one API in ten thousand different contexts?
Comment by cindyllm 3 hours ago
Comment by nijave 44 minutes ago
Comment by NumberCruncher 3 hours ago
Comment by ChrisMarshallNY 5 hours ago
It's humbling, but I do tend to pick up a lot of stuff.
https://littlegreenviper.com/miscellany/thats-not-what-ships...
Comment by theshrike79 3 hours ago
Comment by ChrisMarshallNY 3 hours ago
It will usually home right in on the bug, or will give me a good starting point.
It's also really good at letting me know if this behavior is a "commonly encountered" one, with a summary of ways it's addressed.
I've probably done that at least a dozen times, today. I guess I'm a rotten programmer.
Comment by theshrike79 3 hours ago
Then I wait, look through the plan and tell it to implement and go do something else.
After a while I check the diffs and go "huh, yea, that's how I would've done it too", commit and push.
Comment by code_for_monkey 5 hours ago
Comment by theshrike79 3 hours ago
I connect to a 3rd party API with shitty specs and inconsistent output that doesn't follow even their spec, swear a bit and adjust my estimates[0]. Do some business stuff with it and shove it to another API.
But I've done that now in ... six maybe seven different languages and a few different frameworks on top of that. And because both sides of the API tend to be a bit shit, there's a lot of experience in defensive coding and verification - as well as writing really polite but pointed Corporate Emails that boil down to "it's your shit that's broken, not ours, you fix it".
At this point I really don't care what language I have to use, as long as it isn't Java (which I've heard has come far in the last decade, but old traumas and all that =).
[0] best one yet is the Swedish "standard" for electricity consumption reports, pretty much every field is optional because they couldn't decide and wanted to please every company in on the project. Now write a parser for that please.
Comment by joquarky 7 hours ago
Comment by theshrike79 3 hours ago
You get a feel for what works and what doesn't, provided you know the relevant facts. Doing a 10RPS system is completely different than 300RPS. And if the payload is 1kB the problems aren't the same as with the one with a 10MB payload.
And if (when) you're using a cloud environment, which one is cheaper, large data or RPS? It's not always intuitive. We just had our AWS reps do a Tim "The Toolman" Taylor "HUUH?!" when we explained that the way our software works is 95% cheaper to run using S3 as the storage rather than DynamoDB :D
Comment by teaearlgraycold 4 hours ago
Comment by BadCookie 6 hours ago
I am afraid that this “1 year of experience 10 times” mantra gets trotted out to justify ageism more often than not.
Comment by theshrike79 3 hours ago
Not all people are curious, they go to school, learn to code and work their job like a normal 9-5 blue collar worker. They go to company trainings, but they don't read Hacker News, don't follow the latest language fads or do personal software projects during nights and weekends. It's just a day job for them that pays for their non-programming hobbies.
I've had colleagues who managed the same ASP+Access DB system for almost a decade, with zero curiosity or interest to learn anything that wasn't absolutely necessary.
We had to drag them to the ASP.NET age, one just wouldn't and stayed back managing the legacy version until all clients had moved to the new stack.
...and I just checked LinkedIn, the non-curious ones are still in the same company, managing the same piece of SaaS as a Software Developer. 20-26 years in the same company, straight from school.
Comment by 3acctforcom 4 hours ago
Comment by abustamam 4 hours ago
My career trajectory is wild. At this rate I'll be CTO soon, then back to mid-level.
Comment by SoftTalker 7 hours ago
Comment by tensor 7 hours ago
Comment by CodeMage 7 hours ago
I can understand not wanting to let people stay in a junior position forever, but I've seen this taken to a ridiculous extreme, where the ladder starts at a junior level, then goes through intermediate and senior to settle on staff engineer as the first "terminal" position.
Someone should explain to the people who dream up these policies that the Peter Principle is not something we should aim for.
It's even worse when you combine this with age. I'm nearing 47 years old now and have 26 years of professional experience, and I'm not just tired, but exhausted by the relentless push to make me go higher up on the ladder. Let me settle down where I'm at my most competent and let me build shit instead of going to interminable meetings to figure out what we want to build and who should be responsible for it. I'm old enough to remember the time when managers were at least expected to be more useful in that regard.
Comment by lokar 6 hours ago
And honestly, this will depend on the environment and kind of work being done.
Comment by SoftTalker 6 hours ago
Of course the pay won't be great, but the benefits are decent, PTO is usually excellent, and the work environment usually very low stress.
Comment by CodeMage 5 hours ago
That said, there's something deeply wrong with our industry if that's the way we expect things to work. I never felt that teaching was my calling, but I might end up being forced into it anyway and taking up a job that someone with proper passion and vocation could fill. Why? Because my own industry doesn't understand that unlimited growth is not sustainable.
For that matter, "growth" is not the right word, either. We're all being told that scaling the ladder is the same thing as growing and developing, but it's not.
Comment by lokar 4 hours ago
Comment by CodeMage 3 hours ago
That said, there is an expectation of unlimited growth and it comes from a different source: ageism. At my age, the implicit expectation is that I will apply for a staff or even principal role. Applying for a "merely" senior role often rings alarm bells.
That trend -- and certain others -- are what's making me consider taking up teaching instead.
Comment by adobesubmarine 1 hour ago
Comment by lokar 5 hours ago
The point of the terminal level rule is that there is a point, bellow which you are not actually contributing all that much more in output then it takes to supervise and mentor you. At some point you need to be clearly net positive. This generally means you can mostly operate on your own.
If it becomes clear you won't make it to that level, then something is wrong. Either you are not capable, or not willing to make the effort, or something else. Regardless, you get forced out.
Comment by theshrike79 6 hours ago
And according to the company experience chart, they should've been a "thought leader" and "able to instruct senior engineers"
My title? Backend Programmer (20 years of experience). Our unit didn't care about titles because there was a "budget" for title upgrades per business unit and guess which team grabbed all of them =)
Comment by geodel 5 hours ago
Since they bring a certain cluelessness and ignorance as honor to whole orgs actual technical expertise among engineers could be detriment to one's jobs and career.
Comment by tguvot 3 hours ago
in last week I resolved a few legal/regulatory problems that could have cost company tens of millions of dollars in fines/direct spend/resources and prevented few backend teams from rolling out functionality that could have negative impact on stability/security/performance. I did offer them alternative ways to implement whatever they needed and they accepted it
Comment by xnx 6 hours ago
It began with "software engineer"
Comment by jmpeax 6 hours ago
Comment by jghn 7 hours ago
Comment by lokar 5 hours ago
One such period seems to have ended sometime around the start of Covid, or a bit before.
Comment by 9rx 6 hours ago
There is nothing left. Not everyone puts in the same dedication towards the craft, of course. It very well might take someone 30 years to reach "principle" (and maybe even never). But 5 years to have "seen it all" is more than reasonable for someone who has a keen interest in what they are doing. It is not like a job dependent on the season, where you only get one each year. In computing, you can see many different scenarios play out in milliseconds. It doesn't need years to go from no experience to having "seen it all".
That is why many in this industry seek management roles as a next step. It opens a new place to find scenarios one has never seen before; to get the start the process all over again.
Comment by Yoric 6 hours ago
I've been programming since I was 7 and I'm old enough to remember the previous AI summer. Somewhere along the way, I've had impact on a few technologies you have heard of, I've coded at almost all levels from (some very specialized) hardware to Prolog, Idris and Coq/Rocq, with large doses of mainstream languages in-between, and I don't think I'll ever be close to having seen in all.
If anyone tells me that they've seen it all in 5 years, I'm going to suspect them of not paying attention.
Comment by andrewaylett 5 hours ago
I've seen a lot. But the more I see, the more I find to see.
Comment by 9rx 5 hours ago
If your job is dependent on the weather, one year might be rainy, one year might be drought, one year might be a flood, etc. You need to see them to understand them. But eventually you don't have to need to see the year where it is exceptionally rainy, but not to the point of flood, to be able to make good decisions around it. You can take what you learned in the earlier not-quite-so rainy year and what you learned during the flood year and extrapolate from that what the exceptionally rainy year entails. That is what levels up someone.
Much the same is true in software. For example, once you write a (well-written) automated test in Javascript and perhaps create something in Typescript, you also have a pretty good understanding of what Rocq is trying to do well enough to determine when it would be appropriate for you to use. It would no doubt take much, much longer to understand all of its minutia, but it is not knowledge of intimate details that "senior", "principle", etc. is looking for. It is about being able to draw on past experience to make well-reasoned choices going forward.
Comment by Yoric 4 hours ago
You need a very different mindset to write in JS (or TS), in Rust, in Rocq, in Esterel or on a Quantum Computer. You need a very different mindset when coding tools that will be deployed on embedded devices, on user's desktops, in the Linux kernel, on a web backend or in a compiler. You need a very different mindset when dealing with open-source enthusiasts, untrusted users, defense contractors.
You might be able to have "seen it all" in a tiny corner of tech, but if you stop there, I read it as meaning that you don't have enough curiosity to leave your comfort zone.
It's fine, you don't really have to if you don't want to.
Comment by 9rx 3 hours ago
"Senior", "principle", etc. are not about your ability to write. They speak to one's capacity to make decisions. A "junior" has absolutely no clue when to use JS, Rust, or Rocq, or if code should be written at all. But someone who has written (well-written) tests in JS, and maybe written some types in Typescript, now has some concept of verification and can start to recognize some of the tradeoffs in the different approaches. With that past experience in hand, they can begin to consider if the new project in front of them needs Rocq, Dafny, or if Javascript will do. Couple that with other types of experiences to draw from and you can move beyond being considered a "junior".
> You might be able to have "seen it all" in a tiny corner of tech
Of course there being a corner of some sort is a given. We already talked about management being a different corner, for example. Having absolutely no experience designing a PCB is not going to keep you a "junior" at a place developing CRUD web apps. Obviously nobody is talking about "seeing it all" as being about everything in the entire universe. There aren't that many different patterns, really, though. As the terms are used, you absolutely can "see it all", and when you don't have to wait around for the season to return next year, you can "see it all" quite quickly.
Comment by neighbour 3 hours ago
For example, I work in Data & AI and we have:
- data engineer
- analytics engineer
- data scientist
- AI engineer
What I don't know is what's the alternative?
Data Engineers are basically software developers.
Analytics Engineers were Data Analysts or BI Analysts but the job has changed so much that neither of those titles fit.
My opinion is that basically everyone should just be a "Developer" or "Programmer" and then have the area suffixed:
- Data Engineer → Developer (Data Infrastructure)
- Analytics Engineer → Developer (Analytics)
etc.
Comment by tunesmith 2 hours ago
Comment by xyzzy_plugh 2 hours ago
Why?
> and part of the process toward encouraging them to use LLM as a tool for their work.
Did you look at it from their perspective? You set the exact opposite example and serve as a perfect example for TFA: you did not deliver code you have proven to work. I imagine some would find this demoralizing.
I've worked with a lot of director-level software folk and many would just do the work. If they're not going to do the work, then they should probably assign someone to do it.
What if it didn't work? What if you just wasted a bunch of engineering time reviewing slop? I don't comprehend this mindset. If you're supposedly a leader, then lead.
Comment by duxup 6 hours ago
I always think of the "superstars" or "10x" devs I have met at companies. Yeah I could put out a lot of features too if I could bypass all the rules and just puke out code / greenfield code that accounts for the initial one use case ... (and sometimes even leave the rest to other folks to clean up).
Comment by beautiful_zhixu 3 hours ago
Comment by analog31 10 hours ago
Comment by tyrust 9 hours ago
Perhaps a spicy patch would involve some kind of meeting. Or maybe in a mentor/mentee situation where you'd want high-bandwidth communication.
Comment by jopsen 9 hours ago
It's probably also fairly expensive to do.
Comment by jghn 8 hours ago
There are pros & cons to both sides. As you point out it's quite expensive in terms of time to do the in person style. Getting several people together is a big hassle. I've found that the code reviews themselves, and what people get out of them, are wildly different though. In person code reviews have been much more holistic in my experience, sometimes bordering on bigger picture planning. And much better as a learning tool for other people involved. Whereas the diff style online code review tends to be more focused on the immediate concerns.
There's not a right or wrong answer between those tradeoffs, but people need to realize they're not the same thing.
Comment by stephen_cagle 6 hours ago
It also increases the coverage area of code that each developer is at least somewhat familiar with.
On a side note, I would love if the default was for these code reviews to be recorded. That way 2 years later when I am asked to modify some module that no one has touched in that span, I could at least watch the code review and gleem something about how/why this was architect-ed the way it was.
Comment by lokar 6 hours ago
Comment by colinb 8 hours ago
Comment by comprev 9 hours ago
Comment by throwaway314155 8 hours ago
Comment by SoftTalker 7 hours ago
Comment by necovek 7 hours ago
I was in a team in 2006 where we did the regular, 2-approve-code-reviews-per-change-proposal (along with fully integrated CI/CD, some of it through signed email but not full diffs like Linux patchsets, but only "commands" what branch to merge where).
Comment by marwamc 4 hours ago
Comment by SoftTalker 6 hours ago
Comment by ok_dad 7 hours ago
Comment by marwamc 4 hours ago
I found these help ground the mentorship and discussions between junior-senior devs. And so even for the enterprising aka proactive junior devs who might start working on something in advance of plans/roadmaps, by the time they present that work for review, if the work followed org architectural and design patterns, the review and acceptance process flows smoothly.
In my juinior days I was taught: if the org doesn't have a design or architectural SOP for the thing you're doing, find a couple of respectable RFCs from the internet, pick the three you like, and implement one. It's so much easier to stand on the shoulders of giants than to try and be the giant yourself.
Comment by stuaxo 9 hours ago
Comment by DrewADesign 9 hours ago
Comment by rootusrootus 9 hours ago
That solves some of the problem with people thinking it's okay to fire off a huge AI slop PR and make it the reviewer's responsibility to see how much the LLM hallucinated. No, you have to look at yourself first, because it's YOUR code no matter what tool you used to help write it.
Comment by muzzio 7 hours ago
It makes it doubly annoying though whenever I go digging in `git blame` to find a commit with a terrible title, no description and an "LGTM" approval though.
Comment by unbalancedevh 8 hours ago
I'm having a hard time imagining the alternative. Do junior developers not take any pride in their work? I want to be sure my code works before I submit it for review. It's embarrassing to me if it fails basic requirements. And as a reviewer, what I want to see more than anything is how the developer assessed that their code works. I don't want to dig into the code unless I need to -- show me the validation and results, and convince me why I should approve it.
I've seen plenty of examples of developers who don't know how to effectively validate their work, or document the validation. But that's different than no validation effort at all.
Comment by rootusrootus 8 hours ago
Yes. I have lost count of the number of PRs that have come to me where the developer added random blank lines and deleted others from code that was not even in the file they were supposed to be working in.
I'm with you -- I review my own PRs just to make sure I didn't inadvertently include something that would make me look sloppy. I smoke test it, I write comments explaining the rationale, etc. But one of my core personality traits (mostly causing me pain, but useful in this instance) is how much I loathe being wrong, especially for silly reasons. Some people are very comfortable with just throwing stuff at the wall to see if it'll stick.
Comment by alfons_foobar 7 hours ago
Maybe some kind of auto-formatter?
Comment by rootusrootus 6 hours ago
In any case, just looking over your own PR briefly before submitting it catches these quickly. The lack of attention to detail is the part I find more frustrating than the actual unnecessary format changes.
Comment by ok_dad 7 hours ago
That’s not a great example of lack of care, of you use code formatters then this can happen very easily and be overlooked in a big change. It’s also really low stakes, I’m frankly concerned that you care so much about this that you’d label a dev careless over it. I’d label someone careless who didn’t test every branch of their code and left a nil pointer error or something, but missing formatter changes seems like a very human mistake for someone who was still careful about the actual code they wrote.
Comment by hoten 7 hours ago
Comment by code_for_monkey 5 hours ago
Comment by epiccoleman 6 hours ago
No kidding. I mean, "it works" is table stakes, to the point I can't even imagine going to review without having tested things locally at least to be confident in my changes. The self-review for me is to force me to digest my whole patch and make sure I haven't left a bunch of TODO comments or sloppy POC code in the branch. I'd be embarrassed to get caught leaving commented code in my branch - I'd be mortified if somehow I submitted a PR that just straight up didn't work.
Comment by jjmarr 8 hours ago
Their goal is to pass the hot potato to someone else, so they can say in the standup "oh I'm waiting on review" making it not their problem.
Comment by lokar 6 hours ago
Things like real review as an important part of the work requires a culture that values it.
Comment by theshrike79 6 hours ago
It catches the worst slop in the first pass easily, as well as typos etc.
Comment by groby_b 7 hours ago
Comment by SoftTalker 7 hours ago
Comment by hnthrow0287345 10 hours ago
This might be unpopular, but that is seeming more like an opportunity if we want to continue allowing AI to generate code.
One of the annoying things engineers have to deal with is stopping whatever they're doing and doing a review. Obviously this gets worse if more total code is being produced.
We could eliminate that interruption by having someone doing more thorough code reviews, full-time. Someone who is not being bound by sprint deadlines and tempted to gloss over reviews to get back to their own work. Someone who has time to pull down the branch and actually run the code and lightly test things from an engineer's perspective so QA doesn't hit super obvious issues. They can also be the gatekeeper for code quality and PR quality.
Comment by marcosdumay 10 hours ago
This is not the first time somebody had that idea.
Comment by JohnBooty 9 hours ago
That would definitely solve the "code reviewer loses touch with reality" issue.
Whether it would be a net reduction in disruption, I don't know.
Comment by necovek 6 hours ago
In general, back in 2000s, a team I was on employed a simple rule to ensure reviews happen in a timely manner: once you ask for a review, you have an obligation to do 2 reviews (as we required 2 approvals on every change).
The biggest problem was when there wasn't stuff to review, so you carried "debt" over, and some never repaid it. But with a team of 15-30 people, it worked surprisingly well: no interrupts, quick response times.
It did require writing good change descriptions along with testing instructions. We also introduced diff size limits to encourage iterative development and small context when reviewing (as obviously not all 15-30 people had same deep knowledge of all the areas).
Comment by bee_rider 9 hours ago
Comment by kaffekaka 8 hours ago
Comment by jaggederest 9 hours ago
Comment by kragen 8 hours ago
Comment by mywittyname 3 hours ago
In reality, this ends up being the job given to the weakest person on the team to keep them occupied. And it gives the rest of the team a mechanism to get away with shoddy work and not face repercussions.
Maybe I'm just jaded, but I think this approach would have horrible results.
AI code review tools are already good. That makes for a good first pass. On my team, fixing Code Rabbit's issues, or having a good reason not to is always step 1 to a PR. It catches a lot of really subtle bugs.
Comment by sorokod 10 hours ago
I would have thought that reviewing PRs and doing it well is in the job description. You latter mention "someone" a few times - who that someone might be?
Comment by bee_rider 9 hours ago
“You are a cranky senior software engineer who loves to nitpick change requests. Here are your coding standards. You only sign off of a change after you are sure it works; if you run out of compute credits before you can prove it to yourself, reject the change as too complex.”
Balance things, pit the LLMs against each other.
Comment by postflopclarity 8 hours ago
Comment by osn9363739 4 hours ago
Comment by bee_rider 40 minutes ago
What sort of thing does it find? Bad smells (possibly known imperfections but least-bad-picks), bugs (maybe triaged), or violations of the coding guides (maybe known and waivered)?
I wonder if there’s a need for something like a RAG of known issues…
Comment by mywittyname 3 hours ago
I do agree that the tool we use (code rabbit) is a little too nitpicky, but it's right way more than it's wrong.
Comment by jjmarr 8 hours ago
Comment by cm2012 9 hours ago
Comment by gottagocode 9 hours ago
This is effectively my role (outside of mentoring) as a lead developer over a team of juniors we train in house. I'm not sure many engineers would enjoy a day of only reviewing, me included.
Comment by jms703 10 hours ago
Code reviews are a part of the job. Even at the junior level, an engineer should be able to figure out a reasonable time to take a break and shift efforts for a bit to handle things like code reviews.
Comment by shiandow 8 hours ago
Thr only way to be 100% sure is writing it myself. If I know some one reasonable managed to write the code I can usually take some shortcuts and only look at the code style, common gotchas etc.
Of course it wouldn't be the first time I made some erroneous assumptions about how well considered the code was. But if none of the code is the product of any intelligent thought well, I might as well stop reading and start writing. Reading code is 10x harder than writing it after all.
Comment by Spoom 8 hours ago
Comment by immibis 10 hours ago
Comment by nunez 8 hours ago
Comment by ohwaitnvm 9 hours ago
Comment by sodapopcan 9 hours ago
Comment by gaigalas 9 hours ago
In practice, you should have at least one independent reviewer who did not actively worked on the PR.
That reviewer should also download the entire code, run it, make tests fail and so on.
In my experience, it's also good that this is not a fixed role "the reviewer", and a responsability everyone in the team shares (your next task should always be: review someone else's work, only pick a new thing to do if there is nothing to review).
This practice increases quality dramatically.
Comment by sodapopcan 8 hours ago
Yes it does. There are many ways to do things, of course, and you can institute that there must be an independent reviewer, but I see this is a colossal waste of time and takes away one of the many benefits of pairing. Switch pairs frequently, and by frequently I really mean "daily," and there is no need for review. This also covers "no fixed responsibilities" you mentioned (which I absolutely agree with).
Again, there are no rules for how things must be done, but this is my experience of three straight years working this way and it was highly effective.
Comment by gaigalas 7 hours ago
Excited (or maybe even stubborn) developers can often win their pairs by exhaustion, leading to "whatever you want" low effort contributions.
Pairs tend to under-document. They share an understanding they developed during the pairing session and forget to add important information or details to the PR or documentation channels.
I'm glad it has been working for you. Maybe you work in a stellar team that doesn't have those issues. However, there are many scenarios that benefit a lot from an independent reviewer.
Comment by aslakhellesoy 9 hours ago
Comment by Aurornis 10 hours ago
Pretending it’s just the kids and young people doing the bad thing makes the outrage easier to sell to adults.
Comment by jennyholzer2 10 hours ago
Comment by reedf1 11 hours ago
Comment by esafak 10 hours ago
Comment by rvz 10 hours ago
A great time to be a vibe coding cleanup specialist (i.e, professional security software engineer)
Comment by snowstormsun 10 hours ago
Comment by marcosdumay 10 hours ago
Vote with your wallet.
Comment by pydry 10 hours ago
Comment by vernrVingingIt 10 hours ago
Developers have created too many layers of abstraction and indirection to do their jobs. We're burning a ton of energy managing state management frameworks, that are many layers of indirection away from the computations that are salient to users.
All those DSLs, config syntaxes, layers of boilerplate waste a huge amount of electricity, when end users want to draw geometric shapes.
So a non-dev generates a mess, but in a way so do devs with Django and Elixir, RoR, Terraform. When really end of the day it's matrix math against memory and sync of that state to the display.
From a hardware engineers perspective, the mess of devs and non-devs is the same abstract mess of electrical states that have nothing to do with the goal. All those frameworks can be generalized into a handful of electrical patterns, saving a ton of electricity.
Comment by rcbdev 10 hours ago
Trying to create a secure, reliable and scalable system that enables many people to work on one code base, share their code around with others and at the end of the day coordinate this dance of electrons across multiple computers, that's where all of these 'useless' layers of abstraction become absolutely necessary.
Comment by vernrVingingIt 10 hours ago
I know exactly what those layers of abstraction are used for. Why so many? Jobs making layers of abstraction.
But all of them are dev friendly means of modeling memory states for the CPU to watch and transform just so. They can all be compressed into a generic and generalized set of mathematical functions ridding ourselves of the various parser rules to manage each bespoke syntax inherent to each DSL, layers of framework.
Comment by kridsdale3 9 hours ago
Go write an operating system and suite of apps with global memory and no protections. Why are we wasting so much time on abstractions like processes and objects? Just let let everyone read and write from the giant turing machine.
Comment by jodrellblank 5 hours ago
Aside: earlier this year Casey Muratori did a 2.5 hour conference talk on this topic - why we are using objects in the way they are implemented in C++ et al with class hierarchies and inheritance and objects representing individual entities? "The Big OOPs: anatomy of a 35 year mistake"[1].
He traces programming history back to Stroustrup learning from Simula and Kirstan Nygaard, back to C.A.R. Hoare's paper on records, back to the Algol 68 design committee, back to Douglas T. Ross's work in the 1950's. From Ross at MIT in 1960 to Ivan Sutherland working on Sketchpad at MIT in 1963, and both chains influencing Alan Kay and Smalltalk. Where the different ideas in OOP came from, how they came together through which programming languages, who was working on what, and when, and why. It's interesting.
Comment by jjmarr 8 hours ago
Abstraction layers are terrible when you need to understand 100% of the code at all times. Doesn't mean they're not useful.
Heck, the language for just implementing mathematical rules about system behaviour into code exists. It's called Matlab Simulink.
Comment by nec4b 7 hours ago
Comment by vernrVingingIt 5 hours ago
There is zero obligation to capture the concept of memory safety in traditional software notation. If it was possible to look inside the hardware at runtime no one is going to see Rust syntax.
At runtime it's more appropriate to think of it as geometric functions redrawing electrical state geometrically to avoid collisions. And that's where future chip and system state management are headed.
Away from arbitrary syntax constructs with computational expensive parsing rules of the past towards a more efficient geometric functions abstraction embedded in the machine.
Comment by jcgl 3 hours ago
What does it matter how it's "stored"? I think (hope?) that most SWEs know that that syntax and its semantic aren't how things work on the metal. Storage format of the syntax seems pretty irrelevant. And surely you're not suggesting that SWEs should be using a syntax and semantics that they...don't know.
So what's the better, non-traditional-software notation? Your conceptualization does sound genuinely intriguing.
However, it seems like it would by necessity be non-portable across architectures (or even architecture revisions). And I take it as given that portable software is a desirable thing.
Comment by vlowther 7 hours ago
Comment by kalleboo 1 hour ago
Comment by vernrVingingIt 7 hours ago
You all really think engineers at Samsung, nVidia, etc whose job it is to generalize software into mathematical models have not considered this?
We need a layer of abstraction, not Ruby, Python, Elixir, Rails, Perl, Linux, Windows, etc, ad nauseum, ad infinitum... each with unique and computationally expensive (energy wasting) parsing, serializing and deserializing rules.
Mitigation of climate change is a general concern for the species. Specific concerns of software developers who will die someday anyway get to take a back seat for a change.
Yes AI uses a lot of electricity but so does traditional SaaS.
Traditional SaaS will eventually be replaced with more efficient automated systems. We're in a transition period.
It's computationally efficient to just use geometry[1], which given enough memory, can be shaped to avoid collisions you are concerned with.
Your only real concern is obvious self selection driven social conservatism. "Don't disrupt me ...of all people... bro!"
[1] https://iopscience.iop.org/article/10.1088/1742-6596/2987/1/...
Comment by switchbak 7 hours ago
Let the downvotes commence!
Comment by nkohari 9 hours ago
This is a perfect example of Chesterson's Fence. Is it true that there are too many levels of abstraction, that YAML configuration files are a pain in the ass, and so on? Yes. But it's because this stuff was created organically, by thousands of people, over decades of time, and it isn't feasible to just start over from first principles.
I don't know enough about electrical engineering to speak to it (funny how that works!) but I'm sure there are plenty of cases in EE that just come down to "that's how it's been done forever".
Comment by vernrVingingIt 9 hours ago
Automation is making it pretty easy to generalize all the abstraction into math automatically to inform how to evolve the manufacturing process.
Using American principles against Americans, it would run afoul of American free speech and agency ideals to dictate chip makers only engage in speech and agency that benefits software engineers.
Was in the room 25 years ago being instructed to help offshore hardware manufacturing as it was realized keeping such knowledge and informed workers domestic posed an existential threat to copyright cartels and media censorship interests.
It's a long term goal that was set aside during the ZIRP era as everyone was happy making money hand over fist.
Guess you all should have paid more attention to politics than believe since it only exists as a socialized theory it isn't real and can safely be ignored.
Americans make up a small portion of the 8 billion humans, and software engineers are an even smaller percent of the population. Other nations have rebuilt since the US bombed them to hell. They're not beholden to kowtow to a minority of the overall population.
Would recommend you set aside thinking in abstract philosophy puzzles and relate to world via its real physical properties.
Comment by nec4b 7 hours ago
No, there are thousands of hardware libraries (HDLs, IP cores, Standard cell libs) which chip designers use. Hardly anyone builds chips from first principles. They are using same layers of abstractions as software does.
Comment by vernrVingingIt 6 hours ago
Of course they have not dumped their own methods.
Comment by repeekad 10 hours ago
I like that, I’ve also heard it referred to as “unearned wisdom”
Comment by vernrVingingIt 7 hours ago
We need a layer of abstraction not endless layers.
Nothing says unearned wisdom than script kiddies who intentionally had money thrown at them to reinforce belief their mastery of RoR CRUD app dev is genius beyond all comprehension. Zomg you know Linux admin? Here's $10 million dollars!
This thread is nothing but appeals to banal social conservatism. The disruptors on the verge of being the disrupted lashing out; wait, I was the job killer! Now you say my job is dead! So unfair!
Us hardware engineers been having a good laugh at SWEs easily manipulated the last 20 years by Wall Street hype of copy-paste SaaS products constantly reimplemented in the latest JS framework.
Throwing money at you all was intentional manipulation of primate biology. Juice your egos, get you to fall in line with desired agency control goals of the political and old money cohort.
Comment by switchbak 7 hours ago
Comment by vernrVingingIt 5 hours ago
I have more to do than refresh HN all day, going for brevity here.
Your expectation others must explicitly connect all the dots for you makes me question your grasp of reality. Most people alive are going about their lives unconcerned with your existence altogether.
"Highly charged politics". Relative emotional opinion.
Comment by vernrVingingIt 10 hours ago
Comment by iwontberude 10 hours ago
Comment by stephen_cagle 6 hours ago
Comment by whattheheckheck 10 hours ago
Comment by vernrVingingIt 10 hours ago
Started career in late 90s designing boards for telecom companies network backbones.
Comment by nanomonkey 10 hours ago
Boilerplate comes when your language doesn't have affordances, you get around this with /abstraction/ which leads to DSLs (Domain Specific Languages).
Matrix math is generally done on more than raw bits provided by digital circuits. Simple things like numbers require some amount of abstraction and indirection (pointers to memory addresses that begin arrays).
My point is yes, we've gotten ourselves in a complicated tar pit, but it's not because there wasn't a simpler solution lower in the stack.
Comment by TexanFeller 6 hours ago
Comment by rootusrootus 9 hours ago
As an upside, it helps with AI slop too. Because as I see it, what you're doing when you use an LLM is becoming a code reviewer. So you need to actually read the code and review it! If you have not reviewed it yourself first, I am not going to waste my time reviewing it for you.
It helps obviously that I'm on a small team of a half dozen developers and I'm the lead, and management hasn't even hinted at giving us stupid decrees like "now that you have Claude Code you can do 10x as many features!!!1!".
Comment by rcxdude 1 hour ago
Comment by tyleo 8 hours ago
Title only loosely tracks skill level and with AI, that may become even more true.
Comment by acedTrex 9 hours ago
Its when your PEERS do it that its a huge problem.
Comment by hinkley 6 hours ago
Comment by mullingitover 9 hours ago
The LLMs/agents have actually been doing a stellar job with code reviews. Frankly that’s one area that humans rush through, to the point it’s a running joke that the best way to get a PR granted a “lgtm” is to make it huge. I’ve almost never seen Copilot wave a PR through on the first attempt, but I usually see humans doing that.
Comment by distances 3 hours ago
I rarely see a PR that should pass without comments. Your team is being sloppy.
Comment by mullingitover 2 hours ago
I'm talking about a running joke in the industry, not my team.
Comment by lowkeyokay 10 hours ago
Comment by jjmarr 9 hours ago
Comment by jennyholzer2 8 hours ago
Ruthlessly bully LLM idiots until it becomes so embarrassing to use LLMs that no status-obsessed corporate executive would ever admit they spent years gleefully duped by hucksters selling "General AI"
Comment by cpursley 8 hours ago
Comment by Our_Benefactors 10 hours ago
Comment by rootusrootus 10 hours ago
Comment by lurking_swe 10 hours ago
If i was CTO I would not be happy to hear my engineers are spending lots of time re-writing and testing code written by product managers. Big nope.
Comment by theshrike79 6 hours ago
Comment by strangattractor 9 hours ago
Comment by bee_rider 9 hours ago
Comment by seanmcdirmid 9 hours ago
Comment by throwawaysleep 10 hours ago
Comment by Aurornis 10 hours ago
I don’t know about you, but I get paychecks twice a month for doing things included in my job description.
Comment by georgeburdell 10 hours ago
Now we have nightly builds that nobody checks the result of and we’re finding out about bugs weeks later. Big company btw
Comment by immibis 6 hours ago
Once you've said it's going to cause horrible problems, and they say do it anyway, and you have a paper trail of this and it's backed up onto your own storage medium, then you just do it and bring popcorn. If you think it'll bankrupt the company, then you have nothing to lose since you have no right to stop a company going bankrupt, so you might as well email your manager's manager's manager first and see if your manager gets fired.
Comment by 627467 10 hours ago
[Edit] man, people dont get /s unless its explicit
Comment by fragmede 10 hours ago
Comment by kridsdale3 9 hours ago
Comment by BurningFrog 10 hours ago
Comment by alphazard 9 hours ago
On the contrary, since more effort doesn't yield more money, but less effort can yield the same money, the strategy is to contract the time spent on work to the smallest amount, LLMs are currently the best way to do that.
I don't see why this has to be framed as a bad thing. Why should anyone care about the quality of software that they don't use? If you wouldn't work on it unless you were paid to, and you can leave if and when it becomes a problem, then why spend mental energy writing even a single line?
Comment by tuyiown 9 hours ago
The fact that this is lost as a common knowledge whereas shiny examples arises regularly is very telling.
But it is not liked in business because reproducing it requires competence in the industry, and finance deep pockets don’t believe in competence anymore.
Comment by phito 6 hours ago
Comment by hostyle 9 hours ago
Comment by Larrikin 9 hours ago
Comment by ThrowawayR2 7 hours ago
Comment by alphazard 8 hours ago
But if you are building it because doing so is in the long chain of cause and effect that leads to you being fed and having shelter, then you should minimize the amount of your time that is required to produce that end result. Do you get better food, and better shelter if the software is better? It would certainly be nice if that was the case, but it's not.
> Not everything is about money.
Except for your job, which is primarily about money. Making it take less time, means that you have more time to focus on things that really are not about money.
Comment by hostyle 8 hours ago
Comment by robgibbons 11 hours ago
From there, I include explicit steps for how to test, including manual testing, and unit test/E2E test commands. If it's something visual, I try to include at least a screenshot, or sometimes even a brief screen capture demonstrating the feature.
Really go out of your way to make the reviewer's life easier. One benefit of doing all of this is that in most cases, the reviewer won't need to reach out to ask simple questions. This also helps to enable more asynchronous workflows, or distributed teams in different time zones.
Comment by Hovertruck 11 hours ago
To be fair, copilot review is actually alright at catching these sorts of things. It remains a nice courtesy to extend to your reviewer.
Comment by Waterluvian 1 hour ago
I’ll put up a draft early and use it as a place to write and refine the PR details as I wrap up, make adjustments, add a few more tests, etc.
Comment by phito 11 hours ago
Not to say you shouldn't write descriptions, I will keep doing it because it's my job. But a lot of people just don't care enough or are too distracted to read them.
Comment by simonw 10 hours ago
Comment by ffsm8 10 hours ago
Well, I'm sure you can guess what happened after that - within the same file even
Comment by skydhash 11 hours ago
Comment by lanstin 9 hours ago
Comment by simonw 9 hours ago
Comment by wiml 5 hours ago
Maybe that's the AI agent I would actually use, auto-fill those responses...
Comment by walthamstow 10 hours ago
Comment by phito 6 hours ago
Comment by simonw 11 hours ago
Comment by oceanplexian 10 hours ago
The Devs went in kicking and screaming. As an SRE it seemed like for SDEs, writing a description of the change, explaining the problem the code is solving, testing methodology, etc is harder than actually coding. Ironically AI is proving that this theory was right all along.
Comment by sodapopcan 9 hours ago
Comment by rootusrootus 9 hours ago
But putting the ticket number in the commit ... that's basically automatic, I don't know why it should be that big a concern. The branch itself gets created with the ticket number and everything follows from that, there's no extra effort.
Comment by comfydragon 2 hours ago
Only problem there is the potential for a deeply-ingrained assumption that the Jira key being in the branch name is sufficient for the traceability between the Jira issue and commits to always exist. I've had to remind many people I work with that branch names are not forever, but commit messages are.
Hasn't quite succeeded in getting everyone to throw a Jira ID in somewhere in the changeset, but I try...
Comment by cesarb 7 hours ago
That poster said "attach a JIRA Ticket to the PR", so in their case, it's not that automatic.
Comment by rootusrootus 6 hours ago
I haven't dealt with non-Atlassian tools in a while but I assume this is pretty much bog standard for any enterprise setup.
Comment by alexpotato 6 hours ago
Comment by sodapopcan 8 hours ago
Comment by p2detar 10 hours ago
One can also point QA or consultants to a ticket for documentation purposes or timeline details.
Comment by babarock 7 hours ago
If you consider that reviewer bandwidth is very limited in most projects AND that the volume of low-effort-AI-assisted PR has grown incredibly over the past year, now we have a spam problem.
Some of my engineers refuse to review a patch if they detect that it's AI-assisted. They're wrong, but I understand their pain.
Comment by wiml 5 hours ago
As a reviewer with limited bandwidth, I really don't see why I should spend any effort on those.
Comment by brooke2k 3 hours ago
I've found that this gets worse the longer the description is, and that a couple bullet points of the most important things gets the information across much better.
Comment by reactordev 11 hours ago
Comment by Swannie 2 hours ago
Oh, there are, for years :D This has really stood the test of time:
https://rfc.zeromq.org/spec/42/#24-development-process
And its rationale is well explained too:
https://hintjens.gitbooks.io/social-architecture/content/cha...
Saddened by realizing that Pieter would have had amazing things to say about AI.
Comment by bob1029 10 hours ago
This is ~mandatory for me. Even if what I am working on is non-visual. I will take a screenshot of a new type in my IDE and put a red box around it. This conveys the focus of my attention and other important aspects of the work effort.
Comment by toomuchtodo 11 hours ago
Comment by vladsh 8 hours ago
None of this is covered by code generation, nor by juniors submitting random PRs. Those are symptoms of juniors (not only) missing fundamentals. When we forget what the job actually is, we create misalignment with junior engineers and end up with weird ideas like "spec-driven development"
If anything, coding agents are a wake-up call that clarify what engineering profession is really about
Comment by newsoftheday 8 hours ago
https://read.engineerscodex.com/p/how-one-line-of-code-cause...
When 10K LOC AI PR's are being created, sometimes by people who either don't understand the code or haven't reviewed the code their trying to submit; the 60 million dollar failure line is potentially lying in wait.
Comment by tete 7 hours ago
The whole reliability, etc. to many is not of much priority. Things got an absolutely shitshow and still everyone buys it.
In other words the only outcome will be that people don't have or don't want to have engineers anymore.
Companies are very much not interested in someone who does the above, but at most someone who sells or cosplays these things - if even.
Cause that what creates income. They don't care if they sell crap, they care that they sell it and the cheaper they can produce the better. So money gets poured into marketing not quality.
High quality products are not sought after. And fake quality like putting a computer or a phone in a box like jewelry, even if you throw that very box away the next time you walk by a trash bin. That's what people consider quality these days, even if it's just a waste of resources.
And businesses choose products and services the same way as regular consumers, even when they want the marketing to make them feel good about it in a slightly different way, because marketing to your target audience makes sense. Duh!
People are ready to pay more for having the premium label stamped on to something, pay more to feel good about it, but most of the time are very unwilling to pay for measurable quality, an engineer provides.
It's scary, even with infrastructure the process seems to change, probably also due to corruption, but that's a whole other can of worms.
> communicates tradeoffs across the organization
They may do that. They may be recognized for it. But if the guy next door with the right cosplay says something like "we are professionals, look at how we have been on the market for X years" or "look at our market share" then no matter how far from reality the bullshitting is they'll be getting the money.
At the beginning of the year/end of last year I learned how little expertise, professionalism and engineering are required to be a multi billion NASDAQ stock. For months I thought that it cannot possibly be, that the core product of a such a company displays such a complete lack of expertise in the core area(s). Yet, they somehow managed to convince management to just invest a couple more times of money than the original budget that was already seen as quite the stretch. Of course they promises didn't end being anywhere close to true, and they completely forgot to inform us (our management) about severe limitations.
So if you are good at selling to management which you can be by pocketing consultants recommending you then things will work seemingly no matter what.
> If anything, coding agents are a wake-up call that clarify what engineering profession is really about
I believe what we need to wake up to or come to terms with is that our industry (everything that would go into NASDAQ) is a farce. Coding agents show that. It doesn't matter to create half-assed products if you come to sell them. You are selling your products to people. Doesn't matter if it's some guy at a hot dog stand or a CEO of a big successful company or going from house to house selling the best vacuum cleaner ever. What matters is you making people believe it would be stupid not to take your product.
Comment by order-matters 6 hours ago
I'd argue the only places left you really need low level coding fall under computer science. If you are a computer or systems engineer who needs to work with a lot of code then youll benefit from having exposure to computer science, but an actual engineering discipline for just software seems silly now. Not to mention pretty much all engineers at this point are configuring software tools on their own to some degree
I think it's similar to how there used to be horse doctors as a separate profession from vets when horses were much more prominent in everyday life, but now they are all vets again and some of them specialize in horses
Comment by chasd00 5 hours ago
the thing is, with software development, it's always been this way. Developers have just had tunnel vision for decades because they stare into an editor all day long instead trying to actually sell a product. If selling wasn't the top priority then what do you think would happen to your direct deposit? Software developers, especially software developers, live in this fantasy land where the believe their paycheck just happens automatically and always will. I think it's becoming critical that new software devs entering the workforce spend a couple years at a small, eat what you kill, consultancy or small business. Somewhere where they can see the relationship between building, selling, and their paycheck first hand.
Comment by heliumtera 1 hour ago
Comment by venturecruelty 8 hours ago
Edit: help, the new org said the same thing. :(
Edit 2: you guys, seriously, the HR lady keeps looking up at me and shaking her head. I don't think this is good. I tried to be a real, bigboy engineer, but they just mumbled something about KPIs and put me on a PIP.
Comment by rnewme 8 hours ago
Comment by tete 6 hours ago
Things break. A lot. Doctors successful or not also deal with the same shitty IT on a daily basis.
Nobody cares about engineering. It's about selling stuff, not about reliability, etc.
And to some degree one is forced to use that stuff anyways.
So sure you can go to a company understanding engineering, but if you do a job for salary you might lose out on quite a bit on it if you care for things like quality. We see this in so many different sectors.
Sure there is a unicorn here and there that makes it for a while. And they might even go big and then they sell the company or change to maximizing profits, because that's the only way up when you essentially already made it (on top of one of the big players).
For small projects/companies it depends if you have a way to still launch big, which you can usually do with enough capital. You can still make a big profit with a crappy product then, but essentially only once or twice. But then your goal also doesn't have to create quality.
Microsoft and Fortinet for example wouldn't profit from adding (much) quality. They profit from hypes. So they now both do "AI".
Comment by rnewme 3 hours ago
Comment by layer8 10 hours ago
Testing only “proves” correctness for the specific state, environment, configuration, and inputs the code was tested with. In practice that only tests a tiny portion of possible circumstances, and omits all kinds of edge and non-edge cases.
Comment by crabmusket 6 hours ago
I like using the word "demonstrates" in almost every case where people currently use the word "proves".
A test is a demonstration of the code working in a specific case. It is a piece of evidence, but not a general proof.
And these kinds of narrow ad-hoc proofs are fine! Usually adequate.
To rephrase the title of TFA, we must deliver code that is demonstrated to work.
Comment by aspbee555 9 hours ago
Comment by lanstin 9 hours ago
Comment by Nizoss 9 hours ago
When it comes to agentic coding. I created an open source tool that enforces those practices. The agent gets blocked by a hook if it tries to do anything that violates those principles. I think it helps a lot if I may say so myself.
https://github.com/nizos/tdd-guard
Edit: I realize now that I misunderstood your comment. I was quick to respond.
Comment by Yodel0914 2 hours ago
Comment by roeles 7 hours ago
Comment by array_key_first 6 hours ago
Comment by anthonypasq 7 hours ago
Comment by layer8 2 hours ago
You can’t prove that something is correct by example. Examples can only disprove correctness. And tests are always only examples.
Comment by sunsetMurk 7 hours ago
Comment by otterley 6 hours ago
That's why you have to start with specifications. See, e.g., https://martinfowler.com/articles/exploring-gen-ai/sdd-3-too...
Comment by 9rx 5 hours ago
Just 23 more times? ADD, CDD, EDD, DDD, etc.
Or maybe more?! AADD, ABDD, ACDD, ..., AAADD, AABDD, etc.
Comment by pydry 5 hours ago
As is, SDD it is some sort of AI nonsense.
Comment by otterley 5 hours ago
Comment by Yodel0914 2 hours ago
Comment by shepherdjerred 9 hours ago
Comment by crazygringo 9 hours ago
Comment by 9rx 5 hours ago
Comment by layer8 2 hours ago
Comment by user34283 10 hours ago
The rest we can figure out during testing, or maybe you even have users willing to beta-test for you.
This way, while you're still on the understanding part and reasoning over the code, your competitor already shipped ten features, most of them working.
Ok, that was a provocative scenario. Still, nowadays I am not sure you even have to understand the code anymore. Maybe having a reasonable belief that it does work will be sufficient in some circumstances.
Comment by doganugurlu 4 hours ago
How are we supposed to use software in healthcare, defense, transportation if that's the bar?
Comment by user34283 1 hour ago
You're free to review every line the model produces. Not every project is in healthcare or defense, and sometimes different standards apply.
Comment by doganugurlu 1 hour ago
I haven’t been in such a setting in 2008 so you can ignore everything I said.
But I wouldn’t want to be somewhere where people don’t test their code, and I have to write code that doesn’t break the code that was never tested until the QA cycle?
Comment by TheTxT 9 hours ago
Comment by user34283 9 hours ago
If I had a backend API that was serving user data, I'd of course check more carefully.
This kind of mistake always seemed amateurish to me.
Comment by simianwords 9 hours ago
Comment by abathur 8 hours ago
While skimming tests for the python backend, I spotted the following:
@patch.dict(os.environ, {"ENVIRONMENT": "production"})
def test_settings_environment_from_env(self) -> None:
"""Test environment setting from env var."""
from importlib import reload
import app.config
reload(app.config)
# Settings should use env var
assert os.environ.get("ENVIRONMENT") == "production"
This isn't an outlier. There are smells everywhere.Comment by stuffn 9 hours ago
LLMs can generate code that works. That much is true. You can generate sufficiently complex projects that simply run on the first (or second try). You can even get the LLM to write tests for the code. You can prompt it for 100% test coverage and it will provide you exactly what you want.
But that doesn't mean OP isn't correct. First, you shouldnt be remembering everything. If you are finding yourself remembering everything your project is either small (I'd guess less than 1000 lines) or you are overburdened and need help. Reasoning, logically, through code you write can be done JIT as you're writing the code. LLMs even suffer from the same problem. Instead of calling it "having to remember to much" we refer to it as a quantity called "context window". The only problem is the LLM won't prompt you telling you that it's context window is so full it can't do it's job properly. A human will.
I think an engineer should always be reasoning about their code. They should be especially suspicious of LLM generated code. Maybe I'm alone but if I use an LLM to generate code I will review it and typically end up modifying it. I find even prompting with something like "the code you write should be maintainable by other engineers" doesn't produce good value.
Comment by newsoftheday 8 hours ago
Comment by Swannie 1 hour ago
I know Simon follows this "Issue First" style of work in his projects, with a strong requirement for passing tests to be included.
It's been a best practice for a long time. I really enjoyed this when I read it ~10 years ago, and it still stands the test of time:
https://rfc.zeromq.org/spec/42/#24-development-process
The rationale was articulated clearly in:
https://hintjens.gitbooks.io/social-architecture/content/cha...
If you have time, do yourself a favour and read the whole lot. And then liberally copy parts of C4 into your own process. I have advocated for many components of it, in many contexts, at $employer, and will continue to do so.
Comment by doganugurlu 4 hours ago
If someone's not even interested and excited to see their code work, they are in the wrong profession.
Comment by dfxm12 12 hours ago
Is anyone else seeing this in their orgs? I'm not...
Comment by 0x500x79 11 hours ago
Unfortunately, this person is vibe coding completely, and even the PR process is painful: * The coding agent reverts previously applied feedback * Coding agent not following standards throughout the code base * Coding agent re-inventing solutions that already exist * PR feedback is being responded to with agent output * 50k line PRs that required a 10-20 line change * Lack of testing (though there are some automated tests, but their validations are slim/lacking) * Bad error handling/flow handling
Comment by nunez 7 hours ago
This is hilarious. Not when you're the reviewer, of course, but as a bystander, this is expert-level enterprise-grade trolling.
Comment by LandR 10 hours ago
Comment by 0x500x79 10 hours ago
(By my organization, I meant my company - this person doesn't report to me or in my tree).
Comment by JambalayaJimbo 1 hour ago
Comment by gardenhedge 6 hours ago
Comment by briliantbrandon 12 hours ago
Comment by lm28469 11 hours ago
Comment by jennyholzer2 11 hours ago
Comment by roblh 10 hours ago
Comment by candiddevmike 9 hours ago
Reading code sucks, it always has. The flow state we all crave is when the code is in our working memory as an understood construct and we're just translating our mental model to a programming language. You don't get that with LLMs. It devolves into prorgamming minutae equivalent to "a little to the left" but with the added complexity that "left" is hundreds of lines of code.
Comment by AstroBen 3 hours ago
If I write home-grown organic code then I have no choice but to fully understand the problem. Using an LLM it's very easy to be lazy, at least in the short term
Where does that get me after 3 months? I end up working on a codebase I barely understand. My own skills have degraded. It just gets worse the longer you go
This is also coming from my experience in the best case scenario: I enjoy coding and am working on something I care about the quality of. Lots of people don't have even that
Comment by dsego 11 hours ago
Comment by bluGill 11 hours ago
Comment by jennyholzer2 10 hours ago
the idea that LLMs make developers more productive is delusional.
Comment by coffeebeqn 4 hours ago
Comment by jennyholzer2 11 hours ago
Comment by square_usual 10 hours ago
Comment by jennyholzer2 10 hours ago
Comment by swah 7 hours ago
Comment by chasd00 4 hours ago
Now the power to create tons and tons of code (ie content) is in the hands of everyone and here we are complaining about it just like my wife use to complain about journalism. I think the myth of the highly regarded Software Developer perched in front of the warming glow of a screen solving and automating critical problems is coming to an end. Deservedly really, there's nothing more special about typing words into an editor than, say, framing a house. The novelty is over.
Comment by lunar_mycroft 9 hours ago
Comment by dfxm12 11 hours ago
Comment by briliantbrandon 11 hours ago
I think this is largely an issue that can be solved culturally within a team, we just unfortunately only have so much input on how other teams work. It doesn't help either when their manager doesn't seem to care about the feedback... Corporate politics are fun.
Comment by dfxm12 11 hours ago
Comment by jennyholzer2 11 hours ago
If you are sufficiently motivated to appear more "productive" than your coworkers, you can force them to review thousands of lines of incorrect AI slop code while you sit back and mess around with your chatbots.
Your coworkers no longer have enough time to work on their in-progress PRs, so you can dominate the development team in terms of LOC shipped.
Understand that sociopaths are skilled at navigating social and bureaucratic environments. A sociopath who ships the most LOC will get the promotion every single time.
Comment by andy99 11 hours ago
Comment by heliumtera 11 hours ago
Comment by zx2c4 11 hours ago
https://github.com/WireGuard/wireguard-android/pull/82 https://github.com/WireGuard/wireguard-android/pull/80
In that first one, the double pasted AI retort in the last comment is pretty wild. In both of these, look at the actual "files changed" tab for the wtf.
Comment by newsoftheday 9 hours ago
Comment by IshKebab 10 hours ago
I recently reviewed a PR that I suspect is AI generated. It added a function that doesn't appear to be called from anywhere.
It's shit because AI is absolutely not on the level of a good developer yet. So it changes the expectation. If a PR is not AI generated then there is a reasonable expectation that a vaguely competent human has actually thought about it. If it's AI generated then the expectation is that they didn't really think about it at all and are just hoping the AI got it right (which it very often doesn't). It's rude because you're essentially pawning off work that the author should have done to the reviewer.
Obviously not everyone dumps raw AI generated code straight into a PR, so I don't have any problem with using AI in general. But if I can tell that your code is AI generated (as you easily can in the cases you linked), then you've definitely done it wrong.
Comment by fnands 12 hours ago
Probs fine when you are still in the exploration phase of a startup, scary once you get to some kind of stability
Comment by ryandrake 11 hours ago
Hell, for my hobby projects, I try to keep individual commits under 50-100 lines of code.
Comment by bonesss 7 hours ago
If these AIs are so smart, why the giant LOCs?
Sure, it’s cheaper today than yesterday to write out boilerplate, but programming is about eliminating boilerplate and using more powerful abstractions. It’s easy to save time doing lots of repetitive nonsense, stopping the nonsense should be the point.
Comment by coffeebeqn 4 hours ago
Comment by peab 10 hours ago
Comment by tossandthrow 11 hours ago
Comment by pjc50 10 hours ago
Comment by jimbohn 10 hours ago
Comment by titzer 11 hours ago
Comment by jennyholzer2 11 hours ago
Comment by titzer 11 hours ago
Comment by davey48016 11 hours ago
Comment by tossandthrow 11 hours ago
Comment by fennecfoxy 11 hours ago
Developers aren't hired to write code that's never run (at least in my opinion). We're also responsible for running the code/keeping it running.
Comment by jennyholzer2 11 hours ago
Comment by Ekaros 10 hours ago
And if it was repeated... Well I would probably get fired...
Comment by insin 9 hours ago
And not just from juniors
Comment by gardenhedge 5 hours ago
Comment by stackskipton 11 hours ago
My eyes were wide open when 2 jobs ago, they said they would be blocking all personal web browsing from work computers. Multiple Software Devs were unhappy because they were using their work laptop for booking flights, dealing with their kids schools stuff and other personal things. They did not have personal computer at all.
Comment by nutjob2 9 hours ago
Comment by stackskipton 7 hours ago
Comment by mrkeen 8 hours ago
I.e. 1-2 times a month, there's an SQL script posted that will be run against prod to "hopefully fix data for all customers who were put into a bad state from a previous code release".
The person who posts this type of message most often is also the one running internal demos of the latest AI flows and trying to get everyone else onboard.
Comment by kaffekaka 11 hours ago
Comment by peab 10 hours ago
Comment by jennyholzer2 12 hours ago
Comment by iamflimflam1 10 hours ago
Comment by JambalayaJimbo 1 hour ago
Comment by zahlman 11 hours ago
Comment by nbaugh1 10 hours ago
Comment by ncruces 9 hours ago
Comment by Yodel0914 2 hours ago
Comment by bluGill 11 hours ago
People do what they think they will be rewarded for. When you think your job is to write a lot of code then LLMs are great. When you need quality code you start to ask if LLMs are better or not?
Comment by eudamoniac 10 hours ago
Comment by nunez 7 hours ago
Comment by ncruces 9 hours ago
Fully vibe coded, which at least they admitted. And when I pointed out the thing is off by an order of magnitude, and as such doesn't implement said feature — at all — we get pressed on our AI policy, so as to not waste their time.
I don't have an AI policy, like I don't have an IDE policy, but things get ridiculous fast with vibe coding.
Comment by neutronicus 10 hours ago
But LLMs don't really perform well enough on our codebase to allow you to generate things that even appear to work. And I'm the most junior member of my team at 37 years of age, hired in 2019.
I really tried to follow the mandate from on high to use Copilot, but the Agent mode can't even write code that compiles with the tools available to it.
Luckily I hooked it up to gptel so I can at least ask it quick questions about big functions I don't want to read in emacs.
Comment by notpachet 2 hours ago
This sounds fucking awesome.
Comment by hexbin010 11 hours ago
Comment by endemic 10 hours ago
Comment by bdangubic 11 hours ago
Comment by x3n0ph3n3 11 hours ago
Comment by dfxm12 11 hours ago
Comment by wizzwizz4 12 hours ago
Comment by lm28469 11 hours ago
You could intuitively think it's just a difference of degree, but it's more akin to a difference of kind. Same for a nuke vs a spear, both are weapons, no one argues they're similar enough that we can treat them the same way
Comment by nunez 7 hours ago
Comment by evilduck 11 hours ago
Comment by 1-more 10 hours ago
Comment by notpachet 2 hours ago
Comment by troyvit 11 hours ago
Comment by jennyholzer2 10 hours ago
LLMs can't do this.
Your code is unambiguously better than any LLM code if you can comment a link to the stackoverflow post you copied it from.
Comment by newsoftheday 8 hours ago
So, I'm agreed on the second part too then.
Comment by lcnPylGDnU4H9OF 9 hours ago
This is not a truism. "My" code might come from an LLM and that's fine if I can be reasonably confident it works. I might try to gain that confidence by testing the code and reading it to understand what it's doing. It is also true of blog post code, regardless of how I refer to the code; if I link to the blog post, it's because it does a better job of explaining than I ever could in code comments. Whether LLMs make one more productive is hard to measure but it seems to be missing the point to write this.
The point is, including the code is a choice and one should be mindful of it, no matter the code's origin. At that point, this comes off like you just have something to prove; there doesn't seem to be a reason not to use the LLM code if you know it works and you know why it works.
Comment by wizzwizz4 1 hour ago
Comment by bgwalter 10 hours ago
Comment by trevor-e 10 hours ago
Strong disagree here, your job is to deliver solutions that help the business solve a problem. In _most_ cases that means delivering code that you should be able to confidently prove satisfies the requirements like the OP mentioned, but I think this is an important nitpick distinction I didn't understand until later on in my career.
Comment by newsoftheday 9 hours ago
Comment by casey2 2 hours ago
Comment by tech-ninja 8 hours ago
That is an insane distinction that you are trying to do there. In which cases delivering code that doesn't satisfy the requirements would solve a business problem?
Comment by adrianmonk 6 hours ago
So then, in a month you can either develop 10 features that definitely work or 20 features that have a 75% chance of working. Which one of these delivers more value to your business?
That depends on a lot of things, like the severity of the consequences for incorrect software, the increased chaos of not knowing what works and what doesn't, the value of the features on the list, and the morale hit from slowly driving your software engineers insane vs. allowing them to have a modicum of pride in their work.
Because it's so complex, and because you don't even have access to all the information, it's hard to actually say which approach delivers more value to the business. But I'm sure it goes one way some of the time and the other way other times.
I definitely prefer producing software that I know works, but I don't think that it's an absurd idea the other way delivers more business value in certain cases.
Comment by claar 8 hours ago
Solving the business need has precedence over technical correctness.
Satisfying "what I think the requirements are" without considering the business need causes most code rework in my experience.
Comment by p2detar 5 hours ago
Comment by trevor-e 4 hours ago
Many times in my career, after understanding the problem at hand and who initiated it, I realized the solution is actually one of:
1) a people/organizational problem, not technical 2) doesn't make sense to code a complicated system when it could be a simple Google Sheet 3) the person actually has a completely different problem 4) we already have a solution they didn't know about
My issue with the OP is that it highly emphasizes delivering code. We are not meant to be code monkeys, we are solving problems at the end of the day. Many people I've met throughout my career forget that and immediately jump into writing code because they think that's their job.
Comment by nrhrjrjrjtntbt 7 hours ago
Comment by theshrike79 5 hours ago
We're not talking about making a calculator that can't calculate 1+1. This might be a website that's a bit slow and janky to use.
25% of users go away because it's shit, but 75% stay. And it would've too much effort to push the jank to zero and retain a 100%.
A website that takes juuuust too long to load still "satisfies requirements" in most cases, especially when making loading instant carries a significant extra cost the customer isn't willing to pay for.
Comment by antod 7 hours ago
Comment by sharkjacobs 8 hours ago
Comment by trevor-e 3 hours ago
> The job is to help the business solve a problem, not just to ship code. In cases where delivering code actually makes sense, then yeah you should absolutely be able to prove it works and meets the requirements like the OP says. But there are plenty of cases where writing code at all is the wrong solution, and that’s an important distinction I didn’t really understand until later in my career.
Although funnily enough, the meaning you interpreted also has its own merit. Like other commenters have mentioned, there's always a cost tradeoff to evaluate. Some projects can absolutely cut corners to, say, ship faster to validate some result or gain users.
Comment by SoftTalker 7 hours ago
Comment by ambicapter 6 hours ago
Comment by nrhrjrjrjtntbt 7 hours ago
Sure. That is every job though. It is interesting to muse on. Hard for us to solve a problem without a computer (or removing one!)
Comment by gardenhedge 6 hours ago
"Your job is to deliver technical solutions that help the business solve a problem"
Where the word technical does the work of representing your skill set. That means you won't be asked to create a marketing campaign (solution) to help the business sell more product (problem).
Comment by doganugurlu 58 minutes ago
My priorities are as follows: 1) code works 2) code satisfies requirements
Not sure how anyone can prove their code satisfies requirements when it doesn’t run.
Comment by draw_down 10 hours ago
Comment by mapontosevenths 11 hours ago
Manual and automatic testing are still both required, but you must explicitly ensure that security considerations are included in those tests.
The LLM doesn't care. Caring is YOUR job.
Comment by andy99 12 hours ago
I vibe code a lot of stuff for myself, mostly for viewing data, when I don’t really need to care how it works. I’m coming around to the idea that outside of some specific circumstances where everyone has agreed they don’t need to care about or understand the code, team vibe coding is a bad practice.
If I’m paying an engineer, it’s for their work, unless explicitly agreed otherwise.
I think vibe coding is soon going to be seen the same way as “research” where you engage an offshore team (common e.g. in consulting) to give you a rundown on some topic and get back the first five google search results. Everyone knows how to do that, if it’s what they wanted they wouldn’t be hiring someone to do it.
Comment by simonw 12 hours ago
Comment by doganugurlu 12 minutes ago
The second time it happens they gotta go.
I would find the expectation that I need to attach a screenshot insulting. And the understanding that my peers test their code to produce a screenshot would be pretty demoralizing.
Comment by JambalayaJimbo 58 minutes ago
Comment by Nizoss 9 hours ago
Comment by JoeAltmaier 11 hours ago
That's why I refuse to take part in it. But I'm an old-world craftsman by now, and I understand nobody wants to pay for working, well-thought-out code any more. They don't want a Chesterfield; they want plywood and glue.
Comment by whattheheckheck 10 hours ago
Comment by AlienRobot 10 hours ago
Comment by redwall_hp 6 hours ago
I'm starting to be in favor of professional licensing for software engineering.
Comment by gadflyinyoureye 11 hours ago
Comment by JoeAltmaier 9 hours ago
Comment by johnea 4 hours ago
You and me both, and for many of the same reasons.
I would point out that in your OPs comment, Luddites get the stereotypical dismissal as anti-tech, which is far far from the reality of demanding good conditions for workers.
For the modern s/w engineer, being granted the time and resources for adequate testing could be considered a "worker's rights" issue. In that context the Luddite allegation could be accurate.
My comment is largely along the same lines:
Comment by acituan 8 hours ago
The root cause is the second problem; short of formal verification you can never exhaustively prove that your code works. You can demonstrate and automate that demonstration for a sensible subset of inputs and states and hope for the state of the world approximately staying that way (spoiler: it won't). This is why 100% test coverage in most cases is something bad. This is why sensible is the key operative attitude, which LLM suck at right now.
The root cause of that one is the third problem; your job is to solve a business problem. If your code is not helping the business problem, it actually is not working in the literal sense of the work. It is an artifact that does a thing, but it is not doing work. And since you're downstream of all the self-contradicting, ever changing requirements in a biased framing of a chaotic world, you can never prove or demonstrate that your code solves a business problem and that is the end state.
Comment by ChrisMarshallNY 5 hours ago
In fact, if any bugs were found by the official "last step" QA Department, we (as a software development department) were dinged. If QA found bugs, they could stop the entire product release, so you did not want to be responsible for that.
This resulted in each software development department setting up their own, internal "QC team" of testers. If they found bugs, then individual programmers (or teams) would get dinged, but the main department would not.
Our software got a lot of testing.
Comment by jobs_throwaway 5 hours ago
Comment by ChrisMarshallNY 4 hours ago
And yes. It was a strong disincentive to making changes.
I didn't really like it, but our software did do what it said on the tin (which wasn't always ideal).
Comment by agentultra 11 hours ago
A colleague was working on an important subsystem and would ask Djikstra for a review when he thought it was ready. Djikstra would have to stop what he was doing, analyze the code, and would find a grievous error or edge case. He would point it out to the colleague who would then get back to work. The colleague would submit his code for review again and this could carry on enough times that Djikstra got annoyed.
Djikstra proposed a solution. His colleague would have to submit with his code some form of proof or argument as to why it was correct and ready to merge. That way Djikstra could save time by only having to review the argument and not all of the code.
There’s a way of looking at LLM output as Djikstra’s colleague. It puts a lot of burden on the human using this tool to review all of the code. I like Doctorow’s mental model of a reverse centaur. The LLM cannot reason and so won’t provide you with a sound argument. It can probably tell you what it did and summarize the code changes it made… but it can’t decide to merge those changes. It needs a human, the bottom half of the centaur, to do the last bit of work here. Because that’s all we’re doing when we let these tools do most of the work for us: we’re here to take the blame.
And all it takes is an implementation of what we’re trying to build already, every open source library ever, all of SO, a GW of power from a methane power plant, an Olympic pool of water and all of your time reviewing the code it generates.
At the end of the day it’s on you to prove why your changes and contributions should be merged. That’s a lot of work! But there’s no shortcuts. Luckily you can reason while the LLMs struggle with that so use it while you can when choosing to use such tools.
Comment by newsoftheday 7 hours ago
Anyone who allows a 10K LOC LLM generated PR to be merged without reviewing every single line, is doing the same thing, a coin toss.
Comment by agentultra 4 hours ago
At least a liar is trying to deceive you. Vizzini’s entire exercise is moot.
Comment by vcarrico 5 hours ago
I'm noticing something else very similar but involving not necessarily junior roles with long messages, when they use these AI writing assistants that resume stuff, creates follow-ups, etc. Putting this additional burden in whoever needs to read it. It makes me think of a quote that says: "I would have written a shorter letter, but I didn't have the time."
Comment by keeda 4 hours ago
A bit clunky, but I think that can be scaled from individual lines of code to features or entire systems, whatever you are responsible for delivering, and encompasses all the processes that go around figuring what code is to be actually written and making sure it does what it's supposed to.
Trust and accountability are absolutely a critical aspect of software engineering and the code we deliver. Somehow that is missed in all the discussions around AI-based coding.
The whole phenomenon of AI "workslop" is not a problem with AI, it's a problem with lack of accountability. Ironically, blaming workslop on AI rather than organizational dysfunction is yet another instance of shirking accountability!
Comment by 0xbadcafebee 10 hours ago
Therefore you must verify it works as intended in the real world. This means not shipping code and hoping for the best, but checking that it actually does the right thing in production. And on top of that, you have to verify that it hasn't caused a regression in something else in production.
You could try to do that with tests, but tests aren't always feasible. Therefore it's important to design fail-safes into your code that ALERT YOU to unexpected or erroneous conditions. It needs to do more than just log an error to some logging system you never check - you must actually be notified of it, and you should consider it a flaw in your work, like a defective pair of Nikes on an assembly line. Some kind of plumbing must exist to take these error logs (or metrics, traces, whatever) and send it to you. Otherwise you end up producing a defective product, but never know it, because there's nothing in place to tell you its flaws.
Every single day I run into somebody's broken webapp or mobile app. Not only do the authors have no idea (either because they aren't notified of the errors, or don't care about them), there is no way for me to even e-mail the devs to tell them. I try to go through customer support, a chat agent, anything, and even they don't have a way to send in bug reports. They've insulated themselves from the knowledge of their own failures.
Comment by cynicalsecurity 10 hours ago
Who popped this balloon? I know I need to change my employer, but it's not so easy. And I'm not sure another employer is going to be any better.
Comment by mystifyingpoi 8 hours ago
Comment by roryirvine 9 hours ago
Comment by theshrike79 5 hours ago
Boss: sucks in air through his teeth "Best I can do is one week. Get to it."
Me, with a massive mortgage and the job market is shit: "Rogerroger, bossman"
Comment by asadotzler 8 hours ago
Comment by 0x500x79 10 hours ago
Overall, this hits the nail on the head about not delivering broken code and providing automated tests. Thanks for putting your thoughts on paper.
Comment by nzoschke 11 hours ago
I’m experimenting with how to get these into a PR, and the “gh” CLI tool is helpful.
Does anyone have a recipe to get a coding agent to record video of webflows?
Comment by simonw 11 hours ago
Comment by a24venka 5 hours ago
It often takes discipline to think and completely map out solutions before you build. This is where experience and knowing common patterns can also help.
When you have the experience of having manually written or read a lot of code it helps at the very least quickly understand what the LLMs are writing and reason about it later even if not at the beginning.
Comment by holtkam2 6 hours ago
Otherwise you’ll end up in situations where it passes all test cases yet fails for something unexpected in the real world, and you don’t know why, because you don’t even know what’s going on under the hood.
Comment by onion2k 10 hours ago
Comment by ericmcer 10 hours ago
When someone pings me for a review and their code isn't even passing CI builds/tests I just let them know its failing and don't even look at a line of their code.
Comment by simonw 9 hours ago
Yeah, I'm a bit sad I felt the need to write this to be honest.
Comment by cyrialize 10 hours ago
As I figure out my manual testing, I'll write out the steps that I took in my PR.
I've found that writing it out as I go does two things: 1) It makes it easier to have a detailed PR and 2) it acts as a form of rubber-ducking. As I'm updating my PR I'll realize steps I've missed in my testing.
Something that also helped out with my manual testing skill was working in a place that had ZERO automated testing. Every PR required a detailed testing plan that you did and that your reviewer could re-create.
Comment by tete 7 hours ago
It's not my job, really. And given by the state of IT these days it's barely anyone's it seems.
Comment by allcentury 12 hours ago
Outside in testing is great but I typically do automated outside in testing and only manual at the end. The loop process of testing needs to be repeatable and fast, manual is too slow
Comment by simonw 12 hours ago
I've lost count of the number of times I've skipped it because the automated test passed and then found there was some dumb but obvious bug that I missed, instantly exposed when I actually exercised the feature myself.
Comment by robryk 11 hours ago
Comment by pjc50 10 hours ago
There's a lot of pedantry here trying to argue that there exists some feature which doesn't need to be "manually" tested, and I think the definition of "manual" can be pushed around a lot. Is running a program that prints "OK" a manual test or not? Is running the program and seeing that it now outputs "grue" rather than "bleen" manual? Does verifying the arithmetic against an Excel spreadsheet count?
There are programs that almost can't be manual, and programs that almost have to be manual. I remember when working on PIN pad integration we looked into getting a robot to push the buttons on the pad - for security reasons there's no way of injecting input automatically.
What really matters is getting as close to a realistic end user scenario as possible.
Comment by simonw 11 hours ago
Comment by bluGill 11 hours ago
Comment by 9rx 11 hours ago
[1] As far as I can tell. If there are good solutions for this too, I'd love to learn.
Comment by RaftPeople 10 hours ago
Unit testing, whether manual or automated, typically catches about 30% of bugs.
End to end testing and visual inspection of code are both closer to 70% of bugs.
Comment by 9rx 9 hours ago
Of course that is not a panacea. What can happen in the real world is not truly understanding what the software needs to do. That can result in the contract not being aligned with what the software actually needs. It is quite reasonable to call the outcome of that "bugs", but tests cannot catch that either. In that case, the tests are where the problem lies!
Most aspects of software are pretty clear cut, though. You can reasonably define a full contract without needing to see it. UX is a particular area where I've struggled to find a way to determine what the software needs before seeing it. There is seemingly no objective measure that can be applied in determining if a UX is going to spark joy in order to encode that in a contract ahead of time. Although, as before, I'm quite interested to learn about how others are solving that problem as leaving it up to "I'll know it when I see it" is a rather horrible approach.
Comment by ozim 7 hours ago
Your job is to the deliver code up to specifications.
Not even checking the happy flow at least is of course gross negligence. But so is spending too much time on edge cases that no one will run into or person asking doesn’t want to pay for covering.
Comment by golly_ned 5 hours ago
The submitter should understand how it works and be able to 'own' and review modifications to it. That's cognitive work submitters ipso facto don't do by offloading the understanding to an LLM. That's the actual hard work reviewers and future programmers have to do instead.
Comment by zhyder 8 hours ago
I'd go further: what's valuable is code review. So review the AI agent's code yourself first, ensuring not only that it's proven to work, but also that it's good quality (across various dimensions but most importantly in maintainability in future). If you're already overwhelmed by that thousand-line patch, try to create a hundred-line patch that accomplishes the same task.
I expect code review tools to also rapidly change, as lines of code written per person dramatically increase. Any good new tools already?
Comment by rmnclmnt 11 hours ago
That’s the thing. People exposing such rude behavior usually are not, or haven’t been in a looong time…
As for the local testing part not being performed, this is a slippery slope I’m fighting everyday: more and more cloud based services and platforms are used to deploy software to run with specific shenanigans and running it locally requires some kind of deep craft and understanding. Vendor lock-in is coming back in style (e.g. Databricks)
Comment by simonw 11 hours ago
The best solution I have for that is staging environments, ideally including isolated-from-production environments you can run automated tests against.
Comment by skydhash 11 hours ago
Comment by rmnclmnt 9 hours ago
But it requires some advanced local testing setup and knowledge to do so, hence my initial remark on this type of developers not being real professionals in the first place…
Comment by WhyOhWhyQ 9 hours ago
My takeaway from your blog post yesterday was that with a robust enough testing system the LLM can do the entire thing while I do Christmas with the family.
(Before all the AI fans come in here. I'm not criticizing AI.)
Comment by simonw 8 hours ago
Comment by BeefySwain 9 hours ago
Does this guarantee that it functions completely with no errors whatsoever? Certainly not. You need formal verification for that. I don't think that contradicts what Simon was advocating for though in this post.
Comment by WhyOhWhyQ 9 hours ago
Comment by ncruces 8 hours ago
They're called programing languages, and a deterministic algorithm translates them to machine code.
Are we sure English and a probabilistic algorithm is any better at this?
Comment by WhyOhWhyQ 8 hours ago
Comment by sowbug 8 hours ago
I'd buttress this statement with a nuance. Automated tests typically run in their entirety, usually by a well-known command like cargo test or at least by the CI tools. Manual tests are often skipped because the test seems to be far away from the code being changed.
My all-time favorite team had a rule that your code didn't exist if it didn't have automated tests to "defend" it. If it didn't, it was OK, or at least not surprising, for someone else to break or refactor it out of existence (not maliciously, of course).
Comment by visarga 11 hours ago
My approach to coding agents is to prepare a spec at the start, as complete as possible, and develop a beefy battery of tests as we make progress. Yesterday there was a story "I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in hours". They had 9000+ tests. That was the secret juice.
So the future of AI coding as I see it ... it will be better than pre-2020, we will learn to spec and plan good tests, and the tests are actually our contract the code does what is supposed to do. You can throw away the code and keep the specs and tests and regenerate any time.
Comment by smokel 11 hours ago
Comment by visarga 11 hours ago
For UIs I do a different trick - live diagnostic tests - I ask the agent to write tests that run in the app itself, check consistencies, constraints and expected behaviors. Having the app running in its natural state makes it easier to test, you can have complex constraints encoded in your diagnostics.
Comment by zahlman 11 hours ago
Yes, from the same author, in fact.
Comment by paganel 11 hours ago
> They had 9000+ tests.
They were most probably also written by AI, there's no other (human) way. The way I see it we're putting turtles upon turtles hoping that everything will stick together, somehow.
Comment by simonw 10 hours ago
Comment by pjc50 10 hours ago
Behind that is a smaller number of larger integration tests, and the even longer running regression tests that are run every release but not on every commit.
Comment by zahlman 11 hours ago
Yes. They came from the existing project being ported, which was also AI-written.
Comment by am17an 10 hours ago
Comment by dekhn 8 hours ago
I prefer to make this probabilistic: use testing to reduce the probability that your code isn't correct, for the situations in which it is expected to be deployed. In this sense, coding and testing is much like doing experimental physics: we never really prove a theory or disprove it, we just invalidate clearly wrong ones.
Comment by newsoftheday 8 hours ago
Comment by geldedus 12 hours ago
Comment by kords 9 hours ago
Comment by webprofusion 2 hours ago
Comment by softwaredoug 10 hours ago
Claude, etc, works best with good tests that verify the system works. And so the code becomes in some ways the tests rather than the code that does the thing. If you're responsible for the thing, then 90% of your responsibility moves to verifying behavior and giving agents feedback.
Comment by maerF0x0 9 hours ago
Depending on exactly what the author meant here, I disagree. Our first and default tool should be some form of lightweight automated testing. It's explicit (serves a form of spec and docs how to use the software), it's repeatable (manual testing is done once and it's result is invalidated moments later), and it's cost per minute of effort is more or less the same (most companies have the engineers do the tests, they are expensive).
Yes. There will be exceptions and exceptional cases. This author is not talking about exceptions and neither am I. They're not an interesting addition to this conversation.
Comment by IMTDb 8 hours ago
Manual verification isn't about skipping tests, it's about validating what to test in the first place.
You need to see the code work before you know what "working" even means. Does the screen render correctly? Does the API return sensible data? Does the flow make sense to users? Automated tests can only check what you tell them to check. If you haven't verified the behavior yourself first, you're just encoding your assumptions into test cases.
I'd take "no tests, but I verified it works end-to-end" over "full test coverage, but never checked if it solves the actual problem" every time. The first developer is focused on outcomes. The second is checking boxes.
Tests are crucial: they preserve known-good behavior but you have to establish what "good" looks like first, and that requires human judgment. Automate the verification, not the discovery. So our first and default tool remains manual verification
Comment by codeviking 8 hours ago
Automated tests omit a certain type of feedback that I think remains important to the development loop. Automation doesn't care about a poor UX; it only verifies what you tell it to.
For instance, I regularly contribute to a CLI that's widely used at $WORK. I can easily write tests to verify the I/O of a command I'm working on that assert correctness. Yet if I actually try to use the command I'm changing, usually as part of verifying my changes, I tend to discover usability issues that make the program more pleasant to use and the tests would happily ignore.
Also, there's certainly cases where automation isn't worth the cost. Maybe because the resulting tests are complex, or brittle. I've often found UI tests to lie in this category (but maybe I'm doing them wrong).
Because of these things I think manual testing is the right default. Automated tests should also exist; but manual tests should _always_ be part of the process.
Comment by tech-ninja 8 hours ago
Unless you are working on a tiny change on a highly tested part of the code you should be manually testing your code and/or adding some tests.
Comment by alexgotoi 9 hours ago
Nobody sane expects reviewers to babysit untested mega-PRs - that's not their job. The dirty secret is that good tests (unit, integration, property-based if you're feeling fancy) + a pre-commit CI gate are what actually prove code works before it ever hits anyone's inbox. AI makes those cheaper and faster to write, not optional.
Dropping untested behemoths and expecting the team to clean up hallucinations isn't shipping. It's just making everyone's Thursday worse.
Will include this one in my next https://hackernewsai.com/ newsletter.
Comment by minimaxir 8 hours ago
Comment by alexgotoi 6 hours ago
Comment by thopkinson 9 hours ago
Accountability is the real answer. If you don't enable individual and team accountability, then you are part of the problem.
Comment by koinedad 6 hours ago
I remember when I was working at a startup and a new engineer merged his code and it totally broke the service. I asked him if he ran his code locally first and he stared at me speechless.
Running the code locally is the easiest way to eliminate a whole series of silly bugs.
Like mentioned in the article adding a test and then reverting your change to make sure the test fails is really important, especially with LLMs writing tests. They are great at making things look like they work but completely don’t.
Comment by funkattack 11 hours ago
Comment by newsoftheday 7 hours ago
And...code that has been 100% reviewed, even if it was fully LLM generated.
Comment by weatherlite 11 hours ago
That's really not a great development for us. If our main point is now reduced to accountability over the result with barely any involvement in the implementation - that's very little moat and doesn't command a high salary. Either we provide real value or we don't ...and from that essay I think it's not totally clear what the value is - it seems like every QA, junior SWE or even product manager can now do the job of prompting and checking the output.
Comment by simonw 11 hours ago
Experienced software engineers have such a huge edge over everyone else with this stuff.
If your product manager doesn't understand what a CORS header is good luck having them produce a change that requires cross-domain fetch() call... and first they'll have to know what a "cross-doman fetch() call" means.
And sure they could ask an LLM about that, but they still need the vocabulary and domain knowledge to get to that question.
Comment by falcor84 11 hours ago
Comment by ianberdin 8 hours ago
The point is to hire people who can own code and codebase. “Someone will review” is dead end.
Comment by yuedongze 10 hours ago
Comment by enraged_camel 11 hours ago
I would go a step further: we need to deliver code that belongs. This means following existing patterns and conventions in the codebase. Without explicit instruction, LLMs are really bad at this, and it's one of the things that make it incredibly obvious to reviews that a given piece of code has been generated by AI.
Comment by 0x500x79 10 hours ago
I also see AI coding tools violate "Chesterton's Fence" (and the pre-Chesterton's Fence, not sure what that is called, the idea being that code is necessary otherwise it shouldn't be in the source).
Comment by 9rx 10 hours ago
They used to be. They have become quite good at it, even without instruction. Impressively so.
But it does require that the humans who laid the foundation also followed consistent patterns and conventions. If there is deviation to be found, the LLM will see it and be forced to choose which direction to go, and that's when things quickly fall off the rails. LLMs are not (yet) good at that, and maybe never can be as not even the humans were able to get it right.
Garbage in, garbage out, as they say.
Comment by rglover 10 hours ago
This only happens because the software industry has fallen into the Religion of Speed. I see it constantly: justified corner-cutting, rushing shit out the door, and always loading up another feature/project/whatever with absolutely zero self-awareness. AI is just an amplifier for bad behavior that was already causing chaos.
What's not being said here but should be: discipline matters. It's part of being a professional and always precedes someone who can ship code that "just works."
[1] https://ia.net/*
Comment by simonw 10 hours ago
Comment by Nizoss 8 hours ago
Comment by lifeisstillgood 7 hours ago
along with
- the job was better titled as “Analyst Programmer” - you need both.
And
- you can make a changeset, but you have to also sell the change
Comment by dangus 9 hours ago
[1] I.e., it should work
That may seem pedantic but that’s a huge difference. Code is a means to an end. If no-code suddenly became better than code through some miracle, that would be your job.
This also means that if one day AI stops making mistakes, tossing AI requests over the wall may be a legitimate modus operandi.
Comment by acrophiliac 10 hours ago
Comment by simonw 10 hours ago
That's not something I've seen or been able to achieve in most of my professional work.
Comment by gaigalas 11 hours ago
Agents love to cheat. That's an issue I don't see a horizon for change.
Here's Opus 4.5 trying to cheat its way out of properly implementing compatibility and cross-platform, despite the clear requirements:
https://gist.github.com/alganet/8531b935f53d842db98157e1b8c0...
> Should popen handles work with fgets/fread/fwrite? PHP supports this. Option A: Create a minimal pipe_io_stream device / Option B: Store FILE* in io_private with a flag / Option C: Only support pclose, require explicit stream wrapper for reads.
If I asked for compatibility, why give me options that won't fully achieve it?
It actually tried to "break check" my knowledge about the interpreter (test me if I knew enough to catch it), and proposed shortcuts all the way through the chat.
I don't want to have to pepper my chats with variations on "don't cheat". I mean, I can do it, but it seems like boilerplate.
I wish I had some similar testing-related chats to share. Agents do that all the time.
This is the major blocker right now for AI-assisted automated verification, and one of the reasons why this isn't well developed beyond general directions (give it screenshots, make it run the command, etc).
Comment by casey2 3 hours ago
How to prove it has been subject to some debate for the past century, the answer is that it's context dependent to what degree you will or even can prove the program and exposed identifiers correct. Programming is a communication problem as well as a math problem, often an engineering problem too. Only the math portion can be proved, the a small by critical amount engineering portion tested.
Communication is the most important for velocity it's the difference between hand rolling machine code and sshing into a computer halfway across the world having every tool you expect. If you don't trust that webdevs know what they are doing then you can be the most amazing dev in the world you but your actual ability to contribute will be hampered. The same is true of vibe coding, if people aren't on the same page as to what is and isn't acceptable velocity starts to slow down.
Languages have not caught up to AI tools, since AI operates well above the function level, what level would be appropriate to be named and signed off on? pull request and link to the chat as a commit? (what is wrong with that that could be fixed at the name level)
Honest communication is the most important. Amazon telling investors that they use TLA+ is just signaling that they "for realz take uptime very seriously guize", "we know distributed systems" and engineering culture. The honest reality is that they could prove all their code and not IMprove their uptime one lick, because most of what they run isn't their code. It's a communication breakdown if effort gets spent on that outside a research department.
Comment by givemeethekeys 9 hours ago
Comment by gorjusborg 7 hours ago
If you are dumping AI slop on your team to sort through, you are creating drag on the entire team's efforts toward those positive outcomes.
As someone getting dumped upon, you probably should make the decision (in line with the objective to producing positive outcomes) to not waste your time weeding through that stuff.
Review everything else, make it clear that the mess is not reviewable, and communicate that upward if needed.
Comment by mellosouls 10 hours ago
The title doesn't go far enough - slop (AI or otherwise) can work and pass all the tests, and still be slop.
Comment by simonw 10 hours ago
If it doesn't even work you're absolutely wasting my time with it.
Comment by theshrike79 5 hours ago
To get the maximum ROI from LLM-assisted programming it needs proper unit tests, integration tests, correctly configured linters, accessible documentation and well-managed git history (Claude actually checks git history nowadays to see when a feature was added if it has a bug)
Worst case we'll still have proper tests and documentation if the AI bubble suddenly bursts. Best case we can skip the boring bits because the LLM is "smart" enough to handle the low hanging fruit reliably because of the robust test suite.
Comment by t1234s 8 hours ago
Comment by johnea 5 hours ago
If you, the development engineer, haven't demonstrated the product to work as expected, and preferably this testing is independently confirmed by a product test group, then you can't claim to be delivering a functional product.
I would add though, that management, specifically marketing management setting unreasonable demands and deadlines, are a bigger threat to testing than LLMs.
Of course the damage done by LLM generated code not being tested, is additive to the damage management is doing.
So this isn't any kind of apologism, the two sources are both making the problem worse.
Comment by nrhrjrjrjtntbt 7 hours ago
Comment by nish__ 10 hours ago
Comment by llm_nerd 9 hours ago
Kind of depressing how it has become such a trope of blaming juniors for every ill or bad habit. In all likelihood the reader of this comment has a number of terrible habits, working on teams with terrible habits, and juniors play zero part in it.
And, I mean, on that theme developers have been doing this for as long as we've had large teams. I've worked at a large number of teams where there was the fundamental principal that QA / UA holds responsibility. That they are responsible for tests, and they are responsible for bad code making it through to the product / solution. Developers -- grizzled, excellent-CV devs -- would toss over garbage code and call it a day.
Comment by annjose 9 hours ago
1) Amen 2) I wonder if this is isolated to junior dev only? Perhaps it seems like that because junior devs do more AI assisted coding than seniors?
Comment by morning-coffee 10 hours ago
Comment by imiric 11 hours ago
That is part of it, yes, but there are many others, such as ensuring that the new code is easy to understand and maintain by humans, makes the right tradeoffs, is reasonably efficient and secure, doesn't introduce a lot of technical debt, and so on.
These are things that LLMs often don't get right, and junior engineers need guidance with and mentoring from more experienced engineers to properly learn. Otherwise software that "works" today, will be much more difficult to make "work" tomorrow.
Comment by emsign 12 hours ago
Comment by fjfaase 9 hours ago
Comment by Nizoss 8 hours ago
Comment by fjfaase 4 hours ago
The first was because we were svn (and maybe even csv before that, but I cannot remember) and that did not support branching easily. That team did switch to git, which did not go with its some struggles, and misconceptions, such as: "Never use rebase."
The second team was already working without branches and releasing a new version of the tool (the Bond3D Slicer for 3D printing) every night. It worked very well. Often we were able to implement and release new features within two or three days allowing the users to continue with their experiments.
When after some years the organization implemented more 'quality assurance' they demanded that we would make monthly releases that were formally tested by the users, we created branches for each release. The idea was that some of the users would test the releases before they were official released, but that testing would often take more than a month, one time even three months, because they were 'too busy' to do the formal review. But at the same time some users were using the daily builds because these builds had the features implemented that they needed. As a result of this, the quality did not improve and a lot of time was wasted, although the formal quality assurance, dictated by some ISO standard, was assured.
I have no experience with moving away from using branches. It might be a good idea to point your manager/team lead/scrum master to dora.dev or the YouTube channel: https://www.youtube.com/@ModernSoftwareEngineeringYT
Comment by ekjhgkejhgk 11 hours ago
Comment by throwaway2027 12 hours ago
Comment by 6510 7 hours ago
Comment by venturecruelty 8 hours ago
Comment by nolineshere 9 hours ago
Comment by sapphirebreeze 9 hours ago
Comment by TheSamFischer 9 hours ago
Comment by ekjhgkejhgk 11 hours ago
Comment by koakuma-chan 11 hours ago
Comment by 9rx 12 hours ago
Your job is to solve customer problems. Their problems may only be solvable with code that is proven to work, but it is equally likely (I dare say even more likely) that their problem isn't best solved with code at all, or even solved with code that doesn't work properly but works well enough.
Comment by wrsh07 11 hours ago
From the post and the example he links, the point is that if you don't at least look at the running code, you don't know that it works.
In my opinion the point is actually well illustrated by Chris's talk here:
https://v5.chriskrycho.com/elsewhere/seeing-like-a-programme...
(summary of the relevant section if you're not going to click)
>>>
In the talk "Seeing Like a Programmer," Chris Krycho quotes the conductor and composer Eímear Noone, who said:
> "The score is potential energy. It's the potential for music to happen, but it's not the music."
He uses this quote to illustrate the distinction between "software as artifact" (the code/score) and "software as system" (the running application/music). His point is that the code itself is just a static artifact—"potential energy"—and the actual "software" only really exists when that code is executed and running in the real world.
Comment by 9rx 11 hours ago
Your tests run the code. You know it works. I know the article is trying to say that testing is not comprehensive enough, but my experience disagrees. But I also recognize that testing is not well understood (quite likely the least understood aspect of computer science!) — and if you don't have a good understanding you can get caught not testing the right things or not testing what you think you are. I would argue that you would be better off using your time to learn how to write great tests instead of using it to manually test your code, but to each their own.
What is more likely to happen is not understanding the customer needs well enough, leaving it impossible to write tests that align with what the software needs to do. Software development can break down very quickly here. However, manual testing does not help. You can't know what to manually test without understanding the problem either. However, as before, your job is not to deliver proven code. Your job is to solve customer problems. When you realize that, it becomes much less likely that you write tests that are not in line with the solution you need.
Comment by daedrdev 12 hours ago
Comment by webdev1234568 12 hours ago
Edit: I'm an idiot ignore me.
Comment by simonw 12 hours ago
It has emdashes because my blog turns " - " into an emdash here: https://github.com/simonw/simonwillisonblog/blob/06e931b397f...
Comment by webdev1234568 11 hours ago
Comment by minimaxir 7 hours ago
Comment by ramon156 12 hours ago
Comment by jairuhme 12 hours ago
Comment by ai_coder42 10 hours ago
If we are accepting LLM generated code, we should accept LLM generated content as long as it is "proof read" :)
Comment by zkmon 12 hours ago
Just a wild thought, nothing serious.
Comment by throwuxiytayq 11 hours ago
Comment by Rperry2174 12 hours ago
We already delegate accountability to non-humans all the time: - CI systems block merges - monitoring systems page people - test suites gate different things
In practice accountability is enforced by systems, not humans.. humans are defintiely "blamed" after the fact, but the day-to-day control loop is automated.
As agents get better at running code, inspecting ui state, correlating logs, screenshots, etc they're starting to operationally be "accountable" and preventing bad changes from shipping and producing evidence when something goes wrong .
At some point humans role shifts from "i personally verify this works" to "i trust this verification system and am accountable for configuring it correctly".
Thats still responsibility, but kind of different from whats described here. Taken to a logical extreme, the arguement here would suggest that CI shouldn't replace manual release checklists
Comment by simonw 11 hours ago
Human collaboration works on trust.
Part of trust is accountability and consequences. If I get caught embezzling money from my employer I can lose my job, harm my professional reputation and even go to jail. There are stakes!
I computer system has no stakes, and cannot take accountability for its actions. This drastically limits what it makes sense to outsource to that system.
A lot of this comes down to my work on prompt injection. LLMs are fundamentally gullible: an email assistant might respond to an email asking for the latest sales figures by replying with the latest (confidential) sales figures.
If my human assistant does that I can reprimand or fire them. What am I meant to do with an LLM agent?
Comment by dfxm12 11 hours ago
Comment by hyperpape 11 hours ago
Comment by pjc50 10 hours ago
Accountability is about what happens if and when something goes wrong. The moon landings were controlled with computer assistance, but Nixon preparing a speech for what happened in the event of lethal failure is accountability. Note that accountability does not of itself imply any particular form or detail of control, just that a social structure of accountability links outcome to responsible person.
Comment by bluesnowmonkey 9 hours ago
So the accountability situation for AI seems not that different. You can fire it. Exactly the same as for humans.
Comment by dkdcio 11 hours ago
if you put them (without humans) in a forrest they would not survive and evolve (they are not viable systems alone); they are not taking action without the setup & maintenance (& accountability) of people
Comment by robryk 11 hours ago
Comment by cess11 11 hours ago
Comment by falcor84 11 hours ago
Comment by sc68cal 11 hours ago
Comment by almostdeadguy 11 hours ago
Perhaps an unstated and important takeaway here is that junior developers should not be permitted to use an LLMs for the same reason they should not hire people: they have not demonstrated enough skill mastery and judgement to be trusted with the decision to outsource their labor. Delegating to a vendor is a decision made by high-level stakeholders, with the ability to monitor the vendor performance, and replace the vendor with alternatives if that performance is unsatisfactory. Allowing junior developers to use LLM is allowing them to delegate responsibility without any visibility or ability to set boundaries on what can be delegated. Also important: you cannot delegate personal growth, and by permitting junior engineers to use an LLM that is what you are trying to do.
Comment by SunshineTheCat 9 hours ago
LLMs do make mistakes. They do a sloppy job at times.
But give it a year. Two years. five years. It seems unreasonable to assume they will hit a plateau that will prevent them from being able to build, test, and ship code better than any human on earth.
I say this because it's already happened.
It was thought impossible for a computer to reach the point of being able to beat a grandmaster at chess.
There was too much "art," experience, and nuance to the game that a computer could ever fully grasp or understand. Sure there was the "math" of it all, but it lacked the human intuition that many thought were essential to winning and could only be achieved through a lifetime of practice.
Many years following Deep Blue vs. Garry Kasparov, the best players in the world laugh at the idea of even getting close to beating Stockfish or any other even mediocre game engine.
I say all of this as a 15-year developer. This happens over and over again throughout history. Something comes along to disrupt an industry or profession and people scream about how dangerous or bad it is, but it never matters in the end. Technology is undefeated.
Comment by newsoftheday 9 hours ago
That's the thing though, AI doesn't understand, it makes us feel like it understands, but it doesn't understand anything.
Comment by simonw 9 hours ago
Comment by xmodem 9 hours ago
> It was thought impossible for a computer to reach the point of being able to beat a grandmaster at chess.
This is oft-cited but it takes only some cursory research to show that it has never been close to a universally-held view.
Comment by SunshineTheCat 8 hours ago
Comment by xmodem 6 hours ago
Today I learned that AI advocates being overly optimistic about its trajectory is actually not a new phenomenon - it's been happening for more than twice my lifetime.
Comment by asadotzler 8 hours ago
Comment by SunshineTheCat 6 hours ago
The fact that you gave me the "you just don't understand, you're not a chess grandmaster" emotional response helps indicate that I'm pretty much right on target with this one.
FWIW I have been engineering software for over 15 years.
Comment by JackSlateur 9 hours ago
This happens over and over again throughout history.
Could you share a single instance of a machine that think ? Are we sharing the same timeline ?Comment by bluesnowmonkey 8 hours ago
First of all, no it’s not. Your job is to help the company succeed. If you write code that works but doesn’t help the company succeed, you failed. People do this all the time. Resume padding, for example.
Sometimes it’s better for the business to have two sloppy PRs than a single perfect one. You should be able to deliver that way when the situation demands.
Second, no one is out there proving anything. Like formal software correctness proofs? Yeah nobody does that. We use a variety of techniques like testing and code review to try to avoid shipping bugs, but there’s always a trade off between quality and speed/cost. You’re never actually 100% certain software works. You can buy more nines but they get expensive. We find bugs in 20+ year old software.
Comment by just_once 10 hours ago
I guess to me, it's either the case that LLMs are just another tool, in which case the already existing teachings of best practice should cover them (and therefore the tone and some content of this article is unnecessary) or they're something totally new, in which case maybe some of the already existing teachings apply, but maybe not because it's so different that the old incentives can't reasonably take hold. Maybe we should focus a little bit more attention on that.
The article mentions rudeness, shifting burdens, wasting people's time, dereliction. Really loaded stuff and not a framing that I find necessary. The average person is just trying to get by, not topple a social contract. For that, look upwards.
Comment by dkural 10 hours ago
Comment by just_once 9 hours ago
Comment by simonw 10 hours ago
A lot of people using LLMs seem not to have understood that you can't expect them to write code that works without testing it first!
If that wasn't clearly a problem I wouldn't have felt the need to write this.
Comment by just_once 9 hours ago
My intention isn't to argue a point, just to share my perspective when I read it.
I read your response here to be saying something like "I noticed that people are misunderstood about X, so I wanted to inform them". In this case "X" isn't itself very obvious to me (For any given task, why can't you expect that a cutting edge LLM would be able to write it without requiring your testing that?) but most importantly, I don't think I would approach a pure misunderstanding (tantamount to a skills gap) with your particular framing. Again, to me it reads as patronizing.
Love the pelican on the bicycle, though. I think that's been a great addition to the zeitgeist.