Claude Code's DX is too good. And that's a problem

Posted by lnbharath 1 day ago

Comments

Comment by pico303 1 day ago

The example in the article of letting Claude deploy the app worries me. It has me thinking of that line, “AI is really good until you know what you’re talking about.” If the author was clueless about how to deploy the app, how do they know the app was deployed safely or securely?

Just this past week I asked Claude for some help with C++ and a library I was somewhat unfamiliar with. What it produced looked great, if you didn’t know C++ very well. It turned out Claude knew even less about this library than I did, generating tons of code that was completely incorrect. I eventually solved my problem through research and trial and error, and it was nothing like what Claude recommended. It certainly didn’t leave me feeling confident enough to let the LLM have the level of control over my computer or project that the author is allowing it in the article.

I’m not looking forward to a future spending all my time cleaning up the messes LLMs create.

Comment by forgotpwd16 1 day ago

>"AI is really good until you know what you’re talking about."

Maybe this is a case well represented by the bell curve meme? "AI is great; can do everything" (but you have no domain knowledge, so you cannot guide it, and "everything" means autonomous creation, so when it eventually hits a roadblock you will have no idea what to do), "AI is really good until you know what you’re talking about" (then it seemingly doesn't work and is even counterproductive), "AI is great; can do everything" (you have domain knowledge and can guide it, and "everything" means application and assistance).

Essentially, rather than hoping for the LLM to create everything by itself, as seems to be the current case for many, you should be able to use your knowledge and have it assist you, both to generate initial code and to converge it to where you want.

Comment by christophilus 1 day ago

I agree you need to know what you’re doing. But Claude Code is definitely better than I am at some things- probably the most important of which is starting some mundane task that I would otherwise procrastinate indefinitely.

It’s very good at Typescript, search, and research, but still does stupid stuff and requires review and steering.

I don’t get into the same flow while using it, either, but I think that might be a matter of time. I find it allows me to spend more of my time thinking at a higher level. I could see myself learning to really enjoy that. Code review is exhausting, though, and has always been my least favorite aspect of the job. It seems my future is going to be code-review-heavy, and that is probably the biggest drawback.

Comment by cgearhart 1 day ago

“Better than me” != “good”

I know approximately nothing about approximately everything. Claude seems pretty good at those things. But in every case I’ve used Claude Code for something I do know about it’s been unsatisfactory as a solo operator. It’s not useless, but it is basically useless for anything serious unless you’re very actively guiding it.

I think it has a lot of potential value and will become more useful over time, but it’ll be most useful when we can confidently understand the limitations.

Comment by christophilus 1 day ago

I know a lot about Typescript and its ecosystem. I’ve taught it to students, and worked on it at companies whose names you’d recognize. Claude Code is better than I am at some things that I know deeply, in some cases. It does stupid things on occasion (like use global mutable state), but it is still more useful than not. So, I guess it depends on how you define “better”, but I’ve learned things I didn’t know, and it allows me to do projects and experiments that I’d otherwise be too lazy to do.

Comment by skydhash 1 day ago

You forgot to mention that you're a cat on the internet.

Comment by lnbharath 1 day ago

yeah, this is a fair concern and I should have been clearer. I wouldn't do this on anything with real data or production traffic. that hetzner instance was a side project with nothing sensitive on it. the point was more about claude's ability to reason through infrastructure problems, not that everyone should hand over ssh keys. you're right to be cautious

Comment by chrisandchris 1 day ago

> But then I mentioned I had credits on Fly.io, was eligible for Vercel's free tier, had a Cloudflare account, and a Neon database.

I miss the days where deploying an app was just uploading some files. Maybe we need AI to understand this artificial complexity we introduced ourselves?

Comment by pico303 1 day ago

Right there with you. I’m working on fixing an app deployment this weekend myself and dreading picking my way through GitHub actions, ansible scripts, container configs, and deployment APIs to figure out why the thing stopped deploying. Thank goodness it’s just deploying to VMs and not Kube, or I’d probably lose a week.

Comment by koolba 1 day ago

> What happened next: Claude installed every CLI, prompted me to login once, then went into autopilot. Configured each service. Ran commands. Checked logs. Auto-corrected errors. Got the app running in minutes.

> In another instance, a GitHub workflow was failing. Claude asked if it could SSH into my Hetzner instance to investigate. I said yes. It connected, looked up the config, restarted the Docker instances causing issues, and renewed some certificates as a hygiene step - which I never asked it to do.

This type of thing scares the crap out of me and I’m flabbergasted that anyone would give an LLM unrestricted shell access to a server.

Comment by jabedude 1 day ago

I've also noticed Claude "running away" and doing a bunch of work I never asked it to do.

Comment by candiddevmike 1 day ago

And I'm sure just "restarting docker instances" fixed the root problem here...

The maintenance costs here are going to be eye watering.

Comment by codegladiator 1 day ago

> With Opus 4.5, Claude Code feels like having a god-level engineer beside you. Opinionated but friendly. Zero ego.

Who keeps forgetting variable names and function calling conventions it used 4 seconds ago while using 136 GB of RAM for the CLI, causing you to frequently force quit the whole terminal. It's not even human level.

Comment by philipp-gayret 1 day ago

Context is garbage in, garbage out.

Comment by tomashubelbauer 1 day ago

My entire codebase is in a certain style that's very easy to infer from just looking around in the same file, yet Claude Code routinely makes up its own preferences and doesn't respect the style even given an instruction in CLAUDE.md. Claude Code brings its own garbage even when there's plenty of my own garbage to glean from. That's not what GIGO is supposed to be.

Comment by Trasmatta 1 day ago

And then hallucinating APIs that don't exist, breaking all the unit tests and giving up saying they're an "implementation detail", and over engineering a horrific class that makes Enterprise Fizzbuzz look reasonable

Comment by exe34 1 day ago

I've been running claude code on a 13 year old potato and it's never used 136GB of RAM - possibly because I only have 8GB.

Comment by codegladiator 1 day ago

It's vram or something; it makes the OS completely busy even though I have only 32 GB of RAM. Task manager shows 100+ GB, forcing me to terminate it.

Comment by exe34 1 day ago

is that vram on your GPU? I don't think claude code uses that.

Comment by codegladiator 22 hours ago

Not on GPU, I think it's just paged memory. You are right claude-code isn't running the model locally. Today I've had to kill it 5 times till now.

edit: https://ibb.co/Fbn8Q3pb

that's the 6th

Comment by lostmsu 5 hours ago

Why do you think it's Claude and not iTerm?

Comment by codegladiator 4 hours ago

been using iterm for 10 years. Didn't update recently. claude code is the only new factor in my setup. I can visibly predict as i am using claude code when its about to happen (when conversation goes above 200 messages and then uses sub agents leading to somehow infinite rerendering of the message timeline and they seemingly use a html to bash rendering thing because ... ) so yeah maybe you are right iterm is not able to handle those rerendering or maybe the monitor is broken.

Comment by exe34 1 hour ago

I use xterm, and the visual glitch doesn't crash anything, so maybe try that? I suspect though maybe you're using much longer sessions than I do, with the talk of sub agents and all.

I've mostly just been using it for single features and then often just quitting it until I have the next dumb idea to try out.

Comment by insane_dreamer 1 day ago

Except a god-level engineer wouldn't write unit tests that pass but don't actually test anything because it mocked the responses instead of testing the _actual_ responses, so your app is still broken despite tests passing and "victory!" claims by the "engineer".

Just one example of many personal experiences.

It is helpful, and very very fast at looking things up, sifting through logs and documentation to figure out a bug, writing ad-hoc scripts, researching solutions; but definitely junior-level when it comes to reasoning, you really have to keep your thinking cap on and guide it.
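
The mocked-test failure mode described above can be shown with a small sketch. Everything here is hypothetical and made up for illustration (`parse_price`, `get_price`, the payload shape); it is not taken from any real project. The "hollow" test replaces the code under test with a mock and then asserts on the mock, so it passes even though the real code has a bug. The "honest" test pushes a realistic payload through the real code and fails, which is the useful outcome.

```python
from unittest.mock import Mock


def parse_price(payload: dict) -> float:
    """Hypothetical code under test. Bug: it reads the wrong key."""
    return float(payload["amount"])  # the (made-up) service sends "price"


def get_price(fetch) -> float:
    """Fetch a payload via the injected callable and parse it."""
    return parse_price(fetch())


def hollow_test() -> bool:
    # The mock stands in for get_price itself, so no real code runs.
    fake_get_price = Mock(return_value=9.99)
    return fake_get_price() == 9.99


def honest_test() -> bool:
    # Only the network boundary is faked; parsing is exercised for real.
    realistic_payload = {"price": "9.99"}
    try:
        return get_price(lambda: realistic_payload) == 9.99
    except KeyError:
        return False


if __name__ == "__main__":
    print("hollow test passes:", hollow_test())
    print("honest test passes:", honest_test())
```

The design point is where the fake sits: faking the boundary (the fetch) keeps the parsing logic under test, while faking the function itself only tests the fake.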

Comment by charlesabarnes 1 day ago

What scares me about Claude Code (and ai developer tools in general) is that a small model update could change how I interact with the tool entirely. There's no freezing the communication style that I need to use for good results.

Comment by Workaccount2 1 day ago

I suspect that this is already a big reason why we get so many conflicting signals on "the best coding model". People tune into the style of the model they use the most, and hit snags and friction when they take another model for a spin.

Most iOS users report that Android is a disaster of an operating system, with layers and layers of user frustration. In reality, they actually are just totally in tune with how iOS does stuff. I can only imagine we have something similar going on here.

Comment by fluidcruft 1 day ago

My impression is people build very fragile Rube Goldberg devices on top of the models and those things break. That's not to say anything is wrong about the Rube Goldberg machines! They're very interesting and do new things (and they help me understand how things work). I'm just saying that there's probably a significant misattribution about where the fragility exists.

https://xkcd.com/1172/

Comment by Rikudou 1 day ago

I wish one day to be so brave to let a tool I clearly don't understand* ssh to a production server with root access**.

* calling it a god-level programmer kinda gave it away they have no idea what's actually going on

** to restart docker containers you either have to be root or part of the docker group which effectively gives you root privileges

Comment by lostmsu 5 hours ago

The consensus is that nobody should have root SSH to a production server.

Comment by fluidcruft 1 day ago

I sort of agree with this about cognitive load. I'm somewhat new (started dipping my toes around July) but use Claude code heavily now. I did spend a lot of time playing with configuring it at first and creating agents etc. But I have a weird setup where I have three computers that I work on and at one point I realized vanilla Claude Code had adopted the things I was doing as defaults (and improved on them). So I have sort of declared a configuration bankruptcy and just use the recommended things. The only things I still do are things that help both Claude and I keep track of things (md files describing decisions and context of files).

[I still haven't figured out MCP or how/why to use them or why to bother. You run servers. I guess. It's too complex for my smol brain to understand]

Comment by nip 1 day ago

> I still haven't figured out MCP or how/why to use them or why to bother. You run servers. I guess. It's too complex for my smol brain to understand

I know this is self-deprecating humor, but you do NOT have a smol brain: MCP servers are not as needed anymore now that Claude Code supports "Skills". They are also very token hungry as their spec is not lazy-loaded like the skills.

It was / and still is very useful if you collaborate with other engineers or want to perform operations in a non-stochastic fashion.

MCP servers are a way to expose a set of APIs (openAPI spec) to an LLM to perform the listed operations in a deterministic fashion (including adding some auditing, logging, etc). LLMs are fine-tuned for tool calling, so they do it really well and consistently.

Examples:

- Documentation / Glossary: MCP server that runs somewhere that reads a specific MD file or database that is periodically updated: think "what are my team members working on" / "let me look up what this means in our internal wiki".

- Gating operations behind authentication: a MCP server that is connected to your back office and allows you to upgrade a customer's plan, list existing customers, surface current. Super useful if you're a solo-founder for example.
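
To make the "expose a fixed set of operations, with auditing" idea above concrete, here is a minimal sketch in plain Python. It is not the MCP SDK and does not follow the real wire protocol; `tool`, `list_tools`, `call_tool`, the customer data, and the audit log are all assumed names invented for illustration. The model would only ever see the output of `list_tools()` and could only act through `call_tool()`.

```python
import json
from typing import Any, Callable

TOOLS: dict[str, dict[str, Any]] = {}   # registry of exposed operations
AUDIT_LOG: list[str] = []               # one JSON line per executed call
CUSTOMERS = {"c1": "free", "c2": "pro"}  # made-up back-office data


def tool(name: str, description: str, params: dict[str, str]):
    """Register a function as a named tool (hypothetical decorator)."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {"description": description, "params": params, "fn": fn}
        return fn
    return register


@tool("list_customers", "List customer ids and their plans", {})
def list_customers() -> dict[str, str]:
    return dict(CUSTOMERS)


@tool("upgrade_plan", "Move a customer to a new plan",
      {"customer_id": "string", "plan": "string"})
def upgrade_plan(customer_id: str, plan: str) -> str:
    if customer_id not in CUSTOMERS:
        raise KeyError(f"unknown customer {customer_id}")
    CUSTOMERS[customer_id] = plan
    return f"{customer_id} -> {plan}"


def list_tools() -> list[dict[str, Any]]:
    """What the model is shown: names, descriptions, parameters, no code."""
    return [{"name": n, "description": t["description"], "params": t["params"]}
            for n, t in TOOLS.items()]


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Validate, log, and run a tool call requested by the model."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool {name}")
    spec = TOOLS[name]
    if set(arguments) != set(spec["params"]):
        raise ValueError(f"bad arguments for {name}")
    AUDIT_LOG.append(json.dumps({"tool": name, "arguments": arguments},
                                sort_keys=True))
    return spec["fn"](**arguments)
```

A call such as `call_tool("upgrade_plan", {"customer_id": "c1", "plan": "pro"})` either runs the registered function or raises; the model cannot improvise an operation that is not in the registry, and every executed call leaves a log line.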

Comment by egamirorrim 1 day ago

It's more like you want Claude to interact with X, and you go to see if there's an MCP server for it.

Claude could use the API directly but most MCP now comes with OAuth so you can let it act as you, in case API keys are hard to come by or chargeable. Sometimes with a good skill or a preconfigured CLI tool skills can be just as good if not far more powerful than an MCP server.

But the trigger you'd look for to decide to use an MCP is 'i wish Claude could access X'. My top examples:

- pulling designs from figma to implement them

- fetching ticket context for a job from JIRA

- getting a stack trace to investigate from Sentry

Comment by esafak 1 day ago

If you haven't grokked MCP yet don't bother now; it's on the way out. Instead do learn to write an AGENT.md file (create a CLAUDE.md file to point to it) then list all the tools you have at its disposal. It will probably know how to use them; it just needs to be told what's available.

Comment by lnbharath 1 day ago

I definitely relate with your sentiment and I like your term "configuration bankruptcy"

on MCP, the mental model that clicked for me is "giving claude access to tools it can call" so that instead of copy pasting from your database or API, claude can just... query it

playwright MCP for me is godsend

Comment by tanmay001 1 hour ago

Nice way to put it. Skills feel great for shaping how Claude works inside a repo, while MCP really shines when you want it to talk to “live” systems : databases, test runs, CI, all that external state.

Comment by Lx1oG-AWb6h_ZG0 1 day ago

I thought skills were supposed to help with “giving claude access to tools it can call”. When would one use MCP over skills?

Comment by lnbharath 1 day ago

skills are basically markdown files that teach claude how to do something. they live in your repo and load on demand.

MCP is for when you need claude to actually interact with external systems like querying a database, hitting an API, etc...
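
The "load on demand" part can be sketched roughly as follows. This is a simplified stand-in, not Claude Code's actual loader: the skill names, their contents, and the `skill_index` / `load_skill` helpers are assumptions for illustration, and real skills would be files on disk rather than an in-memory dict. The idea is that only a one-line summary per skill is always in context, and the full instructions are read only when a task needs them.

```python
# Hypothetical skills, keyed by name. The first line of each is its summary.
SKILLS = {
    "release-notes": (
        "Summarize merged changes into release notes.\n"
        "\n"
        "1. Collect the titles of merged pull requests.\n"
        "2. Group them under Added, Changed, and Fixed.\n"
        "3. Write one plain sentence per entry.\n"
    ),
    "db-migration": (
        "Write a reversible database migration.\n"
        "\n"
        "1. Add the forward change.\n"
        "2. Add the matching rollback.\n"
        "3. Run both against a scratch database.\n"
    ),
}


def skill_index() -> str:
    """Short listing that is always in context: one line per skill."""
    lines = []
    for name, body in SKILLS.items():
        summary = body.splitlines()[0]
        lines.append(f"{name}: {summary}")
    return "\n".join(lines)


def load_skill(name: str) -> str:
    """Full instructions, fetched only when the task calls for them."""
    return SKILLS[name]


if __name__ == "__main__":
    index = skill_index()
    total = sum(len(body) for body in SKILLS.values())
    print(index)
    print(f"index: {len(index)} chars, all skill bodies: {total} chars")
```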

Comment by tstrimple 1 day ago

I've not explicitly used skills or MCP, but have had zero issues with Claude calling apis via curl as an example. I'm not sure what the MCP server or skill is actually enabling at this point. If I wanted CC to talk to SQL Server, I'd have it open a nix-env with the tools needed to talk to the database. One of my primary initial claude.md entries has to do with us running on NixOS and that temporarily installing tools is trivial and it should do things in the NixOS way whenever possible. Since then it has just worked with practically everything I've thrown at it. Very rarely do I see it trying to use a tool that isn't installed anymore. CC even uses my local vaultwarden where I have a collection of credentials shared with it. All driven through claude.md.

Comment by SatvikBeri 1 day ago

For what it's worth, I've been using a fairly minimal setup (24 lines of CLAUDE.md, no MCPs, skills, or custom slash commands) since 3.7 and I've only noticed Claude Code getting significantly better on each model release.

Comment by fuckinpuppers 8 hours ago

Share it!

Comment by skydhash 1 day ago

> With Opus 4.5, Claude Code feels like having a god-level engineer beside you. Opinionated but friendly. Zero ego.

> Claude was halfway through refactoring a complex auth flow[...] Then I realized: I'd forgotten to mention that one of those files was also used by a cron job.

That is the kind of research you do before you go to refactoring.

> Claude Code freed them from "the anxiety of the first step in programming constantly."

Is there a first step in programming? If there is, that would be thinking because you ought to get a good solution in mind before even typing the first line of code.

...

The whole article feels like someone roleplaying as a software developer. Not that there's a barrier or a license to be one, but the whole piece seems about as accurate as the portrayal of hackers in movies.

Comment by konart 1 day ago

>Is there a first step in programming? If there is, that would be thinking

I'd argue that a first step (regardless of field) is either a necessity or curiosity.

Thinking comes later. (And to be honest I can't really think of it as a "step", simply because this is a process).

Comment by gedy 1 day ago

I use Claude and I'm not crapping on it or the boosters, but most of the people I've encountered in real life who really gush about it are semi-technical or struggle in some way when coding: the non-technical CEO of a small startup I was at, a manager who didn't go to college, etc.

It's a cool tool! Just am tired of being treated like anti AI because I don't outsource my brain to it, or gush over their DIY UI demos.

Comment by skydhash 1 day ago

Yeah it’s just a LLM, able to do text transformations based on what latent factors it has captured in the training stage. But programming is not text manipulation, just like writing is not only putting words on paper and music is not merely producing sounds.

Anyone can sit before a piano and start hitting keys. And someone can rig up some apparatus that highlights which keys to hit at precise times in order to play Moonlight Sonata. But no one will call those people pianists. Sure, a good pianist can use such an apparatus to learn to play a piece. But he may also not buy such a thing and just use sheet music.

Comment by lnbharath 1 day ago

Author here. I wrote this because everyone is talking about Claude Code right now and it's all over my timeline. Claude Code has this effect where you KNOW it's good but can't quite say WHY.

So I spent the weekend digging into the DX decisions that make Claude Code delightful.

Comment by MasterScrat 1 day ago

How much AI did you use to write up this article? It tripped up my "fake AI-written article" detector a few times despite being interesting enough to read to the end

Comment by geophph 1 day ago

“Here’s the thing” “The best part?”

Comment by bentcorner 1 day ago

"It's not just X, it's Y"

I find it really hard to read articles that use AI slop aphorisms. Please use your own words, it matters.

Comment by Rikudou 1 day ago

What if I no good in English?

Jokes aside, my English is passable and I'm fine with it when writing comments but I'm very aware that some of it doesn't sound native due to me, well, not being native speaker.

I use AI to make it sound more fluent when writing for my blog.

Comment by mcphage 1 day ago

> What if I no good in English?

It would still sound more human coming from you.

Comment by exe34 1 day ago

As long as your bullet points+prompt are shorter than the output, couldn't you post that instead? The only time I think an LLM might be ethically acceptable for something a human has to read is if you ask it to make it shorter.

Comment by Rikudou 1 day ago

I write the full article in my Czenglish (English influenced by Czech sentence structure). Then I let it rewrite it in proper English.

So it's me doing the writing and GPT making it sound more English.

Comment by geophph 1 day ago

Yeah it’s hard to keep interest when there’s no voice, just the same AI feel that you see everywhere else.

Comment by fragmede 1 day ago

Well, actually, what if my own words make me come across as a raging pedantic asshole, you feckless moron!? I don't actually think you're a feckless moron, but sometimes I'll get emotional about this or that, and run my words through an LLM to reword it so that "it's not assholey, it's nice". I may know better than to use the phrase "well actually" seriously these days, but when the point is effective communication, yeah I don't want my readers to be put off by AI-isms, but I also don't want them to get put off by my words being assholey or condescending or too snarky or smug or any number of things that detract from my point. And fwiw, I didn't run this comment through an LLM.

Comment by lnbharath 1 day ago

used claude to polish the draft and tighten sentences. the thinking, analysis, and examples are all mine and based on personal experiences. spent the weekend reflecting on my past experiences with claude code and actually digging into why claude code feels the way it does. curious to know what tripped your detector.

Comment by ler_ 1 day ago

Adding to this: too many negatives before making a point, which AI text is prone to do in order to give surface level emphasis to random points in an argument. For example: "I sat there for a second. It didn't lose the thread. It didn't panic. It prioritized like a real engineer would." Then there is the fact that the paragraph ends in just about the same way, which also activates one's AI-voice-detector, so to speak: "This wasn't autocomplete. This was collaboration."

In my opinion, to write is to think. And to write is also to express oneself, not only to create a "communication object," let's put it that way. I would rather read an imperfect human voice than a machine's attempts to fix it. I think it's worth facing the frustration that comes with writing, because the end goal of refining your own argument and your delivery is that much sweeter. Let your human voice shine through.

Comment by verall 1 day ago

Lots of things - typical llm em-dash situations although using dash. Lists of 3s after a colon where the 3 items aren't great. Short sentences for "impact" that sounds kind of like a high school essay i.e. "God level engineer...Zero ego."

I cannot at all understand writing an essay and then having an llm "tighten up the sentences" which instead just makes it sound like slop generated from a list of bullets

Comment by amtamt 1 day ago

> I wrote this because everyone is talking about Claude Code right now and it's all over my timeline.

Feels more like peer pressure induced post, than evaluating a tool critically for pros and cons.

> Claude Code has this effect where you KNOW it's good but can't quite say WHY.

Definitely gives the "vibe" of social media's infinite scroll induced dopamine rush.

Overall, this post just seems to be enforcing the idea that "fuzzy understanding of business domain will be enough to get a mature product using some AI, and the AI will somehow magically figure out most non-functional requirements and missing details of business domain". Thing is that figuring out "most non-functional requirements and missing details of business domain" is where most of the blood and sweat goes.

Comment by erichocean 1 day ago

Do you have any sources you've found that document Claude Code's UI in detail? I'm really curious about what they've built, as a UI designer.

Comment by lnbharath 1 day ago

Anthropic has some docs at docs.anthropic.com but honestly most of what I learned came from just using it and poking around. the slash commands have help text built in. shrivu shankar has a good breakdown of the features too if you're looking for a more structured overview

Comment by erichocean 1 day ago

Thanks!

Comment by buster 1 day ago

I've not used Claude Code yet, but why would it be bad if it gains features that people use? Did people ever complain about Photoshop having too many features that demand some cognitive load? Excel? Practically every IDE out there? There is a reason people use those tools instead of a plain text editor or Paint. It's for power users, and people will become power users of AI as well. Some will forever stick to chatgpt and some will use an ever increasing ecosystem of tools.

Comment by ch2026 1 day ago

because devs will have no clue how their systems work, the only ones who do will be LLMs, gatekept behind an ever-increasing cost-per-usage.

Comment by charlesabarnes 1 day ago

It is a very significant consideration for every one of those tools. The introduction of the "ribbon" in Excel was moderately controversial in 2007.

The default tools made available in Photoshop are why it remains on top to this day.

Comment by lnbharath 1 day ago

good question. the difference with AI tools is the interface isn't stable in the same way photoshop or excel is. with traditional software you learn it once and muscle memory carries you. with LLM tools the model itself changes, the optimal prompting style shifts, features interact with model behavior in unpredictable ways. so the cognitive load compounds differently. not saying features are bad, just that the tradeoffs are different

Comment by PaulHoule 1 day ago

I feel the same way about Junie.

I have an "image sorter" that sucks in images from image galleries into tagging system with semantic web capabilities "Character:Mona -> Copyright:Genshin_Impact" and ML capabilities (it learns to tag the same way you do)

Gen 1 of it was cued by a bookmarklet to have a webcrawler pull the gallery HTML and the images. I started having Cloudflare problems so Gen 2 worked by saving complete pages and importing the directories, that had problems so I was looking at a Gen 3 using a bookmarklet to fetch and POST the images out of the browser into the server so I tell Junie my plan and it tells me I'll have CORS trouble and "Would you consider making a browser extension?"

Well I had considered that but was intimidated at the prospect, figured I'd probably have to carve out an uninterrupted weekend to study browser extensions, kick my son out of the house to go busk with his guitar instead of playing upstairs (the emotional/social bit is important) and even then have a high chance of not really getting it done and then end up taking another month to get an uninterrupted weekend. I told Junie "I've got no idea how you do that, don't you need a build system, don't you need to sign it?" and it said "No, you can just make the manifest file and a JS file and do a temporary install" so I say "make it so" and in 20 minutes I have the browser extension.

It still isn't working end-to-end, but I'm now debugging it and ought to be able to get it working in a weekend with interruptions even if I didn't get any more AI help.

Comment by phplovesong 1 day ago

SlopDX.

Comment by ThouYS 1 day ago

what on earth is a DX?

Comment by ycuser2 1 day ago

Developer Experience (as in UX - User Experience)

Comment by royal_ts 1 day ago

Developer Experience

Comment by GiorgioG 1 day ago

If only LLMs didn’t just make shit up regularly.

Comment by ltbarcly3 1 day ago

They both make stuff up and make very obvious mis-interpretations of evidence. If you take the output of an LLM, and ask another LLM to check it, this dramatically reduces this. Even if you do it with the same LLM but without the existing context. I was able to write a detailed analysis of a rule system by doing this with 3 steps, claude -> chatgpt -> gemini3. It caught all the mistakes, including overstatements and vague statements. It wasn't perfect, but even after one review the # of mistakes or stupid statements was almost 0.
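
The review chain described above can be sketched as a short pipeline. This is an illustration under assumptions, not the commenter's actual setup: `review_chain`, the prompt wording, and the two stub models are invented here, and a model is represented as any callable that maps a prompt string to a reply string. Each reviewer is given only the task and the current draft, never the earlier conversation, which mirrors the "without the existing context" point.

```python
from typing import Callable

Model = Callable[[str], str]  # prompt in, reply out


def review_chain(task: str, author: Model, reviewers: list[Model]) -> str:
    """Have one model draft an answer, then pass it through fresh reviewers."""
    draft = author(task)
    for reviewer in reviewers:
        # Fresh prompt each time: no chat history from the author is included.
        prompt = (
            f"Task:\n{task}\n\n"
            f"Draft:\n{draft}\n\n"
            "Correct any mistakes, overstatements, or vague statements "
            "and return the revised draft only."
        )
        draft = reviewer(prompt)
    return draft


def stub_author(prompt: str) -> str:
    """Stand-in for a real model call; deliberately overstates."""
    return "Rule 3 always fires."


def stub_reviewer(prompt: str) -> str:
    """Stand-in reviewer: pulls the draft out of the prompt and softens it."""
    draft = prompt.split("Draft:\n", 1)[1].split("\n\nCorrect", 1)[0]
    return draft.replace("always", "usually")


if __name__ == "__main__":
    print(review_chain("Describe rule 3", stub_author, [stub_reviewer]))
```

In real use the stubs would be replaced by calls to whichever providers are available; the chain itself does not care which model sits behind each callable.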

Comment by erichocean 1 day ago

If a coding agent was released that never made anything up, how much would that change things for you?

Comment by geophph 1 day ago

I’d save a lot of time by not choosing to smugly tell the AI how wrong it was, just for my own reassurance that at least for now I’m still more useful than it is.

Comment by ltbarcly3 1 day ago

"With Opus 4.5, Claude Code feels like having a god-level engineer beside you."

Well, not to me or the people I respect. It's getting very good, but it's like having a recent college grad who obsessively reads documentation. Someone with low skill but very high knowledge, often knowledge they are mixing up or not quite getting right.

I think if Claude is already 'better' at coding than you, maybe think about going back to college to be a lawyer or something. For the rest of us, let's just hope that Claude hits some natural limit before it gets better than us too. If it doesn't hit some limit I think we have a year or two.

Comment by dawnerd 1 day ago

Yet when I try it, it feels like a developer fresh out of a coding bootcamp with no real experience. There’s no real reasoning, problem solving is still brute forced. It still rewrites rather than modifies. The context is way too limiting and it gets lost in its own “thinking”

Comment by coffeefirst 1 day ago

Except I’ve coached a lot of those people and there’s usually a method to their madness. You can sit down, help them back up to where they went off the rails, and it all makes sense.

Robot code is bonkers.

Comment by ltbarcly3 1 day ago

The problem solving is very very not brute force. I have seen it make detailed analysis. Often if a problem stumps me it also stumps Claude, no surprise there. But if I give it a ticket I haven't looked at yet, it is often able to find the exact problem via careful 'reasoning' and fix it in 1/100th the time it would take me.

Comment by salomonk_mur 1 day ago

I had it one-shot the full architecture for a fairly advanced distributed system for a client. It then one shot the actual code design (following absolutely all of our internal requirements on auth, stack to use, security, code styling, documentation, etc). It then one shot (and we code reviewed everything thoroughly) each of the 5 micro services needed.

It one shot the infrastructure to use and created the terraform file to put it up anywhere. It deployed it.

It caught some of the errors it had made by itself after load-testing, and corrected them. It created the load test itself (following patterns from previous projects we had).

It did all of this in a week. With human supervision on each step, but in a fucking week. We gave it all the context it needed and one-shotted everything.

It is more than god-level. If you are not getting these increases in productivity, you are using it wrong.

Comment by exe34 1 day ago

Hey would you be willing to share your claude.md? I'm only starting out with AI coders, and while it often makes good choices for straightforward things, I find the token usage gets bigger and bigger as it proceeds down a list of requirements - my working hypothesis is that it's having to re-read everything as the project gets more complicated and doesn't have a concept of "this is where I go to kick it for this kind of thing".

Comment by ltbarcly3 1 day ago

Lol ok dude, good luck with your 'I just resell the output of Claude and I can't tell when it makes mistakes' business model. I'm sure it is a long term valid economic niche.

Comment by salomonk_mur 1 day ago

Before, I resold the output of engineers.

Now, I resell the output of AI supervised by engineers.

We can tell when it makes mistakes. It used to make a ton. Now, with the right context, it really makes very few mistakes (which it can find itself and fix itself)

Comment by tolerance 1 day ago

We are witnessing the emasculation of software developers.

Comment by 01HNNWZ0MV43FF 1 day ago

Thesis: LLMs are a powerful tool

Antithesis: LLMs are emasculating

Synthesis: Programmers should be feminized

Comment by tmoravec 1 day ago

> In a monorepo, just loading the project consumes ~20k tokens

I don't work on a monorepo, and as an example, what I would consider a mid-size service in my mid-size company is 7M tokens.

I can't but ask: do all people who are so enthusiastic about AI for coding only work on trivial projects?

Comment by SatvikBeri 1 day ago

I'm pretty enthusiastic about LLMs and use them on my 8 year old codebase with ~500kloc. I work at a hedge fund and can trace most of my work to dollars.

Comment by peacebeard 1 day ago

I dunno if I’d call myself “enthusiastic” but I successfully use AI on a large production monorepo. The onus is on the user to break down the problem into llm-sized bites. How to do this effectively is a skill that takes time to develop. You’re not crazy: if you go in and ask it to do things in broad strokes it won’t work.

Comment by raincole 1 day ago

I'm quite sure "loading the project" isn't putting every single line of code into the context. Probably just a huge CLAUDE.md or something.

Either that or this author is completely out of touch with reality.

Comment by lnbharath 1 day ago

should have been clearer here. by "loading the project" I meant the initial context claude builds like CLAUDE.md, directory structure, etc... not literally putting every line of code into context. 7M tokens would obviously not fit in a 200k window
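
The arithmetic behind this exchange, using only the three figures quoted in the thread (a 200k-token window, roughly 20k tokens of startup context, and a 7M-token service), works out as below. The function names are made up for the sketch, and it ignores everything else that competes for the window, such as the conversation itself and tool output.

```python
import math

CONTEXT_WINDOW = 200_000  # tokens, window size quoted in the thread
STARTUP_COST = 20_000     # tokens, the article's "loading the project" figure
CODEBASE = 7_000_000      # tokens, the mid-size service mentioned above


def remaining_after_startup(window: int, startup: int) -> int:
    """Tokens left for actual work once the startup context is loaded."""
    return window - startup


def share_visible(codebase: int, budget: int) -> float:
    """Fraction of the codebase that could be in context at one time."""
    return budget / codebase


def windows_needed(codebase: int, budget: int) -> int:
    """How many full contexts it would take to read the codebase once."""
    return math.ceil(codebase / budget)


if __name__ == "__main__":
    budget = remaining_after_startup(CONTEXT_WINDOW, STARTUP_COST)
    print(f"budget after startup: {budget} tokens")
    print(f"share visible at once: {share_visible(CODEBASE, budget):.1%}")
    print(f"windows to read it all: {windows_needed(CODEBASE, budget)}")
```

Under these numbers only a few percent of the service is ever in view at once, which is consistent with both sides of the thread: the startup cost is independent of project size, and a large project still has to be fed in task-sized pieces.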

Comment by pjm331 1 day ago

I’m not clear what “just loading the project” even means here - if that’s how many tokens are consumed by system prompt plus Claude.md and MCP tools well that has nothing to do with the size of the project

Comment by AaronAPU 1 day ago

I think the agent mode stuff only works well on trivial projects. But the top tier models can be very productive with carefully constructed prompts and manually curated contexts for large mono repos.

Comment by ltbarcly3 1 day ago

You obviously don't have any idea how any of this actually works.

Comment by tmoravec 1 day ago

What do you think "loading the project" means when discussing context?