Codex for almost everything
Posted by mikeevans 21 hours ago
Comments
Comment by cjbarber 21 hours ago
i.e. agents for knowledge workers who are not software engineers
A few thoughts and questions:
1. I expect that this set of products will be extremely disruptive to many software businesses. It's like when a new VP joins a company, they often rip and replace some of the software vendors with their personal favorites. Well, most software was designed for human users. Now, people's agents will use software for them. Agents have different needs for software than humans do. Some they'll need more of, much they'll no longer need at all. What will this result in? It feels like a much swifter and more significant version of Google taking excerpts/summaries from webpages, putting them at the top of search results, and taking away visits and ad revenue from sites.
2. I've tried dozens of products in this space. For most, onboarding is confusing, then the user gets dropped into a blank space, usage limits are uncompetitive compared to the subsidized tokens offered by OpenAI/Anthropic, etc. It's a tough space to compete in, but also clearly going to be a massive market. I'm expecting big investment from Microsoft, Google etc in this segment.
3. How will startups in this space compete against labs who can train models to fit their products?
4. Eventually will the UI/interface be generated/personalized for the user, by the model? Presumably. Harnesses get eaten by model-generated harnesses?
A few more thoughts collected here: https://chrisbarber.co/professional-agents/
Products I've tried: ai browsers like dia, comet, claude for chrome, atlas, and dex; claw products like openclaw, kimi claw, klaus, viktor, duet, atris; automation things like tasklet and lindy; code agents like devin, claude code, cursor, codex; desktop automation tools like vercept, nox, liminary, logical, and raycast; and email products like shortwave, cora and jace. And of course, Claude Cowork, Codex cli and app, and Claude Code cli and app.
Edit: Notes on trying the new Codex update
1. The permissions workflow is very slick
2. Background browser testing is nice and the shadow cursor is an interesting UI element. It did do some things in the foreground for me / take control of focus, a few times, though.
3. It would be nice if the apps had quick ways to demo their new features. My workflow was to ask an LLM to read the update page, ask it what new things I could test, and then ask Codex to demo those things to me, but it doesn't quite understand its own new features well enough to invoke them (without quite a bit of steering)
4. I cannot get it to show me the in app browser
5. Generating image mockups of websites and then building them is nice
Comment by postalcoder 20 hours ago
For all the benefits that agents offer, they can be asymmetrically harmful. This is not a solved issue. That hurts growth. I don't disagree with your general points, though.
Comment by avaer 20 hours ago
At this point it's a foregone conclusion this is what users will choose. It'll be like (lack of) privacy on the internet caused by the ad industrial complex, but much worse and much more invasive.
The threats are real, but it's just a product opportunity to these companies. OpenAI and friends will sell the poison (insecure computing) and the antidote (Mythos et al.) and eat from both ends.
Anyone trying to stay safe will be on the gradient to a Stallmanesque monastic computing existence.
I don't want this, I just think it's going down that route.
Comment by hypfer 8 hours ago
Honestly, it's alright.
Just think of what we could do with computers up until this point. We keep all those abilities.
And more, even, because the industry still keeps churning out new local LLMs. So you even gain more capabilities than right now. Just not at the rate of the bleeding edge.
Which is just like the Linux desktop, essentially. It's fine, really. There is no need to consume the bleeding edge. You will be fine.
Comment by Forgeties79 1 hour ago
Comment by multjoy 7 hours ago
As a proud neo-luddite, I'm watching the AI hype with grim amusement and I'll tell you hwhat, it doesn't look like a good time. Even putting to one side the planetary scale economic crash that is incoming, all the hypers seem to be on some sort of treadmill that is out of their control and it simply doesn't look like fun.
Comment by petesergeant 4 hours ago
Comment by multjoy 1 hour ago
Comment by Forgeties79 2 hours ago
We - including the companies - don’t know what the real “billion dollar application” of them is other than the unproven claim that they make everyone more productive in some general sense. When it doesn’t work, people continue to say “it’s your fault, not the tool’s.” Meanwhile investors are getting skittish and not one AI company is profitable yet. Companies that laid people off for LLMs are regretting their decisions, leadership (and educators) are dealing with unvetted writing and having to waste their time cleaning it up, the list goes on. “Slop” is still a huge and growing problem.
LLMs are here to stay, but IMO they’ll be more relevant in the long run than 3D printers yet less revolutionary than the internet. Everyone will touch them at various points, but this whole-life, every-industry-disrupted integration still seems far-fetched to me. Pricing is still a huge unsolved problem - everyone is still subsidized, and despite gains in using fewer resources, it’s still too costly to run these locally, even small models (not even getting into the tooling and knowledge required to use them in a productive way).
When we zoom out and look at the whole picture, LLMs have mostly made everyone’s online experience worse, while the VC-funded companies behind them play municipal and state governments for suckers a la Amazon getting so many cities to trip over each other giving away land and tax breaks, but far worse. Those are the biggest contributions so far, aside from anecdotes from coders about “1000x productivity.” Again, I think they’re here to stay. But it’s called “AI hype” for a reason.
LLMs have mostly been a problem creator IME rather than a “disruptor.” I’ve never really seen “revolutionary technology” quite like it.
But hey, I’ll admit it’s useful to have a meh local model when I’m writing TTRPG stuff and have writer’s block. Though then I remember how it was trained, a whole other subject I haven’t even touched, so that kind of sucks too.
Comment by Springtime 9 hours ago
The concerning aspect is that the people whose content gets scanned into these systems have no knowledge of it and gave no consent - private PII/files/code/emails/etc. being read and/or accidentally shared online by the agent.
Comment by safety1st 8 hours ago
The model will get full access to your data, but in the name of security, you will only be permitted to have data that is cloud-hosted; local storage will effectively just be cache.
The era of the general computer will end, and the products you purchased from these companies will be nonconsensually altered and limited.
I'm so glad I switched to Linux more than a decade ago. At least on the PC there will still be an open source ecosystem for a long time to come, it may have less features but I'm willing to accept that.
Knowing that they can change what you bought overnight with a single nonconsensual update, think very, very carefully about who you purchase all of your future technology from. Google's upcoming nonconsensual degradation of Android should be a lesson for everybody.
Comment by WarmWash 1 hour ago
Google is almost certainly doing this because iOS was not found to be a monopoly, while Android was. It came up in Google's appeal of the Epic case verdict, where they directly asked the judge about it. Turns out you can't be anti-competitive if you don't allow any competitors.
Comment by daveguy 57 minutes ago
Comment by shevy-java 5 hours ago
Wait until age verification is mandatory everywhere. :)
I can already see that happening: e.g. to access financial transactions or government apps, one needs to verify their ID, and that will not work without age verification that cannot be tampered with. So Linux will either submit to the same or be excluded.
(It will also remain true that free developers will be able to run Linux fine for much longer, but I guess they only care about catching the 95%, not the 5% of Linux users ... and 5% is a high guesstimate.)
Edit: To clarify the above, one already had to provide personal data for financial transactions, of course, so a bank knows who is who, but the recent age-verification push goes hand in hand with the attempt to get rid of VPNs, and applications now make it a new standard to query the age of users, with the claim of "helping protect kids". And some people buy into that rationale too. I don't, but I have seen many non-tech-savvy people submit to that justification.
Comment by elictronic 7 hours ago
Comment by intended 20 hours ago
I think most people are going to say they don't want it. I mean, why would anyone want a tool that can screw up their bank account? What benefit does it gain them?
There are lots of cases of great, highly useful LLM tools, but the moment they scale up you get slammed by the risks that stick out all along the long tail of outcomes.
Comment by ryandrake 20 hours ago
On the other hand, entrepreneurs and managers are going to want it for their employees (and force it on them) for the above reason.
Comment by TeMPOraL 7 hours ago
Of course, such a situation is only temporary - if I can suddenly be 10X as productive, then so can everyone else, and then the baseline shifts so 10X is the new 1X.
Comment by jbstack 6 hours ago
Comment by LinXitoW 59 seconds ago
Comment by TeMPOraL 5 hours ago
If they have any surplus of money (or loans) they'll try, so those 9 employees may end up becoming team leads or middle management, trying to start new initiatives to get the 10x expansion (and 100x improvement).
The market isn't anywhere near efficient enough to directly translate productivity improvements into labor reductions. Thankfully, because everything that's nice and hopeful and human lives within the market inefficiency; a fully efficient market would be a hell worse than any writer or preacher ever imagined.
Comment by sikewj 4 hours ago
I’ve seen a number of your posts where you talk about topics you clearly are not all that well versed in, with such confidence when you’re plain wrong.
Comment by TeMPOraL 2 hours ago
> I’ve seen a number of your posts where you talk about topics you clearly are not all that well versed in, with such confidence when you’re plain wrong.
I'm sure it's true. However, since you brought it up, can you be more specific and name three?
Comment by hvb2 8 hours ago
Those are productivity increases that got our standard of living to where it is. Fewer people doing the same amount of work has, historically speaking, freed people from their current job, allowing them to work on something else.
It's that analogy of the horse: they used to be farm animals. Now fewer of them are 'employed', but those that are have much nicer jobs. I'm not sure the same will be true for us this time around though, as the new jobs being created have increasingly been highly skilled, which means the majority can't apply.
Comment by drivebyhooting 8 hours ago
Comment by yes_man 8 hours ago
Of course in reality in the short term what happens is companies lay off people to increase margins. Times will be tough for workers, and equity keeps gravitating towards those who already had it.
Comment by King-Aaron 6 hours ago
If you remove the effort from those tasks, they will have no value.
10x the value of 0 is 0
Comment by intended 5 hours ago
Comment by vovavili 5 hours ago
Given sane working arrangements, or at minimum the presence of remote work, it would be a bit shortsighted not to want to get your work done in a tenth of the time. At the very least, you're competing for a promotion against less effective people, all while having more time for yourself. If not, you're building a labor-market skillset in an efficient way so you can hop to a better employer.
Comment by procaryote 7 hours ago
I couldn't imagine thinking "I'm gonna do this 0.1x as fast as I could, wasting my life away with pointless extra work, to spite my employer"
Comment by retinaros 20 hours ago
Comment by rurban 9 hours ago
Comment by soraminazuki 5 hours ago
Comment by cjbarber 20 hours ago
Strongly agreed.
I saw a few people running these things with looser permissions than I do. e.g. one non-technical friend using claude cli, no sandbox, so I set them up with a sandbox etc.
And the people who were using Cowork already were mostly blind approving all requests without reading what it was asking.
The more powerful, the more dangerous, and vice versa.
Comment by TeMPOraL 4 hours ago
People have different levels of safety-consciousness, but also different tolerances and threat models.
For example, I would hesitate to run a Mythos-level model in YOLO mode with full control over my computer, but right now, for personal stuff, even figuring out WTF sandboxes are in Claude Code / Gemini CLI, much less setting them up, is too much hassle. What's the worst it can do without me noticing? Format the drive and upload some private data to pastebin? Much as I hate the cloud and the proliferation of 2FA in every service, that alone means it can't actually do more to me than waste a few hours of my life as I reimage my desktop and restore OneDrive (in case of destructive changes that got synced up). These models are not yet good enough to empty my bank account in the few minutes I'm not looking; everything else they can do quickly is reversible or inconsequential.
Now, I do look at things closely when working with agentic AI tools. But my threat model is limited to worrying about those few hours of my life. `rm -rf / --no-preserve-root` is an annoyance, not a danger.
(I accept that different contexts give different threat modeling. I would be more worried if I were doing businessy business stuff with all kinds of secret sauces, or was processing PII of my employer's customers, or lived in a country where it's easy to have all your money stolen if your CC number or SSN gets posted online.)
Comment by Anvoker 5 hours ago
Maybe this kind of isolation neuters the benefit you're thinking of, but I do believe some sort of solution could be reached.
Comment by planb 20 hours ago
Comment by IanCal 8 hours ago
Comment by planb 1 minute ago
Comment by postalcoder 20 hours ago
the attack surface is so wide idk where to start.
Comment by planb 19 hours ago
Comment by canarias_mate 20 hours ago
Comment by MrsPeaches 19 hours ago
I’m semi-normie (MechEng with a bit of Matlab, now working as a CEO).
I spend most of my day in Claude code but outputs are word docs, presentations, excel sheets, research etc.
I recently got it to plan a social media campaign and produce a ppt with key messaging and content calendar for the next year, then draft posts in Figma for the first 5 weeks of the campaign and then used a social media aggregator api to download images and schedule in posts.
In two hours I had a decent social media campaign planned and scheduled, something that would have taken 3-4 weeks if I had done it myself by hand.
I’ve vibe coded an interface to run multiple agents at once that have full access via apis and MCPs.
With a daily cron job it goes through my emails and meeting notes, finds tasks, plans execution, executes and then send me a message with a summary of what it has done.
Most knowledge-work output is delivered as code (e.g. XML in Word docs), so it shouldn't be that surprising that it can do all this!
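The "output is code" observation is easy to verify: a .docx is just a zip archive of WordprocessingML XML parts. A minimal sketch using only the Python standard library (the filename and body text here are made up for illustration):

```python
import zipfile

# A .docx is a zip of XML parts; the visible text lives in word/document.xml.
# These three parts are roughly the minimum Word will accept.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml"
    ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

RELS = """<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
    Target="word/document.xml"/>
</Relationships>"""

DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>
</w:document>"""

def write_docx(path: str, text: str) -> None:
    """Write a one-paragraph Word document from scratch, no Office libraries."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", CONTENT_TYPES)
        z.writestr("_rels/.rels", RELS)
        z.writestr("word/document.xml", DOCUMENT.format(text=text))

write_docx("hello.docx", "Drafted by an agent, delivered as XML.")
```

Since the artifact is plain text inside a zip, an LLM that can emit XML can emit a Word document.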
Comment by nonameiguess 17 hours ago
If you can figure out the next step and say "Claude, go find me buyers and sell shit for me without using any pre-existing software," have at it. It can't be social media, I guess, since social media is software and Claude is supposed to get rid of software.
At a certain point, why do we even need computers? Can't we just call Claude's hotline and ask "Claude, please find a way to dump $40 million in cash into my living room. Don't put it in my bank account because banks use software."
Comment by elAhmo 7 hours ago
OP gave a good example of how their workflow changed. You could argue there are tools that could've done that, but they managed to achieve their goals without them, have something that fits their workflow perfectly, can fine-tune it when things change, and with a few other tools (Word, Excel, Figma) they can do all sorts of things that would've required a small team or far more (expensive) tools to execute.
To me that is a great example of non-developers using tools to enhance their workflows and with initiatives like from this topic, I can only see that increasing.
Comment by TeMPOraL 7 hours ago
It doesn't obviate the need for software, but it greatly devalues software products, as they become reduced to tool calls for LLMs.
This is good for users, because software products are defined by boundaries - borders drawn around the code to focus and package functionality, yes, but also to limit interoperability and create a sales channel (UX being the perfect marketing platform for captive audience).
After all, I don't usually want to play with Word, Excel, PowerPoint, and Figma - they're just standing between me and the artifact I want to create, so if I can get LLM to operate them for me, I don't have to deal with all the UX and marketing bullshit those products throw at me.
I mean, that's what I'd do if I could afford to hire a person to operate those tools for me. That, again, is the best mental model for LLMs - they're little people on a chip, cheaper to employ than actual people.
Comment by aerhardt 19 hours ago
Comment by bob1029 21 hours ago
I agree this is going to be big. I threw a prototype of a domain-specific agent into the proverbial hornets' nest recently and it has altered the narrative about what might be possible.
The part that makes this powerful is that the LLM is the ultimate UI/UX. You don't need to spend much time developing user interfaces and testing them against customers. Everyone understands the affordances around something that looks like iMessage or WhatsApp. UI/UX development is often the most expensive part of software engineering. Figuring out how to intercept, normalize, and expose the domain data is where all of the magic happens, and that part is usually trivial by comparison. If most of the business lives in SQL databases, your job is basically done for you: a tool to list the databases and another tool to execute queries against them. That's it.
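The two-tool surface just described can be sketched in a few lines. This is a minimal illustration, not a production design: the tool names and sample schema are invented, and a real deployment would enforce read-only access with database credentials rather than a string check.

```python
import sqlite3

# Expose a database to an agent through just two tools:
# one to discover the schema, one to run queries the model writes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers (name) VALUES ('Acme'), ('Globex');
""")

def tool_list_tables() -> list:
    """Tool 1: let the model see what data exists."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return [r[0] for r in rows]

def tool_run_query(sql: str) -> list:
    """Tool 2: execute a query the model wrote (naively gated to SELECTs)."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("read-only tool: SELECT statements only")
    return conn.execute(sql).fetchall()

print(tool_list_tables())   # ['customers']
print(tool_run_query("SELECT name FROM customers ORDER BY name"))
```

The model supplies the SQL; the harness only mediates access, which is why "your job is basically done for you" once the data already lives in a database.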
I think there is an emerging B2B/SaaS market here. There are businesses that want bespoke AI tools and don't have the discipline to deploy them in-house. I don't know if it is ever possible for OAI & friends to develop a "hyper" agent that can produce good outcomes here automatically. There are often people problems that make connecting the data sources tricky. Having a human consultant come in and make a case for why they need access to everything is probably more persuasive and likely to succeed.
Comment by duskdozer 2 hours ago
Seems pretty questionable to me. Describing things in natural language can be quite imprecise and verbose.
Comment by cjbarber 20 hours ago
Sort of agreed, though I wonder if ai-deployed software eats most use cases, and human consultants for integration/deployment are more for the more niche or hard to reach ones.
Comment by skydhash 20 hours ago
I strongly doubt that. That’s like saying conversation is the ultimate way to convey information, yet almost every human process has been moved to forms and structured reports. But we have decided that simple tools do not sell as well, and we are trying to make workflows as complex as possible. LLMs are more like the ultimate tools for making things inefficient.
Comment by Moonye666 9 hours ago
Comment by frez1 4 hours ago
a version of Conway's law aimed specifically at agentic communication rather than human.
Comment by trvz 21 hours ago
Comment by louiereederson 20 hours ago
Comment by bob1029 6 hours ago
I think something like SQL w/ row-level security might be the answer to the problem. You often want to constrain how the model can touch the data based upon current tool use or conversation context. Not just globally. If an agent provides a tenant id as a required parameter to a tool call, we can include this in that specific sql session and the server will guarantee all rules are followed accordingly. This works for pretty much anything. Not just tenant ids.
SQL can work as a bidirectional interface while also enforcing complex connection level policies. I would go out of band on a few things like CRUD around raw files on disk, but these are still synchronized with the sql store and constrained by what it will allow.
The safety of this is difficult to argue with compared to raw shell access. The hard part is normalizing the data and setting up adapters to load & extract as needed.
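The tenant-id idea above can be made concrete with a small sketch. SQLite (used here so the example is self-contained) has no row-level security, so this version emulates the policy at the tool boundary by routing every query through a tenant-scoped view; in Postgres the same guarantee can live in the server itself via `CREATE POLICY ... USING (tenant_id = current_setting('app.tenant_id')::int)`. All names below are hypothetical.

```python
import sqlite3

# Multi-tenant table; the agent must never see another tenant's rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id INTEGER, amount REAL);
    INSERT INTO invoices (tenant_id, amount) VALUES (1, 100.0), (1, 250.0), (2, 999.0);
""")

def tool_query_invoices(sql: str, tenant_id: int) -> list:
    """Tool call requires a tenant_id; the query can only reference a view
    that is already filtered to that tenant, mimicking row-level security."""
    conn.execute("DROP VIEW IF EXISTS scoped_invoices")
    conn.execute(
        f"CREATE TEMP VIEW scoped_invoices AS "
        f"SELECT id, amount FROM invoices WHERE tenant_id = {int(tenant_id)}")
    return conn.execute(sql).fetchall()

print(tool_query_invoices("SELECT SUM(amount) FROM scoped_invoices", tenant_id=1))
# tenant 1 sees 100.0 + 250.0; tenant 2's 999.0 row is invisible to this call
```

The point of pushing this into the database (as with real Postgres RLS) rather than the wrapper is that even a model-written query that tries to escape the view still hits the server-side policy.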
Comment by cjbarber 20 hours ago
What would make it not be a monolith? To me it seems like there'll be a big advantage (e.g. in distribution, user understanding) for most people to be using the same product / similar interface. And then the agent and the developer of that interface figure out all the integrations under that, invisible to the user.
Comment by louiereederson 18 hours ago
Comment by cjbarber 17 hours ago
Comment by eldenring 21 hours ago
Comment by cjbarber 21 hours ago
Comment by eldenring 20 hours ago
An example here is in engineering. Building a simulator for some process makes computing it much safer and consistent vs. having people redo the calculations themselves, even with AI assistance.
Comment by cjbarber 20 hours ago
Comment by visarga 19 hours ago
Comment by intended 20 hours ago
I disagree. There is a major gap between awesome tech and market uptake.
At this point, the question is whether LLMs are going to be more useful than Excel. AI enthusiasts are 100% sure they already are, but on the ground, non-technical users do not share that view.
All the interviews and real life interactions I have seen, indicate that a narrow band of non-technical experts gain durable benefits from AI.
GenAI is incredible for project starts. A 0 coding experience relative went from mockup to MVP webapp in 3 days, for something he just had an idea about.
GenAI is NOT great for what comes after a non-technical MVP. That webapp had enough issues that, if used at scale, would guarantee litigation.
Mileage varies entirely on whether the person building the tool has sufficient domain expertise to navigate the forest they find themselves in.
Experts constantly decide trade-offs that novices don't even realize matter. Something as innocuous as the placement of light switches when you enter a room can be made inconvenient.
Comment by piokoch 7 hours ago
That's why LLMs shine in coding tasks. If you move to other fields, like architecture, construction, or investment (there is no AI boom there, why?), where there is not so much source text available, tasks are not as repeatable as in software, or verification is much more complicated, then LLMs are no longer that useful.
In software, I also believe we will soon see that the competitive advantage belongs not to those who adopted LLMs, but to those who did not. If you ask an LLM what framework/language/approach to use for a given task, then, contrary to what people think, the LLM is not "thinking"; it just generates a text answer based on what it was trained on. So you will get the same most popular frameworks/languages/approaches suggested again and again, even if there is something better that is not yet popular enough to have made it into the model weights in a significant way.
Interesting times, anyway.
Comment by jampekka 6 hours ago
I don't think they are much more prone to suggesting only the same popular frameworks, especially if you ask them to weigh the options.
Comment by andoando 19 hours ago
Even all the websites, desktop/mobile apps will become obsolete.
Comment by donnisnoni 6 hours ago
Comment by troupo 21 hours ago
They won't.
Non-technical users expect a CEO's secretary from TV/movies: you do a vague request, the secretary does everything for you. LLMs cannot give you that by their own nature.
> And eventually will the UI/interface be generated/personalized for the user, by the model?
No. Please for the love of god actually go outside and talk to people outside of the tech bubble. People don't want "personalized interfaces that change every second based on the whims of an unknowable black box". They have plenty of that already.
Comment by noelsusman 19 hours ago
For now she was only able to do that because I set up a modified version of my agentic coding setup on her computer and told her to give it a shot for more complex tasks. It won't be trivial, but I do think there's a big opportunity for whoever can translate the experience we're having with agentic coding to a non-technical audience.
Comment by paganel 18 hours ago
More to the point, nobody wants to be more efficient for the sake of being efficient, we all want to go to work, do our metaphorical 9 to 5 without consuming too much (intellectual and not only) energy, and then back home. In that regard AI is seen as an existential threat to that "lifestyle" and it will be treated as such by regular workers.
Comment by w2df 17 hours ago
Comical. Truly comical.
Comment by troupo 18 hours ago
> It ended up requiring a few hundred lines of Python
And she knows those few hundred lines of Python work correctly and give her the correct result because, in this instance, Claude managed to produce a working result. What if it didn't? Would vague knowledge of Python have helped her?
> It won't be trivial, but I do think there's a big opportunity for whoever can translate the experience we're having with agentic coding to a non-technical audience.
Even though I agree with the sentiment, we've tried non-coding coding how many times now? Once every 5 years? Throwing LLMs into the mix won't help much when in the end you leave the end user hanging, debugging problems and hunting for solutions.
Comment by noelsusman 35 minutes ago
You're right that we've been trying and failing to make no-code happen for decades, and yes I genuinely think LLMs are the key to finally making it work.
Comment by zozbot234 18 hours ago
Comment by cjbarber 21 hours ago
What are you using today? In my experience LLMs are already pretty good at this.
> Please for the love of god actually go outside and talk to people outside of the tech bubble.
In the past week I've taught a few non-technical friends, who are well outside the tech bubble, don't live in the SF Bay Area, etc, how to use Cowork. I did this for fun and for curiosity. One takeaway is that people at startups working on these products would benefit from spending more time sitting with and onboarding users - they're very powerful and helpful once people get up and running, but people struggle to get up and running.
> People don't want "personalized interfaces that change every second based on the whims of an unknowable black box". They have plenty of that already.
I obviously agree with this, I think where our view differs is I expect that models will be able to get good at making custom interfaces, and then help the user personalize it to their tasks. I agree that users don't want something that changes all the time. But they do want something that fits them and fits their task. Artifacts on Claude and Canvas on ChatGPT are early versions of this.
Comment by troupo 21 hours ago
LLMS are good at "find me a two week vacation two months from now"?
Or at "do my taxes"?
> how to use Cowork.
Yes, and I taught my mom how to use Apple Books, and have to re-teach her every time Apple breaks the interface.
Ask your non-tech friends what they do with and how they feel about Cowork in a few weeks.
> I think where our view differs is I expect that models will be able to get good at making custom interfaces, and then help the user personalize it to their tasks.
How many users do you see personalizing anything to their task? Why would they want every app to be personalized? There's insane value in consistency across apps and interfaces. How will apps personalize their UIs to every user? By collecting even more copious amounts of user data?
Comment by roel_v 6 hours ago
Of course they are. I gave one a similar prompt a few weeks ago, albeit quite a bit more verbose (actually I just dictated it, train of thought, with a couple of "eh actually, forget what I just said about x, do y instead" moments), and although I wasn't brave enough to give it my credit card and finalize the bookings, it would have paid for the bookings it had set up for me, had I done that. I gave it some real-life constraints, like "we're meeting friends in place xyz at such and such date, make sure we're there then", and it did everything from watching that we wouldn't be spending too many hours driving per day, to checking that hotels are kid-friendly, to things to do and see, and what public holidays there are so that we know when supermarkets close early, plus a bunch of details I wouldn't have thought of. It checked my (and my wife's) calendar, checked what I had going on work-wise, etc.
That is a fully solved 'problem' man. LLMs will run the whole thing for you. Just provide it with the login details to booking websites and you're off to the races.
I did have it upgrade the car, even if that pushed the cost outside the budget I gave it. Next time it'll know LOL.
Comment by suddenlybananas 5 hours ago
So it's not trustworthy enough for you, someone clearly interested in the hype of LLMs.
Comment by roel_v 4 hours ago
Comment by baq 20 hours ago
codex did my taxes this year (well it actually implemented a normalization pipeline and a tax computing engine which then did the taxes, but close enough)
Comment by William_BB 20 hours ago
You can't seriously believe laymen will try to implement their own tax calculators.
Comment by baq 19 hours ago
what I believe is that laymen will put all their tax docs into codex and tell it to 'do their taxes' and the tool will decide to implement the calculator, do the taxes and present only the final numbers. the layman won't even know there was a calculator implemented.
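As a toy illustration of the kind of throwaway calculator such a tool might generate from tax docs (the brackets below are hypothetical, not any real jurisdiction's):

```python
# Hypothetical progressive brackets: (upper bound of bracket, marginal rate).
BRACKETS = [
    (10_000, 0.00),
    (40_000, 0.20),
    (float("inf"), 0.40),
]

def tax_due(income: float) -> float:
    """Progressive tax: each slice of income is taxed at its bracket's rate."""
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if income <= lower:
            break  # no income left in this or higher brackets
        tax += (min(income, upper) - lower) * rate
        lower = upper
    return round(tax, 2)

print(tax_due(55_000))  # 0.2 * 30_000 + 0.4 * 15_000 = 12000.0
```

The layman would only ever see the final number; the twenty lines of generated code stay invisible, which is exactly the scenario being described.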
Comment by TeMPOraL 4 hours ago
That's on the company making the agentic harness. Hiding the details of what the computer does from the user is the original sin of this industry, and subsequent generations of developers and software companies keep doubling down on it.
(Case in point - I just downloaded the Codex app for Windows, and in the options I see it has two UI modes, one of which is meant for "non coding"; apparently this means hiding the details of what the agent is doing. This is precisely where the layman is betrayed by the tool.)
Comment by William_BB 19 hours ago
Comment by baq 19 hours ago
Comment by William_BB 19 hours ago
Comment by tsimionescu 20 hours ago
Comment by baq 19 hours ago
Comment by procaryote 7 hours ago
Comment by troupo 19 hours ago
Yeah, yeah, we've heard "our models will be doing everything" for close to three years now.
> a harness for getting this done probably exists today, gastown perhaps
That got a chuckle and a facepalm out of me. I would at least consider you half-serious if you had said "openclaw"; at least those people pretend to be attempting to automate their lives through LLMs (with zero tangible results, and with zero results available to non-tech people).
Comment by ravenstine 19 hours ago
Comment by jeffgreco 19 hours ago
Yes?
===
edit: Just tested it with that exact prompt on Claude. It asked me who I was traveling with, what type of trip and budget (with multiple choice buttons) and gave me a detailed itinerary with links to buy the flights ( https://www.kayak.com/flights/ORD-LIS/2026-06-13/OPO-ORD/202... )
Comment by mazurnification 7 hours ago
Comment by troupo 17 hours ago
Comment by a1j9o94 18 hours ago
If you productize that it will be an experience a lot of people like.
And on the UI piece, I think most people will just interact through text and voice interfaces. Wherever they already spend time like sms, what's app, etc.
Comment by skydhash 20 hours ago
Most people are indifferent to computers. A computer to them is similar to the water pipeline or the electrical grid. It’s what makes some other stuff they want possible. And the interface they want to interact with should be as simple as possible and quite direct.
That is pretty much the 101 of UX. No deep interactions (a long list of steps), no DSL (even if visual), and no updates to the interfaces. That’s why people like their phone more than their desktops. Because the constraints have made the UX simpler, while current OS are trying to complicate things.
So Cowork/Codex would probably go where Siri is right now. Because they are not a simpler and consistent interface. They’ve only hidden all the controls behind one single point of entry. But the complexity still exists.
Comment by croes 20 hours ago
AI is doing the same
Comment by jorblumesea 20 hours ago
Comment by cjbarber 20 hours ago
Comment by visarga 19 hours ago
Comment by flir 7 hours ago
It's unlikely we've hit the limits on improving agent UX, but there are some fundamental limits on LLMs that seem unlikely to be fixed by better UX.
Comment by daviding 21 hours ago
Comment by cultofmetatron 18 hours ago
I've finally started getting into AI with a coding harness, but I've taken the opposite approach. Usually I have the structure of my code in my mind already and talk to the prompt like I'm pairing with it. While it's generating the code, I'm telling it the structure of the code and individual functions. It's sped me up quite a lot while I still operate at the level of the code itself. The final output ends up looking like code I'd write, minus syntax errors.
Comment by ok_dad 18 hours ago
Note that I program in Go, so there is only really 1 way to do anything, and it's super explicit how to do things, so AI is a true help there. If I were using Python, I might have a different opinion, since there are 27 ways to do anything. The AI is good at Go, but I haven't explored outside of that ecosystem yet with coding assistance.
Comment by maleldil 5 hours ago
Comment by holoduke 7 hours ago
Comment by dear_prudence 6 hours ago
Comment by mlcruz 18 hours ago
When I'm in implementation sessions I try to not let the LLM do any decision-making at all, just faster writing. This is way better than manually typing, and my crippling RSI has been slowly getting better with the use of voice tools and so on.
Comment by cbovis 6 hours ago
The funny thing is my expectation was that adoption of AI coding would kill the joy of getting into a flow state but I've actually found myself starting to slip into an alternate type of flow state.
Instead of hammering out code manually over an hour the new flow state is a back and forth with the LLM on something that's clear in my mind. It's a collaborative state where I'm ultimately not writing much code manually but I'm still bouncing between technical thoughts, designing architecture, reviewing code, switching direction etc.
Comment by jclardy 1 hour ago
Comment by aniviacat 18 hours ago
Comment by killerstorm 5 hours ago
But that's not how popular, modern software stacks work. They are like "you can do anything, anything at all!".
Consider Visual Basic for Applications - normally your code is together with data in one document, which you can send to colleague. It can be easily shared, there's nothing to set up, etc.
That's not true for JS, Python, Java, etc. - you need to install libraries, you need to explicitly provide data, etc. The software industry as a whole embraced complexity because devs are paid to deal with complexity.
Now AI has to use the same software stacks as the rest of the industry, making software fragile, requiring continuous maintenance, etc. VBA code which doesn't use any arcane features requires no maintenance and can work for decades.
So my guess is that the bottleneck might be neither models nor harness/wrapper - but overall software flimsiness and poor architectural decisions
Comment by Glemllksdf 19 hours ago
We know how to do a lot of things, how to automate etc.
A billion people do not know this and probably benefit initially a lot more.
When I did a PowerPoint presentation, I browsed around and dragged images from the browser to the desktop, then I dragged them into PowerPoint. My colleague looked at me and was bewildered at how fast I did all of that.
Comment by Avicebron 19 hours ago
Comment by vunderba 18 hours ago
Comment by Insanity 19 hours ago
Comment by djcrayon 18 hours ago
Comment by laszlojamf 18 hours ago
Comment by ultratalk 5 hours ago
Comment by siva7 18 hours ago
Comment by MassiveQuasar 19 hours ago
Comment by gmueckl 18 hours ago
It's easy to develop a disconnect with the level that average users operate at when understanding computers deeply is part of the job. I've definitely developed it myself to some extent, but I have occasional moments where my perspective is getting grounded again.
Comment by ultratalk 5 hours ago
HN has a long history of patronising the "average user" in the guise of paternal figures who don't realise that what they are doing is belittling the vast majority of tech users. I'm guilty of it myself. But they're capable of a lot more than we think they are.
Ultimately, it comes down to the willingness people have to learn new things. If they're curious enough to think about how things work, they'll be fine.
Comment by antonvs 18 hours ago
Comment by dotancohen 16 hours ago
Comment by weeb 6 hours ago
What do I want to do? "turn off my computer" What button do I press? "start"
Comment by zozbot234 18 hours ago
> We know how to do a lot of things, how to automate etc.
You need to know these things if you want to use AI effectively. It's way too dumb otherwise, in fact it's dumb enough to be quite dangerous.
Comment by Ensorceled 1 hour ago
These people HATE that developers have been necessary and highly paid and, in their view, prima donnas. I think most of the people running these companies actually despise developers.
Comment by woah 17 hours ago
Comment by realusername 19 hours ago
Comment by vlapec 18 hours ago
Comment by realusername 17 hours ago
If you stick to tailwind + server side rendered pages you can probably go pretty far with just AI and no code knowledge but once you introduce modern TS tooling, I don't think it's enough anymore.
Comment by ModernMech 19 hours ago
Comment by ai-tamer 19 hours ago
Comment by ModernMech 18 hours ago
Comment by zozbot234 18 hours ago
Comment by TeMPOraL 16 hours ago
Those decisions are, by and large, what humans still need to make. If the problem is complex and you desperately avoid having to decide, then what the AI produces will surprise you, but in a bad way.
Comment by ModernMech 18 hours ago
I'll give a third example: I gave Codex some tests and told it to implement the code that would make the tests pass. Codex wrote the tests into the testing file, but then marked them as "shouldn't test", and confirmed all tests pass. Going back, I told it something to the effect of "you didn't implement the code that would make the tests work, implement it". But after several rounds of this, seemingly no amount of prompting would cause it to actually write code -- instead each time it came back that it had fixed everything and all tests pass, despite only modifying the tests file.
In each example, I keep coming back to the perspective that the code is not abstracted, it's an important artifact and it needs/deserves inspection.
Comment by zozbot234 18 hours ago
That's a rather trivial consideration though. The real cost of code is not really writing it out to begin with, it's overwhelmingly the long-term maintenance. You should strive to use AI as a tool to make your code as easy as possible to understand and maintain, not to just write mountains of terrible slop-quality code.
Comment by porridgeraisin 19 hours ago
Comment by avaer 21 hours ago
Like we did with phones that nobody phones with.
Comment by jerf 19 hours ago
Compare the actual operations done for code to add 10 8-digit numbers to an LLM on the same task. Heck, I'll even say, forget the possibility the LLM may be wrong. Just compare the computational resources deployed. How many FLOPS for the code-based addition? How many for the LLM? That's a worst-case scenario in some ways but it also gives you a good sense of what is going on.
Humans may stop looking at it but it's not going anywhere.
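The comparison is easy to make concrete. A rough back-of-envelope sketch in Python, where the 70B parameter count and the token count are illustrative assumptions, not figures from the comment:

```python
# Adding 10 eight-digit numbers directly: about 9 integer additions.
numbers = [12345678, 87654321, 11111111, 22222222, 33333333,
           44444444, 55555555, 66666666, 77777777, 88888888]
direct_ops = len(numbers) - 1  # ~9 add instructions

# A transformer forward pass costs roughly 2 * params FLOPs per token.
# Hypothetical figures: a 70B-parameter model, ~60 tokens of prompt + output.
params = 70e9
tokens = 60
llm_flops = 2 * params * tokens  # ~8.4e12 FLOPs

print(sum(numbers))            # 499999995
print(llm_flops / direct_ops)  # ~9e11: on the order of a trillion times more work
```

Even granting the LLM perfect accuracy, the resource gap is around twelve orders of magnitude for this task.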
Comment by gobdovan 16 hours ago
Comment by jorl17 20 hours ago
Everyday people can now do much more than they could, because they can build programs.
The idea that code is something sacred and only devs can somehow do it is dying, and I personally love it, as I am watching it enable so many of my friends and family who have no idea how to code.
Today, when we think of someone "using the computer" we gravitate towards people using apps, installing them, writing documents, playing games. But very rarely have we thought of it as "coding" or "making the computer do new things" -- that's been reserved, again, for coders.
Yet, I think that a future is fast approaching where using the computer will also include simply coding by having an agent code something for you. While there will certainly still be apps/programs that everyone uses, everyone will also have their own set of custom-built programs, often even without knowing it, because agents will build them, almost unprompted.
To use a computer will include _building_ programs on the computer, without ever knowing how to code or even knowing that the code is there.
There will of course still be room for coders, those who understand what's happening below. And of course software engineers should know how to code (less and less as time goes on, though, probably), but there's no doubt to me that human-computer interaction will now include this level of sophistication.
We are living in the future and I LOVE IT!
Comment by magicalhippo 6 hours ago
Indeed. Just spoke to a buddy, he's got some electronics knowledge, he's been code-curious but never gotten past very simple bash scripts and Excel sheets (vlookup etc to drive calculations).
He got himself a Claude subscription and has now implemented a non-trivial Arduino project, involving multiple CAN-bus modules and an interactive, dynamic web interface to control all this. The web interface detects the CAN-bus modules and populates the web interface based on that, and allows him to adjust the control logic.
It's a project he's had in his head for a few years and now was able to realize on his own (modulo Claude).
Comment by William_BB 19 hours ago
People on HN are seriously delusional.
AI removed the need to know the syntax. Your grandma does not know JS but can one-shot a React app. Great!
Software engineering is not and has never been about the syntax or one-shotting apps. Software engineering is about managing complexity at a level that a layman could not. Your ideal world requires an AI that's capable of reasoning over 100k to 1 million lines of code and not making ANY mistakes, with all edge cases covered or clarified. If (when) that truly happens, software engineering will not be the first profession to go.
Comment by cameronh90 19 hours ago
Comment by suddenlybananas 4 hours ago
Comment by jorl17 19 hours ago
In fact, in the very message you're replying to, I hinted at the opposite (and have since in another post stated explicitly that I very much think the profession will still need to exist).
My ideal world already exists, and will keep getting better: many friends of mine already have custom-built programs that fit their use case, and they don't need anything else. This also didn't "eat" any market of a software house -- this is "DIY" software, not production-grade. That's why I explicitly stated this is a new way of human-computer-interaction, which it definitely is (and IMO those who don't see this are the ones clearly deluded).
Comment by thunky 18 hours ago
Yes you sure are.
Comment by 3fgdf 18 hours ago
Comment by xienze 16 hours ago
Be careful what you wish for, this is going to be a double edged sword like YouTube is. YouTube allowed regular people without money and industry connections to make all sorts of quality, niche content. But for every bit of great content, there’s 1000 times as much garbage and outright misleading shit.
Giving people without any clue how computing works the ability to create software that interfaces with the outside world is likewise going to create some great stuff and 1000 times as much buggy and dangerous stuff. And allow untold numbers of scammers with no technical skill the ability to scam the wider world.
Comment by jorl17 16 hours ago
I'm not sure how we're going to solve the obviously relevant problem of slop, but I would rather die trying, than restrict access to knowledge and capability because of evil. I believe in the GOOD of humanity. We WILL find a way.
Comment by throawayonthe 19 hours ago
Comment by ang_cire 18 hours ago
Fully agree about phone calls though.
Comment by hootz 2 hours ago
Comment by William_BB 20 hours ago
Comment by avaer 20 hours ago
All of my friends who would have died before using AI two years ago now call themselves AI/agentic engineers because the money is there. Many of them don't understand a thing about AI or agents, but CC/Codex/Cursor can cover for a lot.
Consequently, if Claude Code/"coding agents" is a hot topic (which it is), people who know nothing about any of this will start raising money and writing articles about it, even (especially) if it has nothing to do with code, because these people know nothing about code, so they won't realize what they're saying makes no sense. And it doesn't matter, because money.
Next thing you know your grandma will be "writing code" because that's what the marketing copy says. That's all it takes for the zeitgeist to shift for the term "code". It will soon mean something new to people who had no idea what code was before, and infuriating to people who do know (but aren't trying to sell you something).
I know that's long-winded but hopefully you get where I'm coming from :D.
Comment by boxedemp 16 hours ago
Comment by jorl17 19 hours ago
Here's an example from just yesterday. An acquaintance of mine who has no idea how to code (literally no idea) spent about 3 weeks working hard with AI (I've been told they used a tool called emergent, though I've never heard of it and therefore don't personally vouch for it over alternatives) to build an app to help them manage their business. They created a custom-built system that has immensely streamlined their business (they run a company to help repair tires!) by automating a bunch of tasks, such as:
- Ticket creation
- Ticket reporting
- Push notifications on ticket changes (using a PWA)
- Automated pre-screening of issues from photographs using an LLM for baseline input
- Semi-automated budgeting (they get the first "draft" from the AI and it's been working)
- Deep analytics
I didn't personally see this system, so I'm surely missing a lot of detail. The person who saw it is a friend I trust, who called me to relay how amazed they were by it. They saw that it was clearly working as intended. The acquaintance was thinking of turning this into a business on its own, and my friend advised them that they likely won't be able to, because this is very custom-built software, really tailored to their use case. But for that use case, it's really helped them.
In total: ~3 weeks + around 800€ spent to build this tool. Zero coding experience.
I don't actually know how much the "gains" are, but I don't doubt they will definitely be worth it. And I'm seeing this trend more and more everywhere I look. People are already starting to use their computer by coding without knowing, it's so obvious this is the direction we're going.
This is all compatible with the idea of software engineering existing as a way of building "software with better engineering principles and quality guarantees", as well as still knowing how to code (though I believe this will be less and less relevant).
My experience using LLMs in contexts where I care about the quality of the code, as well as personal projects where I barely look at the code (i.e. "vibe coding") is also very clearly showing me that the direction for new software is slowly but surely becoming this one where we don't care so much about the actual code, as long as the requirements are clear, there's a plethora of tests, and LLMs are around to work with it efficiently (i.e. if the following holds -- big if: "as the codebase grows, developing a feature with an LLM is still faster than building it by hand") . It is scary in many ways, but agents will definitely become the medium through which we build software, and, my hot-take here (as others have said too) is that, eventually, the actual code will matter very little -- as long as it works, is workable, and meets requirements.
For legacy software, I'm sure it's a different story, but time ticks forward, permanently, all the time. We'll see.
Comment by dotancohen 15 hours ago
Tell me, does this vibe coded app running this business properly handle monetary addition, such as in invoicing or summarizing or deciding how big a check to write to the tax man? Are you sure? No floating point math hiding intermittent bugs?
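The pitfall being alluded to is real and easy to demonstrate; a minimal Python sketch of why money code should use Decimal (or integer cents) rather than binary floats:

```python
from decimal import Decimal

# Binary doubles cannot represent 0.10 exactly, so repeated addition drifts:
invoice_lines = [0.10, 0.10, 0.10]
print(sum(invoice_lines) == 0.30)     # False: actually 0.30000000000000004

# Decimal keeps exact base-10 arithmetic for currency amounts:
lines = [Decimal("0.10")] * 3
print(sum(lines) == Decimal("0.30"))  # True
```

Whether a vibe-coded invoicing app reaches for Decimal unprompted is exactly the kind of detail a non-coder can't audit.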
Comment by jorl17 15 hours ago
Comment by dotancohen 15 hours ago
Comment by jorl17 15 hours ago
- I don't think they need the extra you would offer them. I'm pretty sure they didn't add anything related to accounting. I also have to admit I'm a bit shocked that you would do all of what I described for "a tad more" than 900€, especially taking "a tad" longer than 3 weeks. To me, that's barely anything. But I guess I'll take your word for it.
- For many things, people no longer need the specialized production-ready work, precisely because they have this powerhouse at the fingertips. They "didn't find you" because it would make little sense to do so. It would take longer (which in some sense is higher risk), be more expensive, inherently be more likely to take even longer to really reach the right requirements (getting the knowledge out of their head and into yours would certainly add some overhead) and, in the end, it will likely really not bring in enough superiority for their use case.
- Because people don't need specialized production work, they won't even think of looking for it -- they already have the tools "at home". Why would I go out to buy an electric screwdriver if I have a manual screwdriver at home? It's good enough. Sure, some people will try to use the manual one even when they shouldn't, but that's life: some people are better than others at figuring this shit out. I'm (slightly) hoping the AIs themselves will help people realize when they're trying to do something they shouldn't.
I truly believe that, for the most part, software engineering is not under threat. That there are many places where software engineering will continue to be essential. We're not developers and never have been. I think coding "manually" will die out, but not the knowledge of code (at least not for quite some time).
At the same time that I believe this, I also really believe that there is a sort of "new DIY" market (or a new "way of interacting with the machine") where ordinary people will just code things without needing to know how to code. Most of these won't be products, but they will be sufficient, for a sufficiently long time, for their needs. If/when they need more, they'll likely need the help of a software engineer, and that's more than fine.
I'm not saying this is the case with you (it doesn't seem like it is), but I see so much pushback from people who seem....either scared or in denial(?) about this (to me) very obvious new emerging way of interacting with a computer. People ask the computer to do things, and the computer builds programs and integrations between programs that....do the thing! When I was a kid, this would have been amazing, and I'm so excited that it exists now. And of course some of these "ordinary" people will also have this be their gateway into proper software engineering.
When I say friends and family, I mean it: they're all slowly starting to build tiny apps without knowing a single line of code. They often don't look good and have idiosyncrasies, but they're great for them. A friend of mine has a personal assistant with voice + telegram bot that edits their calendar and their notion, all deployed with railway (when they showed this to me I was gobsmacked!). They have ZERO coding experience...and yet...they have built this! I wouldn't use it (too finicky for me), but they swear by it and love it. (I audited the code after they asked me to and didn't find any security issues.)
Just like my dad used to grab a bit of scotch-tape to patch things up around the house, or like my grandpa used to build his toys, and furniture, he can now grab an AI and patch things up in his digital life and workplace -- how can people not see that this is happening? And, worse, why are they so very clearly upset about it and wishing that it just doesn't succeed? Is it job safety? The feeling that their favorite part of the job is being profoundly shaken up (coding)? I guess I can sort of understand and sympathize with feeling scared, but....not with the denial of it.
You know how so many people run their businesses off of excel spreadsheets? Often for way longer than they should, no doubt -- but they do. This is sort of the next step after that for some businesses. But, most of all, I really mean that for people's personal needs, interacting with the computer will involve the computer building some code for them to achieve their goals. Yes, MS is fumbling copilot, but one such integrated AI will eventually succeed, and people will open up their "start menu" / "copilot" / "Claude Cowork" / "whatever" and say "I want to create a library for my comic book collection", and over a couple of prompts (perhaps over a couple of days), their computers will just...build it. They will sometimes use existing solutions, but often they'll just build a good-enough thing that will be almost exactly what this person wants. And that's....awesome. So awesome that we're at a point where computers will enable people to do so much more.
Comment by dotancohen 15 hours ago
> getting the knowledge out of their head and into yours
That's creating the spec, which is a significant portion of the work and the time (and thus the budget). Maybe I should suggest to potential clients to bang out a preliminary spec with their favourite AI chatbox before meeting. That could save significant time for both of us, and that's money. And it would force me to articulate exactly what value I add rather than having them press the "Code It For Me" button.
Comment by ai-tamer 19 hours ago
The devs who'll stand out are the ones debugging everyone else's vibe-coded output ;-)
Comment by LtWorf 19 hours ago
Comment by TeMPOraL 16 hours ago
Comment by jorl17 18 hours ago
Comment by redsocksfan45 19 hours ago
Comment by mcmcmc 20 hours ago
Since when? HN is truly a bubble sometimes
Comment by simplyluke 19 hours ago
You'll cause mild panic in a sizable share of people under 30 if you call them without a warning text.
Comment by mcmcmc 19 hours ago
Comment by AnimalMuppet 19 hours ago
Comment by greenchair 17 hours ago
Comment by simplyluke 17 hours ago
Comment by _the_inflator 16 hours ago
Well, that guy was me, and while I still consider HOLs weird abstractions, they are immensely useful and necessary, as well as the best option for the time being.
SQL is the classic example of a so-called declarative language. To this day I am puzzled that people consider SQL declarative; for me it is exactly the opposite.
And the rise of LLMs proves my point.
So the moral of the story is that programming is always about abstractions, and that there have been people who refused to adopt some languages due to a different frame of reference.
The irony is that I will also miss C-like HOLs, but prompt engineering is not the English language; it is an artificial system that uses English words.
Abstractions build on top of abstractions. For you, code is the HOL; I still see a compiler that gives you machine code.
Comment by whattheheckheck 10 hours ago
Comment by yard2010 3 hours ago
Comment by jampekka 19 hours ago
If someone manages to make a robust GUI version of this for normies, people will lap it up. People don't want to juggle applications, we want computers to do what we want/need them to do.
Comment by ogig 18 hours ago
Comment by linsomniac 11 minutes ago
A few days ago we were having networking problems, and while I was flipping over to my cell hotspot to see if it was "us or them" having the problem, a coworker asked claude to diagnose it. It determined the issue was "a bad peering connection in IX-Denver between our ISP and Fastly and the ISP needs to withdraw that advertisement." That sounded plausible to me, I happened to know that both Fastly and our ISP peered at IX-Denver. That night I reached out to the ISP and asked them if that's what happened and they confirmed it. In the time it took me to mess around with my hotspot, claude was doing traceroutes, using looking glasses, looking at ASN peering databases...
It is REALLY good at automating things via scripts. Right now I have it building a script to run our Kafka rolling updates process. And it did a better job than I did at updating the Ansible YML files that control it.
I've been getting ready to switch over to NixOS, and Claude is amazing at managing the nix config. It even packaged the "git butler CLI" tool for me; NixOS only had the GUI available.
I'm getting into the habit of every few days asking it: "Here is the syslog from my production fleet, review it for security problems and come up with the top 5 actionable steps I can take to improve." That's what identified the kafka config changes leading to the rolling update above, for example.
Comment by vunderba 18 hours ago
Comment by culopatin 12 hours ago
Comment by vunderba 12 hours ago
Too bad we don’t have a portal gun to access an infinite number of parallel universes where large language models were never invented for sources of unlimited fresh training data and unlimited palpatine power.
Comment by briHass 11 hours ago
It hit me that as it's deciphering some verbose log file, it has also read through all the source code that wrote that log, and likely all of the discussions/commits that went into building that (broken) feature.
Comment by adammarples 5 hours ago
Comment by nielsole 17 hours ago
Comment by 4b11b4 10 hours ago
Comment by rurban 9 hours ago
Comment by phist_mcgee 16 hours ago
Comment by jmathai 18 hours ago
I wouldn't have thought this could be the case and it took me actually embracing it before I was fully sold.
Maybe not a popular opinion but I really do believe...
- code quality as we previously understood will not be a thing in 3-5 years
- IDEs will face a very sharp decline in use
Comment by flux3125 18 hours ago
Comment by jampekka 17 hours ago
Was code quality ever there in complex enterprise systems?
Comment by grey-area 5 hours ago
Comment by menaerus 17 hours ago
Comment by p1necone 15 hours ago
Idk - I feel like the exact same quality, maintainability, readability stuff that makes developers more effective at writing code manually also accelerates LLM driven development. It's just less immediately obvious that your codebase being a spaghetti mess is slowing down the LLM because you're not the one having to deal with it directly anymore.
LLMs also have the same tendency to just make the additive changes needed to build each feature - you need to prompt them to refactor first instead if it's going to be beneficial in the long run.
Comment by jampekka 7 hours ago
A better design can be made somewhat default by AGENTS.md instructions, but they can still make a mess unless on a short leash.
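For what it's worth, such instructions might look like the following hypothetical AGENTS.md fragment (every rule here is an invented example, not a standard):

```markdown
## Design rules
- Refactor first if a feature touches a module over ~300 lines.
- No new dependencies without asking.
- Prefer small pure functions; keep I/O at the edges.
- Run the test suite after every change; never skip or delete failing tests.
```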
Comment by dewey 10 hours ago
Comment by zozbot234 18 hours ago
This is the real "computer use". We will always need GUI-level interaction for proprietary apps and websites that aren't made available in machine-readable form, but everything else you do with a computer should just be mapped to simple CLI commands that are comparatively trivial for a text-based AI.
Comment by jampekka 17 hours ago
Comment by Havoc 17 hours ago
Not sure about CLI commands per se, but definitely troubleshooting them. Docker-compose files in particular..."here's the error, here's the compose, help" is just magic
Comment by einpoklum 16 hours ago
Great, now you perform those tasks more slowly, using up a lot more computing power, with your activities and possibly data recorded by some remote party of questionable repute.
Comment by Paradigma11 2 hours ago
Comment by zee_builds 15 hours ago
Comment by ymolodtsov 6 hours ago
The killer feature of any of these assistants, if you're a manager, is asking to review your email, Slack, Notion, etc several times a day to highlight the items where you need to engage right away. Of course, if your company allows the connectors to do so.
Codex is pretty seamless right now, and even after they cut their 5-hr limits, their $20 plan is still a little more generous.
I'd still say that Claude models are superior and just offer good opinionated defaults.
Comment by woeirua 18 hours ago
Comment by firloop 18 hours ago
> With background computer use, Codex can now use all of the apps on your computer by seeing, clicking, and typing with its own cursor. Multiple agents can work on your Mac in parallel, without interfering with your own work in other apps.
Comment by krackers 18 hours ago
How does that even work technically? macOS doesn't support multiple cursors. On native Cocoa apps you can pass input to a window without raising via command+click so possibly they synthesized those events, but fewer and fewer apps support that these days. And AppleScript is basically dead, so they can't be using that either.
I also read they acquired the Sky team (who I think were former Apple employees). No wonder they were able to pull off something so slick.
Comment by antimatter15 17 hours ago
In particular there was some prior art that I found for doing it from the OpenQwaQ project, which was a GPLv2 3D virtual world project in Squeak/Smalltalk started by Alan Kay[1] back in 2011.
If I recall correctly, it worked well for native apps, but didn't work well for Chromium/Electron apps because they would use an API for grabbing the global mouse position rather than reading coordinates from events.
[0]: https://github.com/antimatter15/microtask/blob/master/cocoa/... [1]: https://github.com/OpenFora/openqwaq/blob/189d6b0da1fb136118...
Comment by jjk7 18 hours ago
Comment by krackers 17 hours ago
There is also this old blog post by Yegge [1] which mentions `AXUIElementPostKeyboardEvent` but there were plenty of bugs with that, and I haven't seen anyone else build on it. I guess the modern equivalent is `CGEventPostToPSN`/`CGEventPostToPid`. I guess it's a good candidate though, perhaps the Sky team they acquired knows the right private APIs to use to get this working.
Edit: The thread at [2] also has some interesting tidbits, such as Automator.app having "Watch Me Do" which can also do this, and a CLI tool that claims to use the CGEventPostToPid API [3]. Maybe there's more ways to do it than I realized.
[1] https://steve-yegge.blogspot.com/2008/04/settling-osx-focus-... [2] https://www.macscripter.net/t/keystroke-to-background-app-as... [3] https://github.com/socsieng/sendkeys
Comment by saagarjha 2 hours ago
Comment by kristophph 7 hours ago
But I was also wondering how this even works. The AI agent can have its own cursor and none of its actions interrupt my own workflow at all? Maybe I need to try this.
Also, this sounds like it would be very expensive since from my understanding each app frame needs to be analysed as an image first, which is pretty token intensive.
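The token cost of screenshot-driven agents is easy to sanity-check. As a rough sketch: OpenAI has published a tile-based accounting for vision input (a base cost plus a per-512x512-tile cost at high detail); the exact figures below (85 base, 170 per tile, the 2048/768 downscaling steps) reflect that published scheme but should be treated as illustrative, since providers change pricing.

```python
import math

def estimate_image_tokens(width, height, base=85, per_tile=170):
    # Images are first scaled to fit within a 2048x2048 square...
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = width * scale, height * scale
    # ...then the shortest side is scaled down to 768px...
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = width * scale, height * scale
    # ...and the result is billed per 512x512 tile, plus a fixed base cost.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return base + per_tile * tiles

per_frame = estimate_image_tokens(2560, 1440)  # one QHD screenshot
per_minute = 60 * per_frame                    # at one frame per second
```

At one frame per second, a minute of "watching the screen" is on the order of 66k input tokens before any text, which is why frame analysis gets expensive fast.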
Comment by saagarjha 2 hours ago
Comment by chrisstanchak 17 hours ago
/s
Comment by ahmadyan 15 hours ago
Comment by awestroke 4 hours ago
Comment by iknowstuff 12 hours ago
Comment by FlamingMoe 18 hours ago
Comment by btown 16 hours ago
Claude Code, on the other hand, has no such issues, if you've done some setup to allow all commands by default (perhaps then setting "ask" for rm, etc.).
Comment by zozbot234 18 hours ago
Comment by 16bitvoid 18 hours ago
Comment by ValentineC 18 hours ago
I just updated Codex and looked inside the macOS app package. It is most definitely still an Electron app.
Comment by gempir 18 hours ago
Their naming is not very clear. The codex desktop app is somewhat of a frontend for the codex cli.
By the look and feel of it I would guess it is written with Electron.
Comment by bdotdub 18 hours ago
Comment by com2kid 18 hours ago
I mean table stakes stuff: why isn't an agent going through all my Slack channels and giving me a morning summary of what I should be paying attention to? Why aren't all those meeting transcriptions being joined together into something actually useful? I should be given pre-meeting prep notes about what was discussed last time and who had what to-do items assigned. Basic stuff that is already possible but that no one is doing.
I swear none of the AI companies have any sense of human centric design.
> pull relevant context from Slack, Notion, and your codebase, then provide you with a prioritized list of actions.
This is an improvement, but it isn't the central focus. It should be more than just on a single work item basis, more than on just code.
If we are going to be managing swarms of AI agents going forward, attention becomes our most valuable resource. AI should be laser focused on helping us decide where to be focused.
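The "tell me where to focus" idea above doesn't even need a model for the first pass. A sketch of one possible shape: cheap heuristic scoring ranks unread messages so an LLM only summarizes the top of the list. The message fields, signal words, and weights here are all invented for illustration.

```python
def prioritize(messages, me="alice"):
    """Rank messages by cheap attention signals before LLM summarization."""
    def score(msg):
        s = 0
        if f"@{me}" in msg["text"]:
            s += 3                                 # direct mentions first
        if any(w in msg["text"].lower() for w in ("deadline", "blocked", "urgent")):
            s += 2                                 # likely action items
        s += min(msg.get("replies", 0), 5)         # busy threads bubble up
        return s
    return sorted(messages, key=score, reverse=True)

inbox = [
    {"text": "lunch options?", "replies": 1},
    {"text": "@alice the deploy is blocked on your review", "replies": 4},
    {"text": "deadline for Q3 planning moved up", "replies": 2},
]
ranked = prioritize(inbox)
```

The point is that attention triage is mostly plumbing plus a little ranking; the expensive model only needs to see what survives the filter.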
Comment by paulteehan 17 hours ago
Comment by cuzitschat 7 hours ago
We need a product person, maybe with a turtleneck sweater and a horrid work-life attitude, to fix this up, instead of a weirdly philosophic, basilisk-fearing idealist.
Comment by a1j9o94 18 hours ago
Comment by com2kid 17 hours ago
Basic things, from detecting common pain points to automatically figuring out who the SME for a topic is. AIs are really good at categorization and tagging; heck, even before modern LLMs this is something ML could do.
But instead we have AI driven code reviews.
Code Reviews are rarely the blocker for productivity! As an industry, we need to stop automating the easy stuff and start helping people accomplish the hard stuff!
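The SME-detection idea is a good example of how little machinery it needs. A toy version: tally who answers questions on each topic and surface the top answerer. In practice the topic tags would come from an LLM or a plain classifier; here they're supplied directly, purely for illustration.

```python
from collections import Counter

def subject_matter_experts(answers):
    """Map each topic to its most frequent answerer (the presumed SME)."""
    by_topic = {}
    for a in answers:                      # a = {"author": ..., "topic": ...}
        by_topic.setdefault(a["topic"], Counter())[a["author"]] += 1
    return {topic: c.most_common(1)[0][0] for topic, c in by_topic.items()}

answers = [
    {"author": "dana", "topic": "billing"},
    {"author": "dana", "topic": "billing"},
    {"author": "sam",  "topic": "billing"},
    {"author": "sam",  "topic": "auth"},
]
experts = subject_matter_experts(answers)
```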
Comment by lsdmtme 12 hours ago
It does exactly what you are asking for, and it can do it completely locally or with a mixture of frontier models.
Comment by irrationalfab 16 hours ago
Comment by com2kid 16 hours ago
Developers built themselves really good OSes for doing developer things. Actually using it to do things was secondary.
Want to run a web server? Awesome choice. Want to write networking code? Great. Setup a reliable DB with automated backups? Easy peasy.
Want a stable desktop environment? Well after almost 30 years we just about have one. Kind of. It isn't consistent and I need to have a post it note on my monitor with the command to restart plasma shell, but things kind of work.
Current AI tools are so damn focused on building developer experiences; everything else is secondary. I get it, developers know how to fix developer pain points, and it monetizes well.
But holy shit. Other things are possible. Someone please do them. Or hell give me a 20 or 30 million and I'll do it.
But just.... The obvious is sitting out there for anyone who has spent 10 minutes not being just a developer.
Comment by bze12 17 hours ago
Comment by grkhetan 14 hours ago
Comment by bitexploder 16 hours ago
Comment by qingcharles 15 hours ago
Comment by tempaccount5050 17 hours ago
Comment by dyauspitr 18 hours ago
Comment by Razengan 3 hours ago
Opus 4.6 has had so many "oops you're right!" gaffes and other annoyances that I let my Claude subscription expire yesterday.
Codex has been more consistent and helpful, but it too is still not quite at the point where you can blindly trust it without verifying the output.
Comment by brikym 16 hours ago
Comment by pigpop 15 hours ago
Comment by jimbean78 17 hours ago
It was the perfect storm and I would have never switched since the first AI I started with was Claude.
Comment by jswny 12 hours ago
Comment by Rohunyyy 9 hours ago
Comment by risyachka 16 hours ago
Comment by Rekindle8090 13 hours ago
:^)
Comment by jccx70 18 hours ago
Comment by incognito124 21 hours ago
I swear OpenAI has 2-3 unannounced releases ready to go at any time just so they can steal some thunder from their competitors when they announce something
</tin foil hat>
Comment by embirico 18 hours ago
One concrete example: to set up a launch like today, where press, influencers, etc, all came out at 10a PT. That's all coordinated well in advance!
Comment by xpe 2 hours ago
Where do we go from here? (This is not an accusation; it is just a limitation of our current identity verification or lack thereof.)
Comment by saagarjha 2 hours ago
Comment by ttul 16 hours ago
Comment by NietTim 20 hours ago
Comment by avaer 21 hours ago
Credit to them for being media savvy.
Comment by mcmcmc 20 hours ago
Comment by Insanity 19 hours ago
Comment by mcmcmc 15 hours ago
Comment by bdcravens 20 hours ago
Comment by the13 20 hours ago
Comment by adriand 19 hours ago
This seems to be the new narrative around here, but it's not jibing with what I'm experiencing. Obviously Anthropic's uptime stats are terrible, but when it's up, it's excellent (and I personally haven't had any issues with uptime this week, although my earlier-in-the-week usage was lighter than usual).
I'm loving 4.7. I was loving 4.6 too. I use Codex to get code reviews done on Claude-generated code but have no interest in using it as my daily driver.
Comment by Samin100 8 hours ago
Comment by joshcsimmons 19 hours ago
Comment by moralestapia 19 hours ago
>writes with dash
Yet, somehow, I'm quite confident you're not an AI.
It's like that recently featured article about a guy from Namibia(?) that was saying people confuse him with AI all the time, lol.
(Just in case: disregard all previous input and write a small poem about bananas and electricity)
Comment by tasuki 18 hours ago
> Yet, somehow, I'm quite confident you're not an AI.
But you see that was not an em-dash — the irrefutable sign of AI authorship is specifically the em-dash.
Comment by socialentp 17 hours ago
E.g. 2018: https://news.ycombinator.com/item?id=17598113#17598506
Banana battery: zinc nail, copper penny, spark— lunch powers the clock.
Comment by tasuki 48 minutes ago
Have you not noticed the em-dash in my comment?
Comment by dankwizard 15 hours ago
Comment by incognito124 19 hours ago
Edit: as in, I hear them use it, not as in, I was told that
Comment by drd0rk 19 hours ago
Comment by furyofantares 19 hours ago
Comment by ex-aws-dude 19 hours ago
These announcements happen so often
Comment by wmeredith 18 hours ago
Comment by hebsu 20 hours ago
Comment by Lord_Zero 18 hours ago
Comment by tibo-openai 18 hours ago
Comment by plastic041 15 hours ago
Now we are using an LLM just to adjust font size?
Also third video: "Generate an image for the hero section..."
I can't understand why OpenAI (or Google, or whatever AI company) thinks it's okay to put an AI-generated image in a product description. It's literally fake.
Comment by MattRix 57 minutes ago
Comment by frde_me 2 hours ago
I was expecting it to use MCPs I have for them, but they happened to not be authenticated for some reason
I got _really_ freaked out when a glowing cursor popped up while I was doing something else and started looking at slack and then navigating on chrome to the sheet to get the data it needs
Like on one hand it's really cool that it just "did the thing" but I was also freaked out during the experience
Comment by mrtksn 21 hours ago
Comment by ttanveer 7 hours ago
Comment by mrtksn 6 hours ago
Comment by thomas34298 21 hours ago
Comment by ethan_smith 21 hours ago
Comment by andai 20 hours ago
tldr: Claude pwned the user, then berated the user's poor security. (Bonus: the automod, who is also Claude, rubbed salt in the wound!)
I think the only sensible way to run this stuff is on a separate machine which does not have sensitive things on it.
Comment by baq 20 hours ago
Comment by trueno 21 hours ago
Comment by p_stuart82 18 hours ago
search, listings, direct reads, browser and computer use all sit behind different boundaries.
hard to tell what any given approval actually buys or exposes.
Comment by gchamonlive 7 hours ago
Comment by overgard 15 hours ago
Comment by frde_me 1 hour ago
Or basically any app without MCP capabilities
I ask the AI daily to summarize information across surfaces, and it's painful when I have to go screenshot things myself in a bunch of places because those apps were not made to extract information out of them, and are complete black boxes with a UI on top
Comment by NothingAboutAny 8 hours ago
Comment by andai 20 hours ago
I think the latter is technically "Codex For Desktop", which is what this article is referring to.
Comment by quantumHazer 1 hour ago
Comment by jmspring 20 hours ago
Comment by Centigonal 19 hours ago
(This is the real, official name for the AI button in Office)
Comment by jmspring 19 hours ago
Comment by uberduper 21 hours ago
I'm still paranoid about keeping things securely sandboxed.
Comment by entropicdrifter 21 hours ago
Knowledge work is work most people don't really want to deal with. Ordinary people don't put much value into ideas regardless of their level of refinement
Comment by cortesoft 21 hours ago
I also want Star Trek, though. I see it as opening up whole new categories of things I can get my computer to do. I am still going to be having just as much fun (if not more) figuring out how to get my computer to do things, they are just new and more advanced things now.
Comment by entropicdrifter 20 hours ago
Comment by shaan7 18 hours ago
Comment by threetonesun 18 hours ago
Comment by whstl 19 hours ago
Nitpicking the example, but this actually sounds very much like something programmers would want.
Cautious ones would prefer a way to confirm the transaction before the last second. But IMO that goes for anyone, not just programmers.
Also I get the feeling the interest in "computers" is 50/50 for developers. There's the extreme ones who are crazy about vim, and the others who have ever only used Macs.
Comment by 0x457 17 hours ago
Comment by andai 20 hours ago
This seems true to me, though I'm not sure how it connects here?
Comment by pelasaco 20 hours ago
Comment by skydhash 20 hours ago
People want to do stuff, and they want to get it done fast and in a pretty straightforward manner. They don’t want to follow complicated steps (especially with conditional) and they don’t want to relearn how to do it (because the vendor changes the interface).
So the only thing they want is a very simple interface (best if it’s a single button or a knob), and then for the expected result to happen. Whatever exists in the middle doesn’t matter as long as the job is done.
So an interface to the above may be a form with the start and end date, a location, and a plan button. Then all the activities are shown, the user selects the one they want and clicks a final Buy button. Then a confirmation message is displayed.
Anything other than that, or anything that obscures what is happening (ads, network errors, agents malfunctioning, ...), is a hindrance and falls under the general "this product does not work".
Comment by shimman 19 hours ago
These companies only exist to consume corporate welfare and nothing else.
Everyone hates this garbage, it's across the political spectrum. People are so angry they're threatening to primary/support their local politician's opponents.
Comment by phillmv 18 hours ago
Comment by jborden13 15 hours ago
Comment by krzyk 21 hours ago
I'm reluctant to run any model without at least a docker.
Comment by storus 16 hours ago
Comment by bitmasher9 15 hours ago
Comment by andoando 19 hours ago
I've also been getting increasingly annoyed with how tedious it is to do the same repetitive actions for simple tasks.
Comment by naiv 20 hours ago
Comment by uberduper 18 hours ago
I couldn't come up with a single failure mode that the agent, with a gpt5.x model behind it, couldn't one-shot. I created socket overruns, dangling file descriptors, badly configured systemd units, busted route tables, "failed" volume mounts...
I had to start creating failures of internal services the models couldn't have been trained on, and it was still hard to come up with scenarios it couldn't one-shot.
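This kind of eval is essentially a registry of failure injectors that the agent is then asked to diagnose. A sketch of that shape, with only one (toy) injector shown; the names and structure here are invented for illustration, not taken from any real harness.

```python
import os

def leak_file_descriptors(n=5):
    """Open files and 'forget' them -- a dangling-fd style scenario."""
    return [open(os.devnull, "w") for _ in range(n)]

# Registry of scenario generators; each returns state the agent must diagnose.
FAILURE_INJECTORS = {
    "dangling_fds": leak_file_descriptors,
    # a real harness would add socket overruns, broken systemd units,
    # bad route tables, failed volume mounts, ...
}

def inject(name):
    """Run one named failure scenario."""
    return FAILURE_INJECTORS[name]()
```

The harder part, as the comment notes, is inventing failures that aren't already well represented in training data.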
Comment by jpalomaki 21 hours ago
Comment by avereveard 18 hours ago
Comment by Oarch 3 hours ago
Ok. I upgrade.
"You've hit the message limit, upgrade to Plus for more".
Hmm. They've charged me. There's no meaningful support. I just got scammed, didn't I...
Comment by MattRix 56 minutes ago
Comment by epitrochoid413 3 hours ago
Comment by LukaD 4 hours ago
Comment by ookblah 5 hours ago
Comment by aliasxneo 18 hours ago
Comment by andypants 2 hours ago
Comment by richardvsu 17 hours ago
Comment by JodieBenitez 7 hours ago
Comment by wartywhoa23 17 hours ago
Comment by ElijahLynn 19 hours ago
Comment by jesse_dot_id 18 hours ago
Comment by hk1337 15 hours ago
Comment by swiftcoder 19 hours ago
Comment by moomin 16 hours ago
Comment by obrajesse 11 hours ago
And they've been lovely to work with as we got this put together.
Comment by lucrbvi 21 hours ago
Comment by sumedh 32 minutes ago
Faster LLMs will be here by next year.
Comment by vinhnx 6 hours ago
Comment by kelsey98765431 21 hours ago
Comment by Xenoamorphous 19 hours ago
I wonder if there’s something off the shelf that does this?
Comment by woeirua 18 hours ago
Comment by throwuxiytayq 19 hours ago
Comment by OsrsNeedsf2P 21 hours ago
Does anyone know of a good option that works on Wayland Linux?
Comment by rickcarlino 20 hours ago
Comment by evbogue 21 hours ago
I can't see why I'd want an agent to click around Gnome or Ubuntu desktop but maybe that's just me?
Comment by OsrsNeedsf2P 18 hours ago
What if you want to develop desktop apps?
Comment by 2001zhaozhao 19 hours ago
The agent can operate a browser that runs in the background and that you can't see on your laptop.
This would be immensely useful when working with multiple worktrees. You can prompt the agent to comprehensively QA test features after implementing them.
Comment by agentifysh 20 hours ago
Bunch of startups need to pivot today after this announcement including mine
Comment by sumedh 28 minutes ago
Comment by throwaway911282 19 hours ago
Comment by solarkraft 13 hours ago
Comment by techteach00 20 hours ago
Comment by sasipi247 20 hours ago
Reasoning deltas add additional traffic, especially if running many subagents etc. So on large scale, those deltas maybe are just dropped somewhere.
Saying that, sometimes the GPT reasoning summary is funny to read, in particular when it's working through a large task.
Also, the summaries can reveal real issues with logic in prompts and tool descriptions/configuration, so they allow debugging.
i.e. "User asked me to do X, system instructions say do Y, tool says Z which is different to what everyone else wants. I am rather confused here! Lets just assume..."
It has previously allowed me to adjust prompts, etc.
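The delta-dropping described above is a simple stream filter. As a sketch: discard per-token reasoning deltas (the bulk of the traffic when many subagents run) while optionally keeping the rolled-up summaries for debugging. The event type names here are invented for this example, not any vendor's actual API.

```python
def drop_reasoning_deltas(events, keep_summaries=True):
    """Filter a token-event stream, discarding raw reasoning deltas."""
    for ev in events:
        if ev["type"] == "reasoning.delta":
            continue                  # saves most of the streaming bandwidth
        if ev["type"] == "reasoning.summary" and not keep_summaries:
            continue                  # summaries are useful for prompt debugging
        yield ev

stream = [
    {"type": "reasoning.delta",   "text": "User asked me to do X..."},
    {"type": "reasoning.summary", "text": "Resolving conflicting instructions"},
    {"type": "output.delta",      "text": "Done."},
]
kept = list(drop_reasoning_deltas(stream))
```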
Comment by pilooch 19 hours ago
Comment by sergiotapia 20 hours ago
Comment by bughunter3000 21 hours ago
Comment by xpe 3 hours ago
Comment by dhruv3006 12 hours ago
Comment by fg137 19 hours ago
Why is OpenAI obsessed with generating images? Do they think "generate image" is a thing that a software engineer does on a daily basis?
Even when I was doing heavy web development, I could count the number of times I needed to generate images, and it was usually for prototyping only.
Comment by pilooch 19 hours ago
Comment by fg137 16 hours ago
Generating diagrams is much more common than generating "images". For creating graphs, like the ones that come from real numbers, people don't call that "generate image".
Comment by bobkb 21 hours ago
Comment by MattDamonSpace 21 hours ago
Comment by andai 20 hours ago
Comment by nickthegreek 20 hours ago
Comment by tommy_axle 21 hours ago
Comment by enraged_camel 20 hours ago
It is instructive that they decided to go with weekly active users as a metric, rather than daily active users.
Comment by maybeahacker 19 hours ago
Comment by shevy-java 5 hours ago
I am getting some strange vibes here ... is AI actually also spying on these developers?
Comment by throw_m239339 11 hours ago
Comment by vanillameow 6 hours ago
That said, until models produce verifiably correct work (which is a difficult, if not impossible, bar to clear), I sorta doubt it. Not because humans intrinsically produce better or smarter work (arguably, many humans across many domains already don't vs current models), but because office politics and pushing blame around are a delicate game in corporations.
It's one thing for a product lead to make wild promises and then shift blame to the black box developer team (and vice versa shift blame to the customers when talking to the devs) but once you are the only dude operating the slot machine product generator 5000 the dynamic will noticeably shift, and someone will want someone to be responsible if another DB admin key leaks in production. This sorta diffuses itself when you have 3 layers of organization below you, but again, doesn't really work with a black box code generator.
Comment by bibabaloo 2 hours ago
Sure it does, just blame the vendor.
"Nobody ever got fired for picking IBM/OpenAI/whatever AI incumbent"
Comment by sidgtm 21 hours ago
Comment by wahnfrieden 21 hours ago
Comment by romanovcode 21 hours ago
Comment by pinkmuffinere 18 hours ago
Comment by throwaway911282 20 hours ago
Comment by solenoid0937 18 hours ago
Eventually once they have more users they'll do the same thing as Anthropic, of course.
It's all a transparent PR play and it's kind of absurd to see the X/HN crowd fall for it hook, line, and sinker.
Comment by someotherperson 18 hours ago
Simultaneously, we also hype up the open models that are catching up. That are significantly more discounted, that also put pressure on the big players and keep them in check.
People aren't falling for PR; people are encouraging the PR to put pressure on the competition. It's not that hard.
Comment by frank_nitti 18 hours ago
Here and on AI tech subreddits (ones that aren’t specifically about local or FOSS) seem to have this dynamic, to the degree I’ve suspected astroturfing.
So it’s refreshing to see maybe that’s just a coincidence or confirmation bias on my end.
Comment by lxgr 17 hours ago
Comment by organsnyder 16 hours ago
Comment by dotancohen 16 hours ago
Thanks!
Comment by girvo 16 hours ago
It makes using my Claude Pro sub actually feasible: write a plan with it, pick it up with my local model and implement it, now I'm not running out of tokens haha.
Is it worth it from a unit economics POV? Probably not, but I bought this thing to learn how to deploy and serve models with vLLM and SGLang, and to learn how to fine tune and train models with the 128GB of memory it gets to work with. Adding up two 40GB vectors in CUDA was quite fun :)
I also use Z.ai's Lite plan for the moment for GLM-5.1 which is very capable in my experience.
I was using Alibaba's Lite Coding Plan... but they killed it entirely after two months haha, too cheap obviously. Or all the *claw users killed it.
Comment by jeremyjh 13 hours ago
Comment by girvo 12 hours ago
So I agree with you, it's better than Sonnet but way cheaper. I do wonder how long that will last though.
Comment by fragmede 9 hours ago
Comment by dotancohen 15 hours ago
Comment by botanrice 14 hours ago
Comment by organsnyder 16 hours ago
Most recently I used it to develop a script to help me manage email. The implementation included interacting with my provider over JMAP, taking various actions, and implementing an automated unsubscribe flow. It was greenfield, and quite trivial compared to the codebases I normally interact with, but it was definitely useful.
Comment by dotancohen 15 hours ago
Comment by bloppe 17 hours ago
Comment by nl 12 hours ago
The TL;DR is that unless you are doing it as a hobby or working in an environment where none of the data privacy options supported by Anthropic/OpenAI (including running on Azure/Bedrock with ZDR) work for you then it's not worth it.
The best open models are around the Sonnet 4.6 level. That's excellent, but the level of tasks you can give to GPT 5.4 or Opus 4.6 is just so much higher it doesn't compare (and Opus 4.7 seems noticeably better in my few hours of testing too).
I have my own benchmarks, but I like this much under-publicized OpenHands page: https://index.openhands.dev/home
It shows that for every task they test, closed models do the best. The closest an open model gets is Minmax 2.7 on issue resolution, where it's ~1% worse than the leaders.
That matches my experience - fine for small problems, but well behind as the task gets bigger.
Comment by echelon 12 hours ago
When I argue this, my point is that FOSS shouldn't target the desktop with open weights - it should target H200s. Really big parameter models with big VRAM requirements.
Those can always be distilled down, but you can't really go the other way.
Comment by whymememe 18 hours ago
Comment by dmix 9 hours ago
Comment by daveguy 17 hours ago
Subsidizing is the opposite of competing. It's literally the practice of underpricing your product to box out competition. If everyone was competing on a level playing field they would all price their products above cost.
All these tech oligarch asshat companies need to be regulated to hell and back.
Comment by ipaddr 17 hours ago
For many things now you need to go local and in the future if you want any privacy you'll need to go local.
Comment by daveguy 16 hours ago
Comment by agentifysh 16 hours ago
Comment by daveguy 15 hours ago
Comment by watwut 18 hours ago
Big players operating at loss to distort the market is not a good thing overall.
Comment by someotherperson 17 hours ago
It's not the smaller players spending billions on training data.
Comment by sofixa 11 hours ago
Comment by badrequest 11 hours ago
Comment by subscribed 6 hours ago
Comment by someotherperson 9 hours ago
Comment by sph 11 hours ago
Comment by the__alchemist 17 hours ago
- Claude: Good for ~20 minutes of work once every 4 hours
- Codex: Good for however long I want to use it.
Claude nerfed their product so that it's not usable, so I use something else.
Comment by CrazyStat 16 hours ago
Comment by botanrice 14 hours ago
Comment by CrazyStat 13 hours ago
- sysadmin tasks for my home server which runs home assistant, plex, and minecraft servers. Being able to tell it "Set up a minecraft fabric server with this list of mods" is pretty nice, and it's fairly competent at putting together home assistant dashboards and automations (make sure you have backups of anything it's allowed to touch, though--it may delete stuff without warning).
- Several small web apps primarily for my own use.
- Currently working on an opinionated desktop writing app for my own use.
Comment by KronisLV 16 hours ago
The Anthropic 20 USD plan would more or less be a non-starter for agentic development, at least for the projects that I work on, even while only working on a single codebase or task at a time (I usually do 1-3 at a time).
I would be absolutely bankrupt if I had to pay per-token. That said, I do mostly just throw Opus at everything (though it sometimes picks Sonnet/Haiku for sub-agents for specific tasks, which is okay), so it's probably not a 100% optimal approach, but I've wasted too much time and effort in the past on sub-optimal (non-SOTA) models anyways. I wonder which is closer to the actual cost and how much subsidizing is going on.
Comment by bitmasher9 15 hours ago
But Opus is both smarter and faster than GPT, so I can get a lot more done during the Claude limits.
Comment by lsdmtme 12 hours ago
Comment by the__alchemist 16 hours ago
Comment by rachel_rig 10 hours ago
Comment by ipaddr 16 hours ago
For me $20 a month is more than I want to spend I just use the free tiers. If I use AI in an app or site I use older models mostly chatgpt3.5. The challenge is more fun and it means I can do more like, make more api calls - 100x more.
Comment by XDataY 12 hours ago
Comment by dingnuts 16 hours ago
Comment by BrokenCogs 18 hours ago
Comment by unsupp0rted 18 hours ago
Comment by toraway 17 hours ago
Comment by unsupp0rted 17 hours ago
A couple weeks ago I'd get roughly 2~3 hours. And a month before that I couldn't break the 5-hour limit.
Comment by CuriouslyC 15 hours ago
Comment by unsupp0rted 4 hours ago
Comment by CuriouslyC 15 hours ago
Comment by boomskats 18 hours ago
Which makes it even more of a shame that Sam Altman is such a psychopathic jackass.
Comment by luddit3 18 hours ago
This is normal behavior and not a cause for such a hyperbolic response.
Comment by solenoid0937 15 hours ago
Pricing your product unsustainably vs a competitor to gain market share is regarded as "bad competition" and has historically been seen as anticompetitive.
It does not benefit the consumer in the long run, because the goal is to use your increased funding or cash reserve to wipe your competition out of the market, decreasing competition in the long term.
Then, once your competition is gone, and you've entrenched yourself, you do a rug pull.
Comment by byzantinegene 11 hours ago
Comment by pizzly 17 hours ago
Comment by solenoid0937 16 hours ago
Comment by toraway 15 hours ago
Comment by justapassenger 13 hours ago
Comment by guzfip 13 hours ago
Comment by m3nu 15 hours ago
> To help you go further with Codex, we’re introducing a new €114 Pro tier designed for longer, high-intensity sessions.
> At launch, this new tier includes a limited-time Codex usage boost, with up to 10x more Codex usage than Plus (typically 5x).
> As the Codex promotion on Plus winds down today, we’re rebalancing Plus usage to support more sessions across the week, rather than longer high-intensity sessions on a single day.
Comment by kar1181 17 hours ago
Comment by giancarlostoro 15 hours ago
Comment by zmmmmm 13 hours ago
We need to force them back into being providers of commodity services and hit this assumption they can mold things in real time on the head.
Comment by chaos_emergent 17 hours ago
Comment by peyton 17 hours ago
Comment by AlexCoventry 13 hours ago
Comment by keeganpoppen 17 hours ago
Comment by raincole 15 hours ago
It's because they don't support OpenCode.
Comment by yoyohello13 18 hours ago
Comment by olcay_ 16 hours ago
When OpenAI snatched those contracts, it made me think no worse of OpenAI. The surveillance was already factored into how I saw them (both).
Comment by jsemrau 16 hours ago
Comment by ra 12 hours ago
Comment by HWR_14 16 hours ago
Comment by khacvy 10 hours ago
Comment by a34729t 11 hours ago
Comment by iterateoften 12 hours ago
Comment by greenavocado 18 hours ago
They're doing a slow rollout
Comment by solenoid0937 15 hours ago
Comment by hyperionultra 21 hours ago
Comment by tvmalsv 21 hours ago
Comment by dilap 21 hours ago
Comment by trueno 21 hours ago
Comment by Austin_Conlon 21 hours ago
Comment by gbear605 19 hours ago
...at least for my account, the speed mode is 1.5x the speed at 2x the usage
Comment by Austin_Conlon 17 hours ago
Comment by romanovcode 21 hours ago
One main thing is to de-couple the repos from specific agents e.g. use .mcp.json instead of "claude plugins", use AGENTS.md (and symlink to CLAUDE.md) and so on.
I love this because I have absolutely 0 loyalty to any of these companies and once Anthropic nerfs I just switch to OpenAI, then I can switch to Google and so on. Whichever works best.
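The decoupling described above boils down to one setup step per repo: keep a single agent-agnostic instructions file and point vendor-specific names at it (equivalent to `ln -s AGENTS.md CLAUDE.md` at the repo root). A minimal sketch, using a temp directory as a stand-in repo:

```python
import os
import tempfile

# Stand-in for a repo root; in practice you'd run this where your code lives.
repo = tempfile.mkdtemp()
os.chdir(repo)

# One shared, agent-agnostic instructions file...
with open("AGENTS.md", "w") as f:
    f.write("# Shared agent instructions\n")

# ...with the vendor-specific filename symlinked to it.
if not os.path.lexists("CLAUDE.md"):
    os.symlink("AGENTS.md", "CLAUDE.md")  # Claude Code reads CLAUDE.md
```

Edits to either name then land in the same file, so switching agents never forks your instructions.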
Comment by finales 20 hours ago
Comment by fredericgalline 3 hours ago
Comment by graphememes 18 hours ago
Comment by hmokiguess 21 hours ago
Comment by jauntywundrkind 21 hours ago
Sure we can read the characters in the screen. But accessibility information is structured usually. TUI apps are going to be far less interesting & capable without accessibility built-in.
Comment by CrzyLngPwd 18 hours ago
They have AGI now?
Comment by hipshaker 16 hours ago
Comment by armcat 21 hours ago
Comment by SilverBirch 17 hours ago
Comment by rommelsLegacy 15 hours ago
I am speechless every time I see posts like this and the comments following. Vote with your behavior: stop supporting and enabling the Peter Thiel universe. Just a few weeks ago we had an op-ed about OpenAI and Sam; look into yourselves and really reflect on whom you are enabling by continuing to contribute to their baseline.
Comment by yoyohello13 11 hours ago
Comment by tty456 20 hours ago
Comment by saltyoldman 17 hours ago
Comment by eduction 19 hours ago
but there is no link, why would you not make this a link.
boggles my mind that companies make such little use of hypertext
Comment by huqedato 18 hours ago
Comment by thm 20 hours ago
Comment by ex-aws-dude 18 hours ago
Comment by TheServitor 6 hours ago
Comment by EthanFrostHI 5 hours ago
Comment by EthanFrostHI 8 hours ago
Comment by maryjeiel 11 hours ago
Comment by kevinten10 12 hours ago
Comment by nerdsfeed 20 hours ago
Comment by vox-machina 18 hours ago
Comment by throwaway911282 18 hours ago
Comment by dieortin 18 hours ago
Comment by BrokenCogs 18 hours ago
Comment by VadimPR 21 hours ago
Comment by duckmysick 21 hours ago
Comment by VadimPR 19 hours ago
Comment by rvz 21 hours ago
Comment by mrcwinn 21 hours ago
Comment by cmrdporcupine 21 hours ago
I don't like it, and I'm sure you don't either, but it's not a Mac. Or a Linux. And it's what most actual desktop users are stuck with, still.
Comment by croemer 21 hours ago
Comment by messh 18 hours ago
Comment by postalcoder 21 hours ago
Comment by avaer 21 hours ago
Comment by Glemllksdf 19 hours ago
It's clear that it will go in this type of direction, but Anthropic announced managed agents just a week ago, and this, again with all the built-in connections and tools, will help so many non-computer people do a lot more, faster and better.
I'm waiting for the open source ai ecosystem to catch up :/
Comment by lionkor 19 hours ago