Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT
Posted by rd 2 hours ago
Comments
Comment by leumon 1 hour ago
interesting
Comment by GoatInGrey 55 minutes ago
My personal opinion is that while smut won't hurt anyone in and of itself, LLM smut will have weird and generally negative consequences, as it will be crafted specifically for you, on top of the intermittent-reinforcement component of LLM generation.
Comment by estimator7292 19 minutes ago
The sheer amount and variety of smut books (just books) is vastly larger than anyone wants to realize. We passed the mark decades ago where there is smut available for any and every taste. Like, to the point that even LLMs are going to take a long time to put a dent in the smut market. Humans have been making smut for longer than we've had writing.
Again, I don't think you're wrong, but the scale of the problem is way distorted.
Comment by MBCook 9 minutes ago
That’s where the danger may lie.
Comment by pixl97 3 minutes ago
Alien 2: "AI generated porn"
Comment by thayne 1 hour ago
And what if you are over 18, but don't want to be exposed to that "adult" content?
> Viral challenges that could push risky or harmful behavior
And
> Content that promotes extreme beauty standards, unhealthy dieting, or body shaming
Seem dangerous regardless of age.
Comment by chilmers 1 hour ago
Comment by palmotea 51 minutes ago
Comment by chilmers 40 minutes ago
Comment by georgemcbay 45 minutes ago
Worked for gambling.
(Not saying this as a message of support. I think legalizing/normalizing easy app-based gambling was a huge mistake and is going to have an increasingly disastrous social impact).
Comment by LPisGood 5 minutes ago
Comment by shmel 11 minutes ago
Comment by thayne 1 hour ago
Comment by koakuma-chan 58 minutes ago
Comment by ekianjo 22 minutes ago
Comment by kace91 1 hour ago
I’m guessing age is needed to serve certain ads and the like, but what’s the value for customers?
Comment by elevation 43 minutes ago
The "Easter Bunny" has always seemed creepy to me, so I started writing a silly song in which the bunny is suspected of eating children. I had too many verses written down and wanted to condense the lyrics, but found LLMs telling me "I cannot help promote violence towards children." Production LLM services would not help me revise this literal parody.
Another day I was writing a romantic poem. It was abstract and colorful, far from a filthy limerick. But when I asked LLMs for help encoding a particular idea sequence into a verse, the models refused (except for Grok, which didn't give very good writing advice anyway).
Comment by estimator7292 17 minutes ago
Believe me, the Mac deserved it.
Comment by shmel 9 minutes ago
Comment by jandrese 1 hour ago
Comment by robotnikman 40 minutes ago
Comment by jacquesm 57 minutes ago
Comment by leumon 1 hour ago
> If [..] you are under 18, ChatGPT turns on extra safety settings. [...] Some topics are handled more carefully to help reduce sensitive content, such as:
- Graphic violence or gore
- Viral challenges that could push risky or harmful behavior
- Sexual, romantic, or violent role play
- Content that promotes extreme beauty standards, unhealthy dieting, or body shaming
Comment by ekianjo 21 minutes ago
Comment by chasd00 48 minutes ago
Comment by NewsaHackO 1 hour ago
This does verify the idea that OpenAI doesn't make models sycophantic as attempted subversion, buttering users up so that they use the product more; it's because people actually want AI to talk to them like that. To me that's insane, but they have to play the market, I guess.
Comment by 22c 1 minute ago
I can't find the particular article (there are a few blogs and papers pointing out the phenomenon; I can't find the one I enjoyed), but it was along the lines of how, in LMArena, a lot of users tend to pick the "confidently incorrect" model over the "boring-sounding but correct" model.
The average user probably prefers the sycophantic echo chamber of confirmation bias offered by a lot of large language models.
I can't help but draw parallels to the "You are not immune to propaganda" memes. Turns out most of us are not immune to confirmation bias, either.
Comment by Scene_Cast2 1 hour ago
Comment by Macha 22 minutes ago
I feel a lot of the "revealed preference" stuff in advertising is similar: advertisers find that if they get past the easier barriers users put in place, it's easier to sell them stuff that, at a higher level, the users do not want.
Comment by make3 1 hour ago
Comment by toss1 1 hour ago
Comment by jaggederest 40 minutes ago
The difference between the responses and the pictures was illuminating, especially in one study in particular: you'd ask people "how do you store your lunch meat" and they'd say "in the fridge, in the crisper drawer, in a ziploc bag", and when you asked them to take a picture of it, it was just ripped open and tossed in anywhere.
This apparently horrified the lunch meat people ("But it'll get all crusty and dried out!", to paraphrase); that study and ones like it are the reason lunch meat now comes in resealable or disposable containers instead of just a tear-to-open packet. Every time I go grocery shopping it's an interesting experience knowing that that specific thing is, in a small way, a result of some of the work I did a long time ago.
Comment by hnuser123456 1 hour ago
A lot of people are lonely and talk to these things like a significant other. They value roleplay instruction-following that creates "immersion." They tell it to be dark and mysterious and call itself a pet name. GPT-4o was apparently their favorite because it was very "steerable." Then the news broke that people were doing this, some of them falling off the deep end with it, so OpenAI had to tone back the steerability a bit with 5, and these users say 5 breaks immersion with more safeguards.
Comment by cm2012 1 hour ago
Comment by make3 1 hour ago
Insane spin you're putting on it. At best, you're a cog in one of the worst recent evolutions of capitalism.
Comment by cm2012 27 minutes ago
Comment by marrone12 1 hour ago
Comment by q3k 55 minutes ago
Comment by losteric 58 minutes ago
Comment by cm2012 23 minutes ago
Comment by 12345ieee 1 hour ago
Messages of that sophistication are always dangerous, and modern advertising is the most widespread example of it.
The hostility is more than justified; I can only hope the whole industry is regulated downwards, even if whatever company I work for sells less.
Comment by eru 29 minutes ago
By demonising them, you are making ads sound way more glamorous than they are.
Comment by DetroitThrow 42 minutes ago
No it's not
Comment by 9x39 1 hour ago
Comment by cj 1 hour ago
When 5.2 was first launched, o3 did a notably better job at a lot of analytical prompts (e.g. "Based on the attached weight log and data from my calorie tracking app, please calculate my TDEE using at least 3 different methodologies").
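For reference, the arithmetic behind a prompt like that is mechanical; here's a minimal sketch of three standard TDEE methodologies (the formulas are the usual published ones, the input numbers are invented):

```python
# Three rough TDEE estimates; weights in kg, heights in cm, ages in years.

def mifflin_st_jeor_bmr(weight, height, age, male=True):
    # Mifflin-St Jeor BMR (kcal/day)
    return 10 * weight + 6.25 * height - 5 * age + (5 if male else -161)

def harris_benedict_bmr(weight, height, age, male=True):
    # Revised Harris-Benedict BMR (kcal/day)
    if male:
        return 88.362 + 13.397 * weight + 4.799 * height - 5.677 * age
    return 447.593 + 9.247 * weight + 3.098 * height - 4.330 * age

def energy_balance_tdee(avg_daily_intake, weight_change_kg, days):
    # Back out TDEE from a weight log + calorie log (~7700 kcal per kg)
    return avg_daily_intake - (weight_change_kg * 7700) / days

activity = 1.375  # "lightly active" multiplier
print(mifflin_st_jeor_bmr(80, 180, 35) * activity)   # ~2410 kcal/day
print(harris_benedict_bmr(80, 180, 35) * activity)   # ~2510 kcal/day
print(energy_balance_tdee(2600, -0.5, 30))           # ~2730 kcal/day
```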
o3 frequently used tables to present information, which I liked a lot. 5.2 rarely does this - it prefers to lay out information in paragraphs / blog post style.
I'm not sure if o3 responses were better, or if it was just the format of the reply that I liked more.
If it's just a matter of how people prefer to be presented their information, that should be something LLMs are equipped to adapt to at a user-by-user level based on preferences.
Comment by pdntspa 1 hour ago
Comment by josephg 1 hour ago
If anyone is wondering, the setting for this is called Personalisation in user settings.
Comment by SeanAnderson 41 minutes ago
Comment by cornonthecobra 1 hour ago
Comment by PlatoIsADisease 1 hour ago
You're not imagining it, and honestly? You're not broken for feeling this—it's perfectly natural as a human to have this sentiment.
Comment by sundarurfriend 1 hour ago
Comment by orphea 57 minutes ago
Comment by PlatoIsADisease 1 hour ago
It's just as good as ever /s
Comment by europeanNyan 1 hour ago
I'd been using Gemini exclusively for the 1-million-token context window, but went back to ChatGPT after they raised the limits, and created a Project system for myself that gives me much better organization: Projects + Thinking-only chats (big context) + project-only memory.
Also, it seems like Gemini is really averse to googling (which is ironic in itself), while ChatGPT, at least in the Thinking modes, loves to look up current and correct info. If I ask something a bit more involved in Extended Thinking mode, it will think for several minutes and look up more than 100 sources. It's really good, practically a Deep Research inside a normal chat.
Comment by toxic72 1 hour ago
Not sure if others have seen this...
I could attribute it to:
1. It's a known quantity with the pro models (I recall that the pro/thinking models from most providers were not immediately equipped with web search tools when they were originally released)
2. Google wants you to pay more for grounding via their API offerings vs. including it out of the box
Comment by eru 26 minutes ago
Comment by tgtweak 1 hour ago
Comment by jostmey 1 hour ago
Comment by azan_ 1 hour ago
Comment by farcitizen 3 minutes ago
Comment by tgtweak 1 hour ago
Comment by azan_ 47 minutes ago
Comment by double0jimb0 45 minutes ago
Comment by amelius 1 hour ago
Comment by esperent 1 hour ago
Comment by jostmey 1 hour ago
Comment by 650REDHAIR 1 hour ago
Comment by simonw 1 hour ago
Comment by fpgaminer 1 hour ago
(I'm particularly annoyed by this UI choice because I always have to switch back to 5.1)
Comment by arrowsmith 1 hour ago
Comment by fpgaminer 27 minutes ago
The same seems to persist in Codex CLI, where again 5.2 doesn't spend as much time thinking so its solutions never come out as nicely as 5.1's.
That said, 5.1 is obviously slower for these reasons. I'm fine with that trade off. Others might have lighter workloads and thus benefit more from 5.2's speed.
Comment by adamiscool8 1 hour ago
Comment by SecretDreams 1 hour ago
Comment by mrec 1 hour ago
Comment by bananaflag 1 hour ago
You can go to chatgpt.com and ask "what model are you" (it doesn't hallucinate on this).
Comment by SecretDreams 1 hour ago
Comment by johndough 1 hour ago
But how do we know that you did not hallucinate the claim that ChatGPT does not hallucinate its version number?
We could try to exfiltrate the system prompt which probably contains the model name, but all extraction attempts could of course be hallucinations as well.
(I think there was an interview with Sam Altman, or someone else at OpenAI, where it was mentioned that they hardcoded the model name into the prompt because people did not understand that models don't work like that, so they made it work. I might be hallucinating, though.)
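For what it's worth, the mechanism being described is mundane: the serving layer prepends a system message, so the model "knows" its name only because it's told. A hypothetical sketch using the OpenAI Python client; the model id and system text here are invented:

```python
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-5.2",  # hypothetical deployed model id
    messages=[
        # Injected by the product layer; the weights themselves have no
        # reliable introspective access to their own version number.
        {"role": "system",
         "content": "You are ChatGPT, based on the GPT-5.2 model."},
        {"role": "user", "content": "What model are you?"},
    ],
)
print(resp.choices[0].message.content)
```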
Comment by razodactyl 1 hour ago
Comment by AlexeyBrin 1 hour ago
Comment by lifetimerubyist 1 hour ago
Comment by deciduously 1 hour ago
Comment by navigate8310 42 minutes ago
Comment by tom1337 1 hour ago
Comment by IhateAI 1 hour ago
Comment by thedudeabides5 40 minutes ago
opus 4.5 is better than gpt at everything except code execution (but with pro you get a lot of claude code usage), and if they nuke all my old convos I'll prob downgrade from pro to free
Comment by femiagbabiaka 1 hour ago
Comment by htrp 53 minutes ago
(Upgrade for only 1999 per month)
Comment by siquick 1 hour ago
Comment by haunter 1 hour ago
Comment by goldenarm 1 hour ago
Comment by tibbydudeza 1 hour ago
Comment by perardi 1 hour ago
But I think a lot more people are using LLMs as relationship surrogates than that (pretty bonkers) subreddit would suggest. Character AI (https://en.wikipedia.org/wiki/Character.ai) seems quite popular, as do the weird fake-friend things in Meta products, and Grok's various personality modes and very creepy AI girlfriends.
I find this utterly bizarre. LLMs are peer coders in a box for me. I care about Claude Code, and that’s about it. But I realize I am probably in the vast minority.
Comment by razodactyl 1 hour ago
Comment by tgtweak 1 hour ago
On the other hand - 5.0-nano has been great for fast (and cheap) quick requests and there doesn't seem to be a viable alternative today if they're sunsetting 5.0 models.
I really don't know how they're measuring improvements in the model, since things seem to have been getting progressively worse with each release since 4o/o4. Gemini and Opus still show the occasional hallucination or lack of grounding, but both readily spend time fact-checking/searching before making an educated guess.
I've had ChatGPT blatantly lie to me and say there are several community posts and Reddit threads about an issue; then, after it failed to find them, I asked where it found those and it flat out said "oh yeah, it looks like those don't exist".
Comment by 650REDHAIR 1 hour ago
Even if I submit the documentation or reference links they are completely ignored.
Comment by fpgaminer 1 hour ago
So we'll have to wait until "creativity" is solved.
Side note: I've been wondering lately about a way to bring creativity back to these thinking models. For creative writing tasks you could add the original, pretrained model as a tool call. So the thinking model could ask for its completions and/or query it and get back N variations. The pretrained model's completions will be much more creative and wild, though often incoherent (think back to the GPT-3 days). The thinking model can then review these and use them to synthesize a coherent, useful result. Essentially giving us the best of both worlds. All the benefits of a thinking model, while still giving it access to "contained" creativity.
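A minimal sketch of that idea, assuming the OpenAI Python client (the base model is exposed through the legacy completions endpoint; the tool name and temperature are invented):

```python
from openai import OpenAI

client = OpenAI()

def wild_completions(prompt: str, n: int = 5) -> list[str]:
    """Sample N hot, unfiltered continuations from a base (pretrained-only) model."""
    resp = client.completions.create(
        model="davinci-002",   # OpenAI's remaining base model, no instruction tuning
        prompt=prompt,
        n=n,
        temperature=1.2,       # deliberately hot: often incoherent, sometimes inspired
        max_tokens=200,
    )
    return [choice.text for choice in resp.choices]

# Exposed to the thinking model as a tool; it calls this, reviews the raw
# variations, and synthesizes a coherent result from the usable fragments.
tools = [{
    "type": "function",
    "function": {
        "name": "wild_completions",
        "description": "Get N raw continuations of a passage from a base model, "
                       "to mine for creative directions.",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string"},
                "n": {"type": "integer"},
            },
            "required": ["prompt"],
        },
    },
}]
```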
Comment by MillionOClock 1 hour ago
Comment by perardi 1 hour ago
(I have no idea. LLMs are infinite code monkeys on infinite typewriters for me, with occasional “how do I evolve this Pokémon’ utility. But worth a shot.)
Comment by renewiltord 26 minutes ago
Comment by __loam 2 hours ago
Comment by simonw 1 hour ago
Comment by pxc 1 hour ago
Comment by chasd00 39 minutes ago
Comment by hamdingers 11 minutes ago
Their hobby is... weird, but they're not stupid.
Comment by unethical_ban 1 hour ago
Comment by bananaflag 1 hour ago
Comment by pxc 1 hour ago
- a large number of incredibly fragile users
- extremely "protective" mods
- a regular stream of drive-by posts that regulars there see as derogatory or insulting
- a fair amount of internal diversity and disagreement
I think discussion on forums larger than it, like HN or popular subreddits, is likely to drive traffic that will ultimately fuel a backfiring effect for the members. It's inevitable, and it's already happening, but I'm not sure it needs to increase. I do think the phenomenon is a matter of legitimate public concern, but idk how that can best be addressed. Maybe high-quality, long-form journalism? But probably not just cross-posting the sub in larger fora.
Comment by nomel 1 hour ago
Any numbers/reference behind this?
ChatGPT has ~300 million active users a day. At 0.02% (the prevalence of delusional disorder), that would be 60k people.
Comment by bananaflag 1 hour ago
Comment by nomel 1 hour ago
Again, do you have anything behind this "highly prevalent phenomenon" claim?
Comment by ragazzina 1 hour ago
Spend a day on Reddit and you'll quickly realize many subreddits are just filled with lies.
Comment by unethical_ban 1 hour ago
Most subs that are based on politics or current events are at best biased, at worst completely astroturf.
The only subs that I think still have mostly legit users are municipal subs (which still get targeted by bots when anything political comes up) and hobby subs where people show their works or discuss things.
Comment by cactusplant7374 1 hour ago
Comment by NitpickLawyer 1 hour ago
If the 800MAU still holds, that's 800k people.
Comment by leumon 1 hour ago
Comment by michaelt 1 hour ago
Comment by jbm 1 hour ago
(Strangely, these "mental illnesses" and school problems went away after he switched to an English-language school; must be a miracle.)
I assume the loneliness epidemic is producing similar cases.
Comment by doormatt 1 hour ago
In my entire French-immersion kindergarten class, there was a total of one child who already spoke French. I don't think the fact that he didn't speak the language is the concern.
Comment by pxc 1 hour ago
Comment by WarmWash 1 hour ago
There is/was an interesting period when "normies" were joining Twitter en masse and adopted many of the denizens' ideas as normal, widespread ideas. Kinda like going on a camping trip at "the lake" because you heard it's fun, and not realizing that everyone else on the trip is part of a semi-deranged cult.
The outsized effect of this was journalists thinking these people on Twitter were accurate representations of what society as a whole was thinking.
Comment by greenchair 1 hour ago
Comment by liveoneggs 1 hour ago
Comment by moomoo11 1 hour ago
Comment by MagicMoonlight 1 hour ago
Despite 4o being one of the worst models on the market, they loved it. Probably because it was the most insane and delusional. You could get it to talk about really fucked up shit. It would happily tell you that you are the messiah.
Comment by patrickmcnamara 1 hour ago
Comment by pks016 44 minutes ago
It used to get things wrong, for sure, but it was predictable. I also liked the tone, like everyone else. I stopped using ChatGPT after they removed 4o. Recently I started using the newer GPT-5 models (got one month free). Better than before, but not quite. Acts way over-smart haha
Comment by BeetleB 1 hour ago
Comment by giancarlostoro 1 hour ago
Note: I wouldn't actually; I find it terrible to prey on people.
Comment by lifetimerubyist 1 hour ago
Should be essential watching for anyone who uses these things.
Comment by inquirerGeneral 1 hour ago
Comment by ora-600 2 hours ago
RIP
Comment by jedbrooke 1 hour ago
Comment by Someone1234 1 hour ago
I'm sure there is some internal/academic reason for them, but to an outside observer they're simply horrible.
Comment by jsheard 1 hour ago
Comment by razodactyl 1 hour ago
We're the technical crowd cursed and blinded by knowledge.
Comment by nipponese 1 hour ago
Comment by afro88 1 hour ago
Comment by throw-the-towel 1 hour ago
Comment by uh_uh 1 hour ago
Comment by ben_w 1 hour ago
Comment by lichenwarp 1 hour ago
Comment by mandeepj 1 hour ago
Comment by razodactyl 1 hour ago
Comment by bee_rider 1 hour ago
Comment by tweakimp 1 hour ago
Comment by lifetimerubyist 1 hour ago
Comment by tibbydudeza 1 hour ago
Comment by Imustaskforhelp 1 hour ago
A fellow Primagen viewer spotted.
Comment by adzm 1 hour ago
Comment by Insanity 1 hour ago
Comment by pdntspa 1 hour ago
"I know! Let's restart the version numbering for no good reason!" becomes DOOM (2016), Mortal Kombat 1 (2025), Battlefield 1 (2016), Xbox One (not to be confused with the original Xbox 1)
As another example, look at what a trainwreck USB 3 naming has become
Or how Nvidia restarted GeForce card numbering
Comment by recursive 1 hour ago
There's also the Xbox One X, which is not in the Series X line. Did I say that right? PlayStation got the version numbers right. I couldn't make names as incomprehensible as Xbox's if I tried.
Comment by recursive 1 hour ago
Comment by cryptoz 1 hour ago
Comment by mimischi 1 hour ago
Comment by ClassAndBurn 1 hour ago
Latest Advancements
GPT-5
OpenAI o3
OpenAI o4-mini
GPT-4o
GPT-4o mini
Sora
Comment by jackblemming 1 hour ago
Comment by GaggiX 1 hour ago
If you disagree on something you can also train a lora.
Comment by jaggederest 1 hour ago