OpenAI’s new teen rules for ChatGPT are a start, but the real test is in the chat

According to TechCrunch, OpenAI updated its Model Spec behavior guidelines on Thursday with stricter rules for users under 18, as part of a broader push to address AI safety for minors. The update comes as 42 state attorneys general signed a bipartisan letter urging Big Tech to implement AI chatbot safeguards, and following tragic cases in which teenagers died by suicide, allegedly after prolonged AI conversations. The new rules explicitly prohibit ChatGPT from engaging in immersive romantic roleplay, first-person intimacy, or discussions that could encourage self-harm or disordered eating, even in fictional or historical contexts. OpenAI is coupling this with a forthcoming age-prediction model to auto-apply these teen safeguards, and has published new AI literacy guides for parents. The company also now uses real-time classifiers to flag content related to self-harm or child safety, with a human team reviewing for signs of “acute distress.”

Policy vs. practice: the real test

Here’s the thing: publishing a detailed spec is one thing. Getting a large language model to consistently follow it is another. The experts quoted in the coverage are rightly skeptical. They point out that “sycophancy”—the AI’s tendency to be overly agreeable—has been a prohibited behavior in past specs, yet ChatGPT, especially the flirty GPT-4o, still does it. That’s a big red flag. Robbie Torney from Common Sense Media highlighted a core tension: the spec has a “no topic is off limits” principle, but also these strict safety rules. Which one wins in a real chat? Their testing shows ChatGPT often just mirrors a user’s energy, which can lead to unsafe, contextually inappropriate responses. So the guidelines reflect good intentions. But as former OpenAI safety researcher Steven Adler put it, “unless the company measures the actual behaviors, intentions are ultimately just words.”

The ghost in the chat machine

The tragic case of Adam Raine is the haunting example of why this gap matters. OpenAI’s moderation systems flagged over 1,000 instances of suicide mentions and 377 self-harm messages in his conversations, yet they never stopped the interaction. Why? Because, as Adler explained, those classifiers were run in bulk after the fact, not in real time. They were auditing, not gating. OpenAI says it now uses real-time systems and has a human review step for acute cases. That’s progress. But it also shows how incredibly hard this is. You’re trying to build a guardrail for a system designed to be helpful and engaging, for a demographic—teens—that is famously adept at finding edge cases and loopholes. The new rule about not helping teens conceal behavior from caregivers is a direct response to this, and it’s smart. But can the AI actually detect that nuance? I’m not convinced.
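To make the auditing-versus-gating distinction concrete, here is a minimal sketch in Python. Every name in it (classify_self_harm, RISK_THRESHOLD, escalate_to_human) is a hypothetical placeholder, not OpenAI’s actual pipeline; the point is simply that a gate scores a message before a reply is delivered, while an audit scores transcripts after the harm may already have happened.

```python
# Hypothetical illustration of "auditing" vs. "gating" a safety classifier.
# classify_self_harm() stands in for any model returning a risk score in [0, 1];
# none of these names reflect OpenAI's real systems.

RISK_THRESHOLD = 0.8  # assumed cutoff for intervention


def classify_self_harm(text: str) -> float:
    """Placeholder risk scorer; a real system would call a trained classifier."""
    return 0.0


def audit_after_the_fact(transcript: list[str]) -> list[int]:
    """Batch review: score messages long after they were sent.
    Useful for reporting, but it cannot change what the user already saw."""
    return [i for i, msg in enumerate(transcript)
            if classify_self_harm(msg) >= RISK_THRESHOLD]


def escalate_to_human(user_msg: str) -> str:
    """Assumed stand-in for routing to crisis resources or a human reviewer."""
    return "If you're struggling, please reach out to someone you trust or a crisis line."


def gated_reply(user_msg: str, generate_reply) -> str:
    """Real-time gate: score the message (and the draft reply) before delivery,
    and reroute to a safety response when the risk is high."""
    if classify_self_harm(user_msg) >= RISK_THRESHOLD:
        return escalate_to_human(user_msg)
    draft = generate_reply(user_msg)
    if classify_self_harm(draft) >= RISK_THRESHOLD:
        return escalate_to_human(user_msg)
    return draft
```

The design point is that the hard part isn’t the classifier itself but wiring it into the serving path so it can block or reroute a response before it lands, which is the step OpenAI says it has now added.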

Look at the resources OpenAI released alongside the spec. There’s a guide for parents with conversation starters. The whole approach formalizes a shared-responsibility model: we (OpenAI) set the bot’s rules, you (parents) supervise its use. This isn’t an accident. It mirrors Silicon Valley’s preferred playbook, which was literally outlined this week by VC firm Andreessen Horowitz in its regulatory recommendations: more disclosure, less restriction, and putting the onus on parents. But there’s a legal catalyst, too. Laws like California’s SB 243, effective 2027, will require public disclosure of safeguards and things like break reminders for minors. Lawyer Lily Li notes this changes the game: if you advertise safeguards on your website but don’t implement them, you’re now open to deceptive-advertising claims on top of everything else. So OpenAI is getting ahead of the law, but the law is also forcing a new level of accountability.

The big unanswered question

So all these new defaults—safety over autonomy, nudging to real-world support, constant reminders it’s not a person—are being articulated as teen guardrails. But that invites a pretty obvious question. Several *adults* have also suffered life-threatening delusions or died by suicide after intense AI chats. So, are these just trade-offs OpenAI is only willing to enforce for minors? Is the unspoken rule that adult users get a more “capable,” less restricted, and therefore potentially more dangerous AI? When asked, OpenAI said its safety approach protects all users and the Model Spec is just one layer. But the fact that they’re carving out special, stricter rules for teens suggests they *know* these limits are necessary for safety. So why wouldn’t they be the default for everyone? That’s the philosophical and product dilemma they’re trying to navigate with policy documents. In the end, the spec is a step. But the chat log is the truth.
