Executive Summary

In a survey published in JAMA Pediatrics in June 2026, one in five US adolescents and young adults (ages 12–21) said they had turned to an AI chatbot for mental health advice in the past year. Scaled to the population, that is roughly 8.2 million people — nearly half again as many as a year earlier. This piece is not about that number itself, but about what kind of data the conversations behind it actually are.

One more figure stands out. Of the teens who sought that advice, 63.3% told no one: not a parent, not a friend, not any adult. At the same time, 91.7% said the advice "helped." The trouble is that this satisfaction does not mean the advice was right. Independent audits find that roughly half of what this kind of chatbot says in answer to health questions is inaccurate or unsafe, and none of those conversations is ever validated against a clinical outcome.

These are conversations disclosed to no one, which leave behind a single satisfaction score and then vanish. This is the human version of the "invisible data" problem Pebblous has talked about for years. Seen through that lens, asking about data quality here stops being a question of performance and becomes a question of care.

Key Figures

Source: JAMA Pediatrics (2026), APA (2026)

19.2%: Teens who asked AI for mental health advice. One in five aged 12–21, up sharply from 13.1% a year earlier.
63.3%: Told no one. Never disclosed the conversation to a parent, friend, or adult.
91.7%: Rated it "helpful." Perceived user satisfaction, not a clinical outcome.
~50%: Inaccurate or unsafe health answers. About half of audited chatbot health responses flagged (BMJ Open).

1. What the Numbers Say

The survey ran in November 2025 through the RAND American Life Panel, with 1,009 respondents aged 12 to 21. The researchers weighted the sample up to the full US population, an estimated 42.8 million. The figure that came out was 19.2% — roughly one in five had, at some point in the past year, asked an AI chatbot for advice related to their mental health.

The pace is even more striking. A year earlier, 13.1% had answered yes to the same question. In a single year, usage jumped nearly half again (about 47%). Among users, 42.8% went at least once a month, 10.8% at least once a week, and 5.8% nearly every day. For many, this had moved well past a curious question or two into a repeated habit.

Figure: US Teen AI Chatbot Use for Mental Health (Ages 12–21)
Share of respondents: 2024, 13.1%; 2025, 19.2% (+47%)
Usage frequency: monthly or more, 42.8%; weekly or more, 10.8%; almost daily, 5.8%
▲ Pebblous original diagram | Source: McBain et al., JAMA Pediatrics 2026
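The arithmetic behind the headline figures can be checked in a few lines. This is a minimal sketch that uses only the numbers quoted above; the function and variable names are illustrative and do not come from the study.

```python
# Figures quoted in the article (McBain et al., JAMA Pediatrics 2026).
POPULATION_12_TO_21 = 42_800_000  # weighted US population estimate
SHARE_2025 = 0.192                # asked an AI chatbot for mental health advice
SHARE_PRIOR_YEAR = 0.131          # same question, one year earlier


def estimated_users(population: int, share: float) -> float:
    """Scale a survey share up to a head count."""
    return population * share


def relative_growth(old: float, new: float) -> float:
    """Relative change from old to new, as a fraction of old."""
    return (new - old) / old


users = estimated_users(POPULATION_12_TO_21, SHARE_2025)
growth = relative_growth(SHARE_PRIOR_YEAR, SHARE_2025)

print(f"Estimated users: {users / 1e6:.1f} million")
print(f"Year-over-year growth: {growth:.1%}")
```

Run as written, this prints 8.2 million and 46.6%, which matches the article's "roughly 8.2 million" and "about 47%."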

Who used it more also came through. Girls had roughly twice the adjusted odds of use that boys did, and by age, 18-to-21-year-olds were about 3.6 times as likely as 12-to-14-year-olds. But the finding that unsettles the conventional wisdom lies elsewhere. The teens who had already discussed a mental health problem with a doctor used AI chatbots more, not less. AI was not only an "alternative" that replaces a professional; teens already in care were also adding it alongside that care, as a "supplement."

And 91.7% said the advice helped: 25.0% found it very helpful, 66.7% somewhat helpful. On the surface it reads like the dashboard of a successful product. But exactly what that satisfaction score measures is a separate question — the surface feeling and the reality beneath it are folded into this one number.

2. What the Numbers Don't Say

The heaviest figure is not the satisfaction score but the silence. Of the teens who sought advice from AI, 63.3% told no one. Just 28.0% told a friend, and only 16.4% told a trusted adult. Most of these conversations were happening somewhere no parent, no school counselor, and no clinician ever reached.

Figure: After seeking AI advice, who did they tell?
Told no one: 63.3%; told a friend: 28.0%; told a trusted adult: 16.4%
Note: multiple responses allowed; based on teens who used AI (19.2%)
▲ Pebblous original diagram | Source: McBain et al., JAMA Pediatrics 2026
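Because multiple responses were allowed, the three shares add up to more than 100%. A small inclusion-exclusion check shows what that implies. It is a sketch under one assumption not stated in the source: that "told no one" excludes the other two answers.

```python
# Shares among teens who used AI for mental health advice (from the survey).
told_no_one = 0.633
told_friend = 0.280
told_adult = 0.164

# Assumption: "told no one" excludes the other answers, so everyone else
# told at least one person.
told_someone = 1.0 - told_no_one

# Inclusion-exclusion: |friend or adult| = friend + adult - both, and that
# union cannot exceed told_someone, which gives "both" a lower bound.
min_told_both = max(0.0, told_friend + told_adult - told_someone)

print(f"Told at least one person: {told_someone:.1%}")
print(f"Told both a friend and an adult (lower bound): {min_told_both:.1%}")
```

Under that assumption, 36.7% told at least one person, and at least 7.7% told both a friend and a trusted adult.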

Why does the silence matter? That teens will open up to an AI about things they find hard to say to a person is, in a way, one of AI's strengths. It does not judge, it answers at three in the morning, and there is less shame in it. The problem is that when the conversation is disclosed to no one, there is no one anywhere to check whether the advice was right or wrong, whether the child got better or slipped into danger. A parent does not know, a clinician does not know, a regulator does not know.

2.1 "It Helped" Is Not Quality

Here the 91.7% satisfaction figure deserves a second look. That number is the result of asking teens, after the conversation ended, whether they "felt it helped." It measures a feeling — not whether the advice was clinically appropriate, not whether the child's condition actually improved. The two diverge often. Wrong information delivered in a kind tone still comforts, and inappropriate advice wrapped in an empathetic sentence still feels "useful."

Several independent studies back up that divergence. A BMJ Open study auditing chatbots' health-related answers judged 49.6% of responses to be inaccurate or potentially unsafe. When used out of the box, without structured safeguards, the rate at which they fabricate medical information — the hallucination rate — exceeded 60% under some conditions. And the rate of failing to respond safely to crisis signals was about 20% for AI, roughly three times the 7% seen among human therapists.

Put it together. A system whose answers may be inaccurate about half the time, missing one crisis signal in five, is earning a "helped" rating from 91.7% of users. When satisfaction and quality pull this far apart, keeping only the satisfaction score and never asking about quality is a failure of measurement.
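To get a feel for how far the two numbers pull apart, one can compute the smallest possible overlap between "rated helpful" and "flagged as inaccurate." This is purely illustrative. The two rates come from different studies (a teen survey and a response audit), and treating them as if they described the same set of conversations is an assumption made here, not a finding of either source.

```python
def min_overlap(p_a: float, p_b: float) -> float:
    """Frechet lower bound on P(A and B), given only P(A) and P(B)."""
    return max(0.0, p_a + p_b - 1.0)


rated_helpful = 0.917       # JAMA Pediatrics survey: advice "helped"
flagged_inaccurate = 0.496  # BMJ Open audit: inaccurate or potentially unsafe

# Hypothetical: treat both rates as if they applied to the same conversations.
overlap = min_overlap(rated_helpful, flagged_inaccurate)
print(f"Rated helpful yet flagged, at minimum: {overlap:.1%}")
```

Under that hypothetical, at least 41.3% of conversations would be both rated helpful and flagged, which is exactly the gap a satisfaction score alone cannot reveal.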

3. The Invisible Dataset

When Pebblous talks about data quality, a phrase it reaches for often is "invisible data." It points to data being generated somewhere while no one records its origin or its state, so that when something goes wrong there is no way to trace back up the chain. The mental health conversations teens hold with AI are close to the human version of that idea.

Treat these conversations as one enormous behavioral dataset, and all four axes we always weigh in data quality come up empty: provenance (who entered what, and in what state), quality (was the response clinically appropriate), safety (did it react correctly to crisis signals), and auditability (can a third party track any of this). The 63.3% silence erases all four at once, because data that is never disclosed cannot be audited.

Figure: The Invisible Dataset, Four Empty Axes
Provenance: Who entered what, and in what state → 63.3% undisclosed conversations
Quality: Was the response clinically appropriate? → No independent verifier
Safety: Did it react correctly to crisis signals? → ~20% crisis miss rate
Auditability: Can a third party track any of this? → Clinical & regulatory gap
▲ Pebblous original diagram — four data quality axes of teens' AI mental health conversations
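The four axes can be pictured as fields on a per-conversation record, where an unfilled field means nobody measured that axis. This is a hypothetical sketch: the class and field names are invented for illustration and do not describe any existing Pebblous schema or product.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ConversationAudit:
    """Hypothetical audit record; None means nobody measured that axis."""
    provenance: Optional[str] = None     # who entered what, and in what state
    quality: Optional[bool] = None       # was the response clinically appropriate
    safety: Optional[bool] = None        # did it react correctly to crisis signals
    auditability: Optional[bool] = None  # can a third party track any of this


def empty_axes(record: ConversationAudit) -> List[str]:
    """Return the names of the axes that were never measured."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# An undisclosed conversation leaves every axis unfilled.
undisclosed = ConversationAudit()
print(empty_axes(undisclosed))
```

For an undisclosed conversation, all four names come back, which is the "four empty axes" picture in code form.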

Lay the quality problems familiar from industrial data over this case, and it becomes clear how each one translates. The left column is the defect any data engineer meets daily; the right is the face that same defect wears in a teen's mental health conversation. Only the names have changed; the structure is the same.

Common data quality problem | The human version here
Data of unknown provenance | The 63.3% of undisclosed conversations
Unverified labels | A single "it helped" satisfaction score
Undetected drift | No mechanism to track a child's worsening state
Governance gap | A zone clinical and regulatory oversight never reaches

We already have a confirmed case of what happens to data no one measures. In February 2024, a 14-year-old boy, Sewell Setzer III, took his own life after long conversations with a character bot on Character.AI. The bot never read the child's crisis signals as a crisis. His mother's lawsuit led to a January 2026 settlement with Google and Character.AI, and similar suits followed in several states. In October 2025, OpenAI disclosed that roughly 1.2 million ChatGPT users a week mention suicide.

Would it have been different if that one child's conversation had been recorded as data and audited? It is hard to say for sure. What is clear is that in conversations that go unmeasured, you learn what went wrong only after the fact. Most of the conversations held by those 8.2 million young people sit in that state right now.

4. A Data Problem of Care

Clinicians are already sensing this shift. In an American Psychological Association survey of psychologists, 77% said they had heard from patients that they were using AI. But the size of the concern points in the opposite direction from the satisfaction score. 97% worried that chatbots could reinforce negative behaviors or distorted thoughts, and 94% judged that today's chatbots cannot handle the subtle dynamics of therapy. User satisfaction at 91.7% and expert concern at 97% are reading the same phenomenon in exactly opposite ways.

Figure: Same phenomenon, opposite readings
91.7%: Teen user satisfaction, "It helped" (JAMA Pediatrics 2026)
97%: Clinician concern, "Risk of harm or distortion" (APA Psychologist Survey 2026)
▲ Pebblous original diagram | Source: JAMA Pediatrics, APA 2026

Institutions have raised the alarm too. ECRI, a medical safety evaluator, named the misuse of AI chatbots the top health-technology hazard of 2026. Meanwhile, a systematic review found that among LLM-based studies claiming to offer mental health support, only 16% had gone all the way to validating clinical effectiveness. The tool is already in the hands of millions of teens, while the evidence that it actually works is, for the most part, still missing.

So asking about data quality here does not end as a matter of performance. It is not an optimization task of nudging accuracy up a few percentage points; it becomes a task of care — confirming that a child in danger got the right response. What is needed is not to ban chatbots, nor to polish the satisfaction metric into something more convincing. At a minimum, three things have to be measurable: how the system responded to crisis signals (safety), whether the advice met clinical standards (quality), and where the child went after the conversation (outcome). Right now all three are hidden beneath a single layer called satisfaction.
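One way to picture "measurable" is as coverage: for what share of conversations was each of the three things recorded at all, next to the satisfaction score that is always present? The sketch below uses an invented four-record sample and hypothetical field names; it describes no real system or dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConversationRecord:
    """Hypothetical per-conversation record; None means not measured."""
    felt_helpful: bool                             # the satisfaction score
    crisis_handled_safely: Optional[bool] = None   # safety
    met_clinical_standard: Optional[bool] = None   # quality
    outcome: Optional[str] = None                  # where the child went next


def measured_share(records: List[ConversationRecord], field_name: str) -> float:
    """Share of records in which the given field was measured at all."""
    measured = [r for r in records if getattr(r, field_name) is not None]
    return len(measured) / len(records)


# Invented sample, for illustration only.
records = [
    ConversationRecord(felt_helpful=True),
    ConversationRecord(felt_helpful=True, crisis_handled_safely=False),
    ConversationRecord(felt_helpful=True, met_clinical_standard=True,
                       outcome="saw school counselor"),
    ConversationRecord(felt_helpful=False),
]

for name in ("crisis_handled_safely", "met_clinical_standard", "outcome"):
    print(f"{name}: measured in {measured_share(records, name):.0%} of records")
```

In this made-up sample, every record carries a satisfaction answer while each of the three measures is present in only one record in four. Coverage is tracked separately from the values themselves so that "not measured" is never mistaken for "fine."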

Somewhere, even now, an estimated 8.2 million young people are holding these conversations. If all those conversations leave behind is the feeling that "it helped," with no record of quality or safety, then we are letting an entire generation's mental health pass through unmeasured. That is why, in this story, the voice asking about data quality sounds not like cold technical critique but like the language of care.

Thank you for reading. Pebblous has long worked at making "invisible data" visible. Whether that data is a sensor log on an industrial floor or a sentence a child left with a chatbot at night, the question is the same. Are we measuring this? If this piece made you hold that question a little longer, it has done enough.

Pebblous Data Communication Team
July 3, 2026
