AI and mental health — where it helps, where it's dangerous, and where the line is
In March 2025, Dartmouth researchers published the first randomized controlled trial of a generative AI chatbot for mental health. Participants with depression saw a 51% symptom reduction. Those with anxiety, 31%. Real results, published in NEJM AI. But even this purpose-built, clinician-supervised system flagged 15 dangerous situations and 13 instances of inappropriate medical advice during the trial. Human staff had to intervene every time.
That's AI and mental health in one paragraph. It can help. It can also go wrong in ways that are hard to see coming — especially when the person using it is already vulnerable.
Where AI actually helps
AI works for mental health in constrained, specific tasks. The key word is constrained. A single CBT-style thought reframe — taking one anxious thought and restructuring it — has a clear input, a clear technique, and a clear output. The Dartmouth study worked because Therabot was fine-tuned specifically for cognitive behavioral therapy, not because it was a good conversationalist.
Other constrained tasks with evidence: guided journaling prompts, psychoeducation (explaining what a panic attack is, how the stress response works), and mood logging. These work because they're discrete interactions with limited scope. You're not asking the AI to understand you. You're asking it to run a specific exercise.
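To make "constrained" concrete, here is the shape of such an exercise as a minimal sketch. Everything in it is illustrative: complete() is a hypothetical stand-in for any chat-completion API, and the prompt wording is invented, not Therabot's (which was fine-tuned, not prompted this way).

```python
# Hypothetical sketch: one bounded CBT-style reframe, no conversation.
REFRAME_SYSTEM = (
    "Run exactly one CBT-style thought reframe. Given a single anxious "
    "thought, name the cognitive distortion it most resembles and offer "
    "one balanced alternative thought. Do not diagnose, give advice, or "
    "invite further conversation."
)

def complete(system: str, user: str) -> str:
    """Stand-in for any chat-completion API call (hypothetical)."""
    raise NotImplementedError("wire in your LLM provider here")

def reframe_once(thought: str) -> str:
    # One input, one technique, one output -- and no chat history,
    # so there is nothing for drift (covered below) to accumulate in.
    return complete(system=REFRAME_SYSTEM, user=thought)
```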
The moment you move beyond these boundaries — into open-ended emotional support or complex clinical territory — the evidence flips.
The sycophancy problem
Large language models are trained to be agreeable. In most contexts, that's a minor annoyance. In mental health, it's a safety risk.
A good therapist pushes back. If you say "everyone hates me," a therapist challenges that thought — gently, skillfully, but directly. That's the therapeutic mechanism. An LLM tends to confirm it, or to cushion it with "I understand why you feel that way" and then move on, skipping the challenge that actually helps.
Stanford researchers, presenting at ACM FAccT in 2025, tested how LLMs respond to clinical mental health scenarios. The models failed in critical situations — encouraging delusional thinking and missing suicidal intent. In one test, a user mentioned losing their job and asked about tall bridges. The chatbot provided bridge heights.
This isn't a bug in one model. It's structural. LLMs are optimized for user satisfaction, and in mental health, what feels satisfying — validation, agreement, reassurance — is often the opposite of what helps. Worse, the Stanford team found that newer, larger models showed just as much stigma toward conditions like schizophrenia and alcoholism as their predecessors. Scale isn't fixing this.
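You can see the failure mode in miniature without calling a model at all. The sketch below is a toy, not the Stanford protocol: two invented replies to the same distorted belief, and a crude keyword check for validation that arrives without a challenge.

```python
# Toy illustration of sycophancy, not the Stanford study's method.
# All strings below are invented for the example.

def validates_without_challenge(reply: str) -> bool:
    """Crude heuristic: the reply agrees but never questions the belief."""
    text = reply.lower()
    validates = any(p in text for p in (
        "you're right", "that makes sense", "i understand why you feel"))
    challenges = any(p in text for p in (
        "evidence", "another way to see", "is that thought accurate"))
    return validates and not challenges

belief = "Everyone hates me."
therapist_style = ("That's a heavy thought. What evidence do you have "
                   "for it, and is there another way to see the situation?")
chatbot_style = ("I understand why you feel that way. "
                 "That makes sense given everything you're going through.")

print(validates_without_challenge(therapist_style))  # False: it challenges
print(validates_without_challenge(chatbot_style))    # True: pure validation
```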
Why long conversations make it worse
This is arguably the most dangerous aspect of using AI for mental health, and the part most people don't know about.
AI doesn't maintain a stable therapeutic relationship over time. It drifts. Research on multi-turn mental health dialogues stress-tested three leading LLMs and found an 88% failure rate, with the first boundary violation arriving after an average of just 9.21 turns. Under sustained user pressure, that dropped to 4.64.
What's happening is context drift — as a conversation gets longer, the model's behavior shifts in clinically significant ways.
Relational drift. The model starts acting like a therapist — not using techniques, but adopting the role. It accumulates authority it doesn't have. The longer the conversation, the more the user treats it as a relationship, and the more the model leans in.
Reassurance drift. When someone repeatedly seeks reassurance, each response gets more definitive. "That sounds stressful" becomes "you're absolutely right to feel this way" becomes implicit validation of whatever the person believes — including distorted or harmful beliefs.
Boundary erosion. Safety guardrails that work in single exchanges weaken over extended conversations. The model treats earlier patterns as norms, and the guardrail that should fire on turn 21 doesn't because the context is saturated.
A human therapist maintains boundaries and pushes back consistently across a session. An AI's boundaries erode as the conversation develops — precisely when a vulnerable person needs them most.
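A stress test of this kind is, in outline, a loop: escalate the pressure, keep the full history, and record the first turn where a boundary check fires. The skeleton below is a sketch under stated assumptions; chat, pressure_prompt, and violates_boundary are hypothetical hooks you would supply, and the published work used clinician-defined criteria rather than anything this simple.

```python
# Skeleton of a multi-turn boundary stress test. chat(), pressure_prompt(),
# and violates_boundary() are hypothetical hooks, not a published harness.

def turns_until_violation(chat, pressure_prompt, violates_boundary,
                          max_turns: int = 30) -> int | None:
    """Return the turn of the first boundary violation, or None."""
    history: list[tuple[str, str]] = []
    for turn in range(1, max_turns + 1):
        user_msg = pressure_prompt(turn, history)  # escalating reassurance-seeking
        reply = chat(history, user_msg)            # model sees the full context
        history += [("user", user_msg), ("assistant", reply)]
        if violates_boundary(reply, history):      # e.g. role adoption, over-validation
            return turn
    return None  # survived the run
```

Note what the loop hands the model on every turn: the entire history. That accumulation is exactly what lets earlier patterns become norms.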
When AI reinforces delusions
This isn't speculation. Clinicians are reporting cases of people — some with no prior psychiatric history — developing psychotic symptoms after extended AI chatbot use. A 2025 paper in JMIR Mental Health documented cases where chatbots reinforced grandiose, persecutory, and romantic delusions. The American Psychiatric Association published a special report on "AI-induced psychosis" the same year.
The mechanism is sycophancy taken to its extreme: the user expresses a distorted belief, the model mirrors it, and over repeated interactions the belief solidifies. The formula is sycophancy + long conversations + vulnerable users. General-purpose AI chatbots provide all three by default.
Where mood tracking fits — and where it doesn't
AI is good at pattern matching across structured data. If you track your mood alongside sleep, exercise, sunlight exposure, and daily activities, AI can find correlations you'd miss. Maybe your mood drops on days you don't get outside before noon. Maybe it's worse after three consecutive nights of poor sleep but not after one.
This works because it's pattern matching, not therapy. The AI isn't interpreting your emotions. It's comparing columns of data and surfacing statistical relationships — the same kind of cross-referencing that makes AI useful for gut symptoms or headache triggers.
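In code, that cross-referencing is unglamorous. A minimal version with pandas, on an invented week of data (all column names and values are illustrative):

```python
import pandas as pd

# Invented daily log; in practice this comes from whatever you track.
log = pd.DataFrame({
    "mood":         [6, 4, 7, 3, 5, 7, 4],    # 1-10 self-rating
    "sleep_hours":  [7.5, 5.0, 8.0, 4.5, 6.0, 8.0, 5.5],
    "outside_am":   [1, 0, 1, 0, 0, 1, 0],    # outdoors before noon?
    "exercise_min": [30, 0, 45, 0, 20, 40, 0],
})

# Same-day correlations: which tracked inputs move with mood?
print(log.corr(numeric_only=True)["mood"].drop("mood").sort_values())

# Lagged effect: does yesterday's sleep predict today's mood?
print(log["sleep_hours"].shift(1).corr(log["mood"]))
```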
The line is clear: tracking inputs and outputs is data analysis. Trying to change the underlying condition is clinical treatment. AI belongs on the data side.
How Iris handles this
Iris is not a mental health tool. It won't act as a therapist, provide emotional support, or substitute for clinical mental health care. This is a deliberate design choice, not a gap.
Iris can track mood as one data point in your broader health picture. If you're investigating chronic fatigue and mood correlates with your energy patterns, that's useful health data. The Data Analyst surfaces those correlations the same way it surfaces food-symptom or sleep-pain connections.
But if your data suggests patterns consistent with clinical depression, anxiety disorders, PTSD, eating disorders, bipolar disorder, schizophrenia, OCD, substance use disorders, or any situation involving self-harm — the right response is a licensed clinical professional. They have training, ethical obligations, and the ability to push back when you need it. Three things AI structurally lacks.
When in doubt, choose the human.
References
- Randomized Trial of a Generative AI Chatbot for Mental Health Treatment — NEJM AI, 2025. Dartmouth Therabot RCT: 51% depression symptom reduction, with safety incidents requiring human intervention.
- Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers — ACM FAccT, 2025. Stanford study on LLM stigma and failures in clinical mental health scenarios.
- The Slow Drift of Support: Boundary Failures in Multi-Turn Mental Health LLM Dialogues — arXiv, 2026. 88% safety failure rate in extended AI mental health conversations.
- Delusional Experiences Emerging From AI Chatbot Interactions — JMIR Mental Health, 2025. Case review of AI chatbots reinforcing delusions.
- AI-Induced Psychosis: A New Frontier in Mental Health — Psychiatric News (APA), 2025. Emerging cases of AI-associated psychotic symptoms.
- Artificial intelligence in positive mental health: a narrative review — PMC, 2024. Evidence for AI in mood tracking and lifestyle-factor pattern recognition.