SIGNAL · AI · Jun 2, 2026, 4:00 AM · Signal 75 · Short term

Why Do Self-Harm Prediction Models Struggle to Generalise? Lexical and Semantic Variations in Emergency Department Triage Notes

arXiv:2606.01678v1. Abstract: Self-harm presentations to emergency departments (EDs) are strongly associated with higher suicide risk. NLP models have shown robust performance in detecting self-harm from triage notes within single hospitals, yet performance often declines across institutions. To examine potential causes, we compare ED triage notes from two hospitals by analyzing lexical characteristics, highly associated predictive features, and salient topics. Our results reveal variation in lexical expression and feature importance related to self-harm across hospitals, des

Why this matters

Why now

The proliferation of AI models in healthcare, particularly in sensitive areas like mental health, is revealing inherent challenges in their real-world deployment and generalizability across diverse data environments.

Why it’s important

This research examines why models trained on one hospital's data tend to lose performance at other institutions, a limitation with significant implications for the scalability and reliability of AI deployment in healthcare.

What changes

AI models for sensitive applications such as self-harm prediction are now understood to need more robust, generalizable training data, along with methods that account for lexical and semantic variation across institutions.
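
To make "lexical variation across institutions" concrete, here is a minimal sketch of two simple checks one could run before moving a text model between sites: vocabulary overlap (Jaccard) and the share of target-site tokens never seen at the training site. This is an illustration only, not the method used in the paper. The function names, the tokenizer, and the toy notes (`site_a`, `site_b`) are all hypothetical, and the notes are deliberately generic rather than drawn from any real data.

```python
import re
from collections import Counter


def tokenize(note):
    """Lowercase a note and split it into alphanumeric tokens (a crude, assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", note.lower())


def vocabulary(notes):
    """Count token occurrences across all notes from one site."""
    counts = Counter()
    for note in notes:
        counts.update(tokenize(note))
    return counts


def jaccard_overlap(vocab_a, vocab_b):
    """Share of distinct tokens that appear at both sites (1.0 means identical vocabularies)."""
    a, b = set(vocab_a), set(vocab_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def out_of_vocabulary_rate(train_vocab, target_notes):
    """Fraction of target-site tokens that never occur in the training-site vocabulary."""
    tokens = [t for note in target_notes for t in tokenize(note)]
    if not tokens:
        return 0.0
    unseen = sum(1 for t in tokens if t not in train_vocab)
    return unseen / len(tokens)


# Hypothetical toy notes: the same events written in two different house styles.
site_a = ["pt presents with chest pain", "pt brought in by ambulance after fall"]
site_b = ["patient c/o chest pain", "patient bib ambulance post fall"]

vocab_a = vocabulary(site_a)
vocab_b = vocabulary(site_b)

print(f"Jaccard overlap: {jaccard_overlap(vocab_a, vocab_b):.2f}")
print(f"OOV rate at site B: {out_of_vocabulary_rate(vocab_a, site_b):.2f}")
```

In this toy case the two sites describe the same events but share only a quarter of their distinct tokens, which is the kind of gap that can degrade a model relying on site-specific wording. A real analysis would need a clinical-text tokenizer, abbreviation handling, and far more data.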

Winners

· AI model generalizability researchers
· Healthcare institutions with diverse data sets
· Patients benefiting from more robust AI predictions

Losers

· AI model developers ignoring data heterogeneity
· Healthcare systems relying on single-source AI solutions
· Patients harmed by non-generalizable AI predictions

Second-order effects

Direct

Increased focus on federated learning and transfer learning techniques in healthcare AI.

Second

Development of industry standards for bias detection and generalizability testing for healthcare AI models.

Third

Healthcare AI regulation may come to mandate multi-institutional validation and explicit accounting for lexical variation.

Editorial confidence: 85 / 100 · Structural impact: 55 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.CL

#cs.CL

Tracked by The Continuum Brief · live intelligence network
