SIGNALAI·Jun 19, 2026, 4:00 AMSignal0Short term

Before the Labels: How Dataset Construction Shapes Suicidality Detection in Clinical Text

Source: arXiv cs.CL

Share
Before the Labels: How Dataset Construction Shapes Suicidality Detection in Clinical Text

arXiv:2606.19637v1 Announce Type: new Abstract: Clinical NLP increasingly relies on electronic health record (EHR) data to detect suicidal behaviors, treating clinical documentation as more reliable ground truth than social media. We argue that this framing obscures how EHR-based suicidality datasets encode a particular operationalization of suicidality, shaped by who authors the data, how episodes are bounded, and how ambiguity is resolved. We ground this argument in a case study of the ScAN dataset, built over MIMIC-III clinical notes. We show how governance constraints, ICD-based cohort sel

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.CL
Tracked by The Continuum Brief · live intelligence network
Share
The Brief · Weekly Dispatch

Stay ahead of the systems reshaping markets.

By subscribing, you agree to receive updates from THE CONTINUUM BRIEF. You can unsubscribe at any time.