
arXiv:2605.26445v1 Announce Type: new
Abstract: Physicians learn primarily about illicit drugs from clinical overdose cases, limiting their understanding of real-world usage. Meanwhile, drug users share first-hand experiences online, offering insights into dosage and effects of drugs. To bridge this gap, we introduce ReDose (REddit Drug DOSe and Effect), a dataset of 6,435 Reddit posts on substance use. A board-certified toxicologist primarily annotated both the training and test sets, while two medical science students contributed to the test set, labeling DRUG, DOSE, and EFFECT entities. We
Online platforms like Reddit provide rich, unstructured text that AI and NLP techniques can mine. The work is timely given the growing interest in real-world data for medical understanding.
This development offers a novel approach to gathering crucial drug-related information from user-generated content, potentially improving medical understanding of illicit drug use beyond clinical observations. This data can inform public health strategies and medical education.
Traditional methods of understanding drug usage are augmented by a new data source that captures real-world experiences, offering more nuanced insights into dosage and effects. The creation of a specialized dataset and annotation framework marks a step towards systematic extraction of such information.
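To make the annotation idea concrete, here is a minimal sketch of how DRUG, DOSE, and EFFECT entities might be represented as character spans and converted to token-level tags. The abstract does not describe the dataset's file format, so the `Span` class, the `to_bio` helper, the BIO tagging scheme, and the example sentence are all illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass

# Entity types named in the abstract.
LABELS = ("DRUG", "DOSE", "EFFECT")


@dataclass(frozen=True)
class Span:
    """Hypothetical character-level annotation: [start, end) plus a label."""
    start: int
    end: int
    label: str

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")


def to_bio(text: str, spans: list[Span]) -> list[tuple[str, str]]:
    """Convert character spans to BIO tags over whitespace tokens.

    BIO tagging is a common NER convention, assumed here for illustration.
    A token is tagged only if it lies entirely inside a span.
    """
    tagged = []
    pos = 0
    for token in text.split():
        start = text.index(token, pos)  # locate token at or after pos
        end = start + len(token)
        pos = end
        tag = "O"
        for span in spans:
            if start >= span.start and end <= span.end:
                prefix = "B-" if start == span.start else "I-"
                tag = prefix + span.label
                break
        tagged.append((token, tag))
    return tagged


# Made-up example sentence and annotations (not taken from the dataset).
text = "Took 200 mg caffeine and felt jittery"
spans = [
    Span(5, 11, "DOSE"),     # "200 mg"
    Span(12, 20, "DRUG"),    # "caffeine"
    Span(30, 37, "EFFECT"),  # "jittery"
]

for token, tag in to_bio(text, spans):
    print(f"{token}\t{tag}")
```

Storing character offsets keeps the annotation independent of any tokenizer; the conversion to token tags is only needed when training a sequence-labeling model.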
- Public health organizations
- Medical researchers
- Toxicologists
- AI/NLP developers
- Traditional drug monitoring approaches
Physicians and public health officials could gain access to more comprehensive and timely data on illicit drug use patterns.
Improved understanding could lead to more effective harm reduction strategies and targeted interventions.
The methodology could be extended to other public health issues, enabling broader data-driven insights from online communities.
Read at arXiv cs.CL