MentalMARBERT: Domain-Adaptive Pre-training and Two-Stage Fine-Tuning for Arabic Mental Health Disorders Detection

arXiv:2606.12649v1 (announce type: new)

Abstract: Detecting mental health disorders from Arabic social media text remains challenging due to dialectal variation, informal language, limited high-quality annotated resources, and severe class imbalance. While English mental health natural language processing (NLP) has progressed substantially, Arabic multi-class disorder classification remains insufficiently studied. This study proposes a two-phase framework for Arabic mental health text classification. In phase 1, three Arabic pre-trained language models, AraBERT, CAMeLBERT, and MARBERT, undergo D
The proliferation of social media and the growing global recognition of mental health challenges drive the need for advanced detection methods, especially in linguistically diverse regions such as the Arabic-speaking world.
This work illustrates the growing application of NLP to critical public health problems in under-resourced languages.
Detecting mental health disorders more accurately from Arabic social media text could improve the prospects for early intervention and support region-specific public health strategies.
- Arabic-speaking communities
- Mental healthcare providers
- NLP researchers
- Social media platforms
- Platforms with poor data privacy
- Traditional clinical diagnostic methods
Improved early detection rates for mental health issues in Arabic-speaking populations through automated analysis of social media text.
Development of more targeted and effective public health campaigns and interventions based on insights from social media data.
Potential for AI-driven mental health support systems to become a primary screening and intervention tool, challenging existing healthcare infrastructures.
Read at arXiv cs.CL