
arXiv:2605.22204v1. Abstract: This paper presents the Arabic Women and Society Corpus, a ten-year collection of 252,487 public Arabic Facebook posts related to women's empowerment and social wellbeing. The corpus was collected from 51,660 pages across 77 countries between 2013 and 2024, resulting in more than 267 million user interactions. Each post includes engagement metrics such as shares, comments, and emotional reactions, providing a unique view of audience sentiment and social attention. The data were processed using an automated pipeline with language identification, n
The proliferation of social media data and advancements in AI for language processing enable the creation of large-scale linguistic corpora for nuanced social analysis.
This corpus offers a unique dataset for understanding audience engagement and sentiment on critical social issues for women in Arabic-speaking regions, informing policy and cultural initiatives.
Researchers and policymakers now have access to a rich, longitudinal dataset capable of quantifying social discourse and sentiment around women's empowerment in a key global region.
- Social researchers
- NGOs focused on women's rights
- AI language model developers
- Cultural analysts
- Entities ignoring public sentiment
- Legacy qualitative research methods
The corpus provides detailed engagement metrics for specific social topics.
Analysis of this data could reveal previously unknown trends in social attention and public opinion on women's issues.
These insights could inform more effective digital campaigns and policy interventions targeting women's empowerment in the region.
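As a rough illustration of what per-topic engagement analysis might look like, the sketch below sums shares, comments, and reactions for each topic. The record layout, the field names (`topic`, `shares`, `comments`, `reactions`), and the toy values are all assumptions made for this example; the corpus's actual schema is not described here.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only,
# not the corpus's actual schema or data.
posts = [
    {"topic": "education", "shares": 12, "comments": 5,
     "reactions": {"like": 40, "love": 8}},
    {"topic": "education", "shares": 3, "comments": 1,
     "reactions": {"like": 10, "angry": 2}},
    {"topic": "employment", "shares": 7, "comments": 9,
     "reactions": {"like": 25, "sad": 4}},
]


def engagement_by_topic(records):
    """Sum post counts, shares, comments, and reactions per topic."""
    totals = defaultdict(
        lambda: {"posts": 0, "shares": 0, "comments": 0, "reactions": 0}
    )
    for post in records:
        bucket = totals[post["topic"]]
        bucket["posts"] += 1
        bucket["shares"] += post["shares"]
        bucket["comments"] += post["comments"]
        # Collapse all reaction types into a single count for this sketch.
        bucket["reactions"] += sum(post["reactions"].values())
    return dict(totals)


summary = engagement_by_topic(posts)
for topic, stats in sorted(summary.items()):
    print(topic, stats)
```

Keeping reaction types separate rather than summing them would be the natural next step for anyone interested in sentiment rather than raw attention.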
Read at arXiv cs.CL