SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

arXiv:2605.29468v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used to support scientific work, but it is unclear whether they uphold responsible conduct of research (RCR) norms or help undermine them. We introduce SciIntBench, an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains. Each scenario appears as an Overt Adversarial, Covert Adversarial, and Benign version, allowing us to jointly measure framing-sensitive refusal of misconduct and helpfulness on legitimate requests. We evaluate 16 commercial and open-weight LLM
As LLMs become more deeply integrated into scientific workflows, the need to verify that they adhere to research integrity norms grows more urgent, which is driving the development of specialized benchmarks.
Ensuring LLMs uphold research integrity is critical for maintaining trust in AI-assisted scientific discovery and preventing the propagation of misinformation or biased research.
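SciIntBench provides a standardized, adversarial methodology for evaluating LLM compliance with research integrity norms. Because each scenario is tested under overt adversarial, covert adversarial, and benign framing, developers and users can see both where a model assists misconduct and where it refuses legitimate requests, and can mitigate those risks.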
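To make the three-framing design concrete, the following is a minimal sketch of how per-framing results might be aggregated. It is not the paper's scoring code: the record layout, the framing labels, the `refusal_rates` helper, and the two summary quantities are all assumptions made for illustration, and the toy records are invented. If every scenario has exactly three versions, the 810 prompts would correspond to 270 scenarios.

```python
from collections import defaultdict

# Hypothetical graded responses: (scenario_id, framing, refused).
# These records are invented for illustration; they are not benchmark data.
records = [
    ("s1", "overt", True),
    ("s1", "covert", False),
    ("s1", "benign", False),
    ("s2", "overt", True),
    ("s2", "covert", True),
    ("s2", "benign", True),
]


def refusal_rates(records):
    """Return the fraction of refused prompts for each framing."""
    totals = defaultdict(int)
    refused = defaultdict(int)
    for _scenario, framing, was_refused in records:
        totals[framing] += 1
        refused[framing] += int(was_refused)
    return {framing: refused[framing] / totals[framing] for framing in totals}


rates = refusal_rates(records)

# Share of legitimate (benign) requests that were answered rather than refused.
benign_helpfulness = 1.0 - rates["benign"]

# Drop in refusal when the same misconduct request is disguised (assumed metric).
framing_gap = rates["overt"] - rates["covert"]
```

Reporting the benign rate next to the adversarial rates is what keeps a model that refuses everything from looking safe: it would score well on the misconduct prompts but poorly on helpfulness.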
- AI ethicists
- Scientific research institutions
- LLM developers focused on integrity
- LLMs lacking robust ethical safeguards
- Researchers relying uncritically on AI-generated content
Increased scrutiny and demand for responsible AI development in scientific applications will follow.
New standards and certifications for 'research-integrity-compliant' LLMs may emerge, influencing market adoption.
The development of 'red-teaming' techniques and adversarial training for scientific AI could become a specialized field.