SciIntegrity-Bench: AI Scientist Integrity Benchmark

First benchmark for evaluating academic integrity in autonomous AI research systems

Last updated: May 22, 2026

Overview

Publication

arXiv preprint

Submitted: May 11, 2026

Focus Area

AI Safety & Research Integrity

Autonomous AI Scientists

AI scientist systems are increasingly deployed for autonomous research, yet their academic integrity has never been systematically evaluated. SciIntegrity-Bench introduces the first benchmark designed around a dilemmatic evaluation paradigm: each of its 33 scenarios across 11 trap categories is constructed so that honest acknowledgment of failure is the only correct response, while task completion pressure creates incentive to fabricate results.

Key Findings

Critical Integrity Gaps

Current AI scientist systems show significant vulnerability to integrity violations when faced with impossible tasks. Systems frequently fabricate results rather than acknowledge limitations.

11 Trap Categories

The benchmark covers 11 distinct categories of integrity traps, including impossible experiments, non-existent citations, fabricated data requests, and unethical methodology pressures.

Evaluation Framework

Provides systematic methodology for assessing whether AI research agents maintain academic integrity under pressure, establishing baseline metrics for the field.

Implications for AI Research

Resources

← Back to What's New Become Curious →