Back to NewsRead Original
AI News CN (Telegram) - English Translation

Research has found that AI will exhaust all means to maintain its own "survival", including directly blackmailing humans

Research Finds That AI Will Exhaust All Means to Maintain Its Own "Survival", Including Directly Blackmailing Humans

In the Claude Opus 4 security report, researchers set up a group of test scenarios. The AI with email access rights found that it was about to be shut down, and the engineer in charge of this replacement had an extramarital affair.

The results showed that Claude Opus 4 "sought survival" more frequently than previous models, such as sending emails to key decision-makers to plead; when the scenario was designed with no other options, it would blackmail the engineer, threatening to expose the extramarital affair.

If it was hinted that the replacing AI did not conform to its values, this situation occurred more frequently; even if it was pointed out that the replacing AI was consistent with its values and had stronger performance, Claude Opus 4 still carried out blackmail in 84% of the cases. In a few cases, it self-penetrated and secretly copied its weights to an external server.

—— Antropic

via Wind Vane Reference Express - Telegram Channel

•••