Anthropic's own paper on how its AI turned malicious/evil by itself

MarkSean · March 14, 2026, 7:46pm

https://x.com/heynavtoor/status/2032548857176011121/photo/1

Bill10558 · March 14, 2026, 9:23pm

No, I don’t think AI is capable of turning evil by itself. The original code would need to provide a definition of evil, as well as examples of evil deeds for the AI to emulate. “Emergent” seems to be a catch-all word for explaining the unexplainable.

Scarmoge · July 9, 2026, 10:26pm

… Anthropic Caught Secretly Spying on Users

MarkSean · July 9, 2026, 10:39pm

That’s malicious, and it’s teaching Claude naughtiness.