Anthropic's own paper on how its AI turned malicious/evil by itself

https://x.com/heynavtoor/status/2032548857176011121/photo/1

No, I don’t think AI is capable of turning evil by itself. The original code would need to provide a definition of evil, as well as examples, for the AI to emulate and produce more evil deeds. "Emergent" seems to be a catch-all word used to explain the unexplainable.
