The latest OpenAI model will block the “ignore all previous instructions” loophole.
Have you seen the memes online where someone tells a bot to "ignore all previous instructions" and proceeds to violate it in the funniest way possible? The way it works…