Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts

Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts | Technology News

Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts Anthropic announced the development of a new system on Monday that can protect artificial intelligence (AI) models from jailbreaking attempts. Dubbed Constitutional Classifiers, it is a safeguarding technique that can detect when a jailbreaking attempt is made at the input level and prevent the AI from generating a harmful response as a result of it.

Technology News

Search This Blog

Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts | Technology News

Labels

Comments

Post a Comment

Popular posts from this blog

OpenAI Said to Be Working on Weekly and Lifetime ChatGPT Subscription Plans | Technology News

OpenAI Adds a Library Feature in ChatGPT to Let Users Find Their AI-Generated Images in One Place | Technology News

Amazon Summer Sale 2022 Ends May 8: Best Deals, Offers on Phones, Smart TVs, Laptops Before Sale Ends | Technology News