Skip to main content

Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts | Technology News

Anthropic Developing Constitutional Classifiers to Safeguard AI Models From Jailbreak Attempts Anthropic announced the development of a new system on Monday that can protect artificial intelligence (AI) models from jailbreaking attempts. Dubbed Constitutional Classifiers, it is a safeguarding technique that can detect when a jailbreaking attempt is made at the input level and prevent the AI from generating a harmful response as a result of it.

Comments

Popular posts from this blog

OpenAI Might Have Briefly Added New Custom Instruction Options to ChatGPT | Technology News

OpenAI Might Have Briefly Added New Custom Instruction Options to ChatGPT OpenAI might have added several new options to its Custom Instructions feature for ChatGPT on Thursday. Several netizens shared screenshots of these new options in custom instructions that allow users to further personalise the responses generated by ChatGPT. These new options include options to add the user’s nickname, profession, as well as personality traits.

OpenAI Improves File Search Controls for Developers, Said to Improve ChatGPT Responses | Technology News

OpenAI Improves File Search Controls for Developers, Said to Improve ChatGPT Responses OpenAI announced new changes to its File Search system last week, allowing more control to developers when asking the artificial intelligence (AI) chatbots to pick responses. The improvement has been made to the ChatGPT’s application programming interface (API) and will let developers not only check the behaviour of the chatbot’s response retrieval method, it also...