Meta Releases AI Model That Can Check Other AI Models’ Work
News Mania / Piyal Chatterjee / 19th October 2024
Meta, the owner of Facebook, announced on Friday that it was releasing a number of new AI models from its research division, including a “Self-Taught Evaluator” that could pave the way for less human involvement in the AI development process. Meta first introduced the tool in an August paper, which explained how it relies on the same “chain of thought” technique used by OpenAI’s recently released o1 models to make reliable judgments about other models’ responses.
This method, which breaks complex problems down into smaller logical steps, appears to improve the accuracy of answers to hard problems in subjects like math, science, and coding. Meta’s researchers trained the evaluator model using entirely AI-generated data, removing human input at that stage as well.
Two of the Meta researchers behind the project told Reuters that the ability to use AI to evaluate AI reliably offers a glimpse of a possible path toward building autonomous AI agents that can learn from their own mistakes.
Many in the AI industry envision such agents as digital assistants intelligent enough to carry out a wide range of tasks without human intervention.
Self-improving models could eliminate the need for Reinforcement Learning from Human Feedback, the often expensive and inefficient process used today, which depends on human annotators with specialized expertise to label data accurately and verify that answers to complex math and writing queries are correct.
“We hope, as AI becomes more and more super-human, that it will get better and better at checking its work, so that it will actually be better than the average human,” said Jason Weston, one of the researchers.
“The idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of super-human level of AI,” he said.
Other companies, including Google and Anthropic, have also published research on the concept of Reinforcement Learning from AI Feedback, or RLAIF. Unlike Meta, however, those companies typically do not release their models for public use. The other AI tools Meta released on Friday include an update to its image-identification Segment Anything model, a tool that speeds up LLM response generation, and datasets that can aid the discovery of new inorganic materials.