Large Language Models (LLMs), the technology that powers AI chatbots, have shown they can solve complex tasks. However, they sometimes suffer from hallucinations, a phenomenon in which the results returned are plausible but factually incorrect, which limits their wider use in various industries. Google DeepMind, Google’s AI research arm, has now announced a new model aimed at reducing hallucinations and replying with answers that “surpass the best known results in important problems.”

Called FunSearch, this large model “searches for programs that describe how to solve a problem, rather than what the solution is”, Google explains in a research paper. The company demonstrated this by using the model to discover a solution to a long-standing scientific puzzle, producing verifiable and valuable new information that did not previously exist.

How FunSearch works
FunSearch (short for searching in the function space) pairs a pre-trained LLM with a systematic evaluator. It combines an LLM called Codey, a version of Google’s PaLM 2 that is fine-tuned on computer code, with systems that reject incorrect or nonsensical answers and plug good ones back in.

A second algorithm then checks and scores what Codey comes up with. The best suggestions, even those that are not yet fully correct, are saved and given back to Codey to improve on.

“Many will be nonsensical, some will be sensible, and a few will be truly inspired. You take those truly inspired ones and you say, ‘Okay, take these ones and repeat,’” explained paper coauthor Pushmeet Kohli, vice president of research at Google DeepMind.
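The propose, score, keep and repeat cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not DeepMind's code: the `propose` function is a hypothetical stand-in for Codey that randomly edits one token of an existing program, the task (matching x*x + 1 on a handful of inputs) is an assumption chosen for brevity, and all names are made up for this example.

```python
import random
from typing import Optional, Tuple

# Hypothetical stand-in for the LLM: a real system would prompt a code model.
TOKENS = ["x", "1", "2", "3", "+", "-", "*"]

def propose(parent: str, rng: random.Random) -> str:
    """Return a new candidate by replacing one token of the parent program."""
    parts = parent.split()
    parts[rng.randrange(len(parts))] = rng.choice(TOKENS)
    return " ".join(parts)

def evaluate(program: str) -> Optional[int]:
    """Score a candidate; return None if it is not a valid program."""
    try:
        f = eval(f"lambda x: {program}", {"__builtins__": {}})
        # Toy task (an assumption): how closely does f match x*x + 1?
        return -sum(abs(f(x) - (x * x + 1)) for x in range(-3, 4))
    except Exception:
        return None  # nonsensical candidates are rejected

def search(seed_program: str, rounds: int = 500, keep: int = 5,
           seed: int = 0) -> Tuple[int, str]:
    """Run the loop and return the best (score, program) pair found."""
    rng = random.Random(seed)
    pool = [(evaluate(seed_program), seed_program)]
    for _ in range(rounds):
        _, parent = rng.choice(pool)       # pick a saved suggestion
        child = propose(parent, rng)       # ask the "model" for a variation
        score = evaluate(child)            # check and score it
        if score is None:
            continue
        pool.append((score, child))
        pool = sorted(set(pool), reverse=True)[:keep]  # keep only the best
    return pool[0]

if __name__ == "__main__":
    print(search("x + x + x"))
```

Running the script prints the best program found and its score; a score of 0 would mean the candidate matches the toy target exactly. The point of the sketch is the division of labour: the proposer is allowed to be wrong most of the time because the evaluator filters what gets fed back.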

To test the model, the researchers used FunSearch to approach a math problem: the bin packing problem. The problem involves trying to pack items into as few bins as possible. The researchers said that FunSearch came up with a way to solve it that’s faster than human-devised ones.
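To make the problem concrete, here is a minimal sketch of two classic hand-written bin packing heuristics, first fit and best fit, written as interchangeable scoring functions. It does not reproduce the heuristic FunSearch found; the `pack` function and its `priority` hook are illustrative assumptions that show what kind of small program such a search could vary.

```python
from typing import Callable, Sequence

# (bin index, free space in that bin, item size) -> score; highest score wins.
Priority = Callable[[int, int, int], float]

def pack(items: Sequence[int], capacity: int, priority: Priority) -> int:
    """Pack items one at a time and return the number of bins used."""
    free = []  # remaining space in each open bin
    for size in items:
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity")
        fits = [i for i, space in enumerate(free) if size <= space]
        if fits:
            best = max(fits, key=lambda i: priority(i, free[i], size))
            free[best] -= size
        else:
            free.append(capacity - size)  # open a new bin
    return len(free)

def first_fit(index: int, space: int, size: int) -> float:
    """Prefer the earliest-opened bin that has room."""
    return -index

def best_fit(index: int, space: int, size: int) -> float:
    """Prefer the bin that would be left with the least free space."""
    return -(space - size)

if __name__ == "__main__":
    items = [5, 7, 3, 5]
    print(pack(items, 10, first_fit))  # prints 3
    print(pack(items, 10, best_fit))   # prints 2
```

On this small input, best fit uses two bins where first fit needs three, because it places the item of size 3 alongside the 7 instead of the 5. A different `priority` function changes the result without touching the packing loop, which is why searching over such functions is a natural fit for this problem.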

How FunSearch may be used in future
Because of hallucinations, LLMs have seen little use in scientific discovery or in solving difficult math problems. FunSearch may provide a gateway to deploying them in real-world applications.

