A few days after Sam Altman was rehired as OpenAI CEO, the company announced that a new board would be constituted. About three weeks later, on Monday (December 18), it laid out a framework to address safety in its most advanced models, which includes allowing the board to reverse safety decisions.

This means the board can use its veto power to shelve any model seen as potentially harmful to humanity, such as the much-reported Project Q*.

“We need to approach AI safety from first principles, using AI itself to solve AI safety challenges and building general solutions for categories of problems,” the company said.

The Microsoft-backed company said that it will deploy its latest technology only if it is deemed safe in specific areas such as cybersecurity and nuclear threats.

OpenAI’s Safety Systems team
The company is also creating an advisory group, called the Safety Systems team, that will review safety reports and send them to the company’s executives and board. While executives will make decisions, the board can reverse them.

“The Safety Systems team is dedicated to ensuring the safety, robustness, and reliability of AI models and their deployment in the real world,” OpenAI added.

Safety Systems consists of four sub-teams
OpenAI said that this main team will have four sub-teams, which include experts in engineering, research, policy, human-AI collaboration and product management.

Safety Engineering: This team implements system-level mitigations in products, builds a secure, privacy-aware, centralised safety service infrastructure, and creates ML-centric tooling for investigation and enforcement at scale.

Model Safety Research: This team will advance OpenAI’s capabilities for precisely implementing robust, safe behaviour in its models.

Safety Reasoning Research: This team will detect and understand risks, both known and unknown, to guide the design of default safe model behaviour and mitigations. It will work towards that goal by building better safety and ethical reasoning skills into the foundation model.

Human-AI Interaction: Finally, this team will take care of policy, which is the “interface for aligning model behaviour with desired human values and we co-design policy with models and for models, and thus policies can be directly plugged into our safety systems.”

(With agency inputs)

