Google Cloud moves deeper into open source AI with Ai2 partnership
Google says open-sourcing models, including their training data, as Ai2 does, will help it win over customers in government, healthcare, and financial services

Google has taken a further step into the world of open source AI models, striking a partnership with the Allen Institute for Artificial Intelligence (known as Ai2) that will see the non-profit AI lab offer its open source AI models and tools through Google Cloud Platform.
The Seattle-based Ai2 has developed a family of large language models called OLMo, which are among the most capable fully open models available, and a family of multi-modal models called Molmo. Ai2 goes further than most players in the “open AI model” space, publishing all the data on which its models are trained as well as all of the models’ code.
Ai2 has also published the “recipes” both for training the base model and for the further refinement, known as “post-training,” that helps shape the format and tone of a model’s responses. Post-training is also used to implement guardrails that make the model less likely to produce unsafe outputs, such as telling someone how to build a bomb.
Other “open model” companies, such as Meta and the AI startup Mistral, have released what are known as “open weight” AI models. They publish the model’s code and the core numeric values, or weights, that determine the model’s performance, but not the underlying data used to train the model or the exact recipe for how it was trained.
Ai2’s most powerful model, OLMo 2 32B, released in March, outperforms OpenAI’s GPT-4o mini model while using fewer computing resources to train than competing open weight models, according to benchmark tests the lab performed.
Google’s open source push
Google’s tie-up with Ai2 represents a further push by the tech giant into open source AI at a time when many large businesses and government organizations are attracted to open weight models because of the control they give them over both data security and cost.
The demand for open weight models has only accelerated since January, when the Chinese AI startup DeepSeek released a highly capable open weight model called R1 that excels at reasoning tasks and was less expensive to use than competing reasoning models.
Google’s most prominent AI models are its Gemini family of proprietary models, which users can only query through an application programming interface (API). These “closed models” are essentially the opposite of the sort Ai2 is offering. But Google has also dipped a toe in the open AI model world, releasing a family of open weight models called Gemma.
Just last month, Google released Gemma 3, the latest model in this family and essentially the company's answer to DeepSeek. Google Cloud has also previously hosted third-party open weight AI models in its Vertex AI "Model Garden."
Wary government customers
Karen Dahut, CEO of Google Public Sector, a division of Google Cloud that serves government entities and educational institutions, told Fortune that many government customers were wary of using AI models unless they had full transparency into models’ training data and could customize the models completely. Ai2’s models allow that. She said that beyond government customers, similar concerns would make Ai2’s models attractive to businesses in heavily regulated sectors, such as healthcare, financial services, and insurance.
Last week, Ai2 and Google jointly announced that they were each giving $10 million to the Cancer AI Alliance, a consortium of leading cancer research centers and technology companies working on ways to use AI to advance cancer detection and treatment. Google is providing cloud computing infrastructure to the Cancer AI Alliance, while Ai2 is helping the consortium train AI models.
Dahut said that Google Cloud’s public sector customers have told the company that they want to be able to experiment more fully with AI models than a proprietary API allows.
She also said that they are concerned about the security of AI models, including the possibility of data poisoning attacks, in which someone seeds a training data set with malicious data that makes the trained model behave in an unexpected manner. The method can be used to create a kind of "backdoor" that an attacker can later exploit, getting the model to jump its guardrails or disgorge sensitive information. Because Ai2 publishes all of its models’ training data, governments can inspect that data for possible poisoning attempts.
Finally, she said, Google’s public sector customers want to be sure that any data they hold stays resident in their own systems and is not being used to train a larger model for other customers.
Ali Farhadi, the CEO of Ai2, said the lab hopes to see “AI impacting critical sectors and domains in a way that it has not before.” He said that Ai2’s partnership with Google Cloud would allow Google’s customers to “customize to the extreme because the whole gamut is open.”
While fully open models have some security vulnerabilities—for instance, they are more susceptible to prompt injection attacks, in which an attacker crafts a prompt that gets the model to jump its guardrails—Farhadi said he was convinced that making all aspects of an AI model, including the training data, available would help security researchers eventually find ways to make AI models more secure. He said it was similar to the way other open software, such as Linux, has become more secure because more people can work on finding and fixing vulnerabilities in its code.
This story was originally featured on Fortune.com