What are open-weights AI models?

Open-weights AI models are models whose weights[1] have been published under some type of open license. Prominent examples include Meta AI's Llama models and some of Mistral AI's models. They are typically contrasted with proprietary models that offer only API access, such as the recent models from OpenAI[2] and Anthropic[3].

Are open-weights models open source?

There is some debate[4] over which open-weights models (if any) conform to the definition and principles of the Open Source movement, and thus whether such models should be called "open source".

Open-source software is generally perceived positively by programmers because of its advantages over proprietary software.

There are a few ways that open-weights models differ from open-source software. Open-source projects have their source code exposed in a way that allows individual developers to find bugs and modify the project for their own use. Developers can in principle access and modify the weights of open-weights models; however, because these weights are inscrutable, they cannot be inspected for bugs or usefully modified by hand. In that way, open-weights models are more similar to an opaque compiled binary file than to legible source code. Furthermore, open-weights models generally don’t include the training data or procedure used to produce their weights.
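To make the "opaque compiled binary" analogy concrete, here is a minimal sketch (using a hypothetical stand-in tensor, not weights from any real model) of what a released weight file actually contains: a large array of floating-point numbers that, unlike source code, cannot be meaningfully read or hand-edited.

```python
import numpy as np

# Stand-in for one layer of a released open-weights model: a 4096x4096
# weight matrix (~16.8 million parameters). Real models contain hundreds
# of such tensors.
rng = np.random.default_rng(0)
weights = rng.normal(loc=0.0, scale=0.02, size=(4096, 4096)).astype(np.float32)

print(weights.shape)   # (4096, 4096)
print(weights.size)    # 16777216 parameters in this one tensor alone
print(weights[0, :3])  # a few raw values -- meaningless in isolation
```

Each individual number here is perfectly accessible, yet inspecting them tells you nothing about the model's behavior, which is why "open weights" gives far less insight than open source code does.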

On the flip side, open-weights models can be fine-tuned to better serve specific uses or to improve their performance. They also make it easier for independent researchers to conduct safety research and analysis than proprietary models do. And finally, they are free.[5]
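The fine-tuning mentioned above can be illustrated with a deliberately tiny toy model (this is a sketch of the principle, not how LLM fine-tuning is actually implemented): we start from "pretrained" weights and take a few gradient steps on new task data, nudging the existing weights rather than training from scratch.

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend these weights came from a released open-weights model.
pretrained_w = np.array([1.0, -2.0, 0.5])

# New task: data generated by a slightly different weight vector.
target_w = np.array([1.2, -1.8, 0.4])
X = rng.normal(size=(200, 3))
y = X @ target_w

# "Fine-tune" with plain gradient descent on mean squared error.
w = pretrained_w.copy()
lr = 0.05
for _ in range(100):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # d(MSE)/dw
    w -= lr * grad

print(np.round(w, 2))  # ends up close to target_w
```

Real fine-tuning of open-weights LLMs applies this same idea at vastly larger scale, which is both what makes the models adaptable and, as discussed below, what makes their safety training easy to undo.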

Some models are more open than others

To complicate things further, as with open-source software, some open-weights models are more "open" than others, for example through differing licensing restrictions. For instance, the Llama 2 and Llama 3 licenses did not allow using their outputs to train other models (distillation), whereas Llama 3.1's license permits this, and Mistral's Mixtral models were released under the permissive Apache 2.0 license.

Furthermore, models may differ in the openness of their training data and procedures. Earlier models such as BERT were trained on public datasets, while most modern models use proprietary ones. Similarly, the Llama family's training procedures were described in detail, whereas Mistral AI released only some of the parameters they used for training.

In summary, open-weights models share some open-source virtues like adaptability and free access, while also differing from standard open-source code in their inscrutability, lack of training transparency, and limited editability. Despite these differences, they have generally been well received by the open-source community.

Is releasing model weights good for AI safety?

People in the AI safety community have differing intuitions regarding whether releasing model weights is positive or negative for safety.

On the positive side, releasing open-weights models allows:

  • Experimentation to better understand AI models and their properties in general, which might be necessary for some alignment approaches, such as mechanistic interpretability

  • Public scrutiny of the models

  • Alignment research such as red-teaming by independent researchers[6]

  • Testing of the models' safety properties

State-of-the-art open-weights models also weaken the incentive to race, in much the same way that freely available content reduces the value of intellectual property.

On the negative side, such models are particularly susceptible to misuse, since once they are released, access to them can never be restricted again. While most frontier open-weights models are released with some fine-tuning that aims to prevent misuse, this fine-tuning is cheap and easy to remove. It could turn out that future models are so capable at a variety of dangerous tasks that they should simply not be released.[7] Additionally, publishing open-weights models, and especially fully open-source models with training methodologies, could accelerate AI capabilities advancement by revealing valuable techniques and architectures that competitors can build upon, potentially leading to faster overall AI development.

Further reading:


  1. Strictly speaking, open-weights models include more than the weights; they include all the parameters needed to independently run the model.

  2. The ironically named OpenAI now mostly releases closed-source models, but it has also released some open-weights models, such as Whisper.

  3. Google DeepMind has released the weights of some models and provides closed-source models through API access.

  4. The Open Source Initiative (which controls the most commonly accepted standard for the term "Open Source") has criticized Meta for its use of the term. As of mid-2024, this group is working on a definition of "Open Source AI".

  5. Arguably, this can be a downside, since the creators of open-source software might be unpaid.

  6. Red-teaming can also be done on closed-source models through API calls, but this can be more limited for independent researchers.

  7. This also applies to models gated through API access, although the latter is much easier to monitor.



AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.