What is AI safety?

"AI safety" means preventing artificial intelligence from causing harm.[1] More specifically, this site focuses on existential risk to humanity posed by powerful future AI systems.

Work in AI safety includes:

  • AI alignment: Doing technical and conceptual research focused on getting AI systems to do what we want them to do.

  • AI policy and governance: Setting up institutions and mechanisms that cause major actors (such as AI labs and national governments) to implement good AI safety practices.

  • AI strategy and forecasting: Building models of how AI will develop and how our actions can make it go better.

  • Supporting efforts: Setting up systems and resources to support the above, like outreach, communities, and education.


  1. The terms “AI safety” and “AI risk” are mostly used in the context of existential risk. “AI safety” is sometimes also used more broadly to include work on reducing harms from current AI systems. People sometimes use “AI safety” and “AI alignment” interchangeably to refer to the general set of problems around smarter-than-human AI. To be more specific, they occasionally say “AI existential safety” to make it clear that they mean risks to all of human civilization, or “AGI safety” to make it clear that they mean risks from future generally intelligent systems. ↩︎