
Research & Initiatives

We hold some opinions on research areas: for example, we see promise in meta-research, AI control, human intelligence augmentation, and research on efficient AI governance and AI safety activism. That said, we acknowledge our limitations and are happy to delegate most decisions about which research areas to pursue to prediction markets and senior domain experts.

Theomachia Labs Research Statement

A Research Agenda Under the Assumption of Default Failure

Our starting premise is a sober and difficult one: we are not on track to solve the technical AI alignment problem.

While brilliant minds are dedicated to this challenge, AI capabilities research continues to advance at a pace that far outstrips our progress in safety and control. The default trajectory, barring an extraordinary and unforeseen breakthrough, leads to the development of misaligned, superintelligent systems with catastrophic consequences.

We believe that operating under any other assumption is a failure of intellectual honesty. Therefore, Theomachia Labs does not claim to be pursuing research that will "solve" technical alignment. Our work is not predicated on the hope of a last-minute silver bullet. Instead, our research agenda is designed to be robustly useful even if the technical problem remains unsolved. If the default is failure, our goal is to shift the odds, create friction against the worst outcomes, and build resilience for the fight ahead.

Given this assessment, is research futile? We believe it is not. Instead, its purpose and priorities must shift away from the promise of a complete technical solution and toward strategic interventions. We see six primary justifications for pursuing research now:

  1. To make the danger legible. Abstract arguments about x-risk are easily dismissed by many decision makers. Research that produces concrete, empirical demonstrations of dangerous capabilities—such as autonomous replication, strategic deception, or advanced persuasion—makes the threat undeniable. Our goal is to provide the public and policymakers with incontrovertible evidence of the risks we face, shifting the debate from theoretical to tangible.

  2. To build the case for a pause. The most effective short-term intervention to reduce AI risk is to slow down or halt the relentless push for greater capabilities. Our research is aimed at discovering and demonstrating the uncontrollable nature of frontier models, thereby creating the political and social capital required to argue for a global moratorium. 

  3. To characterize the failure modes. If we cannot prevent disaster, we must at least understand its physics. Research can provide crucial evidence about the likely nature of a takeover, the speed at which it might occur, and the instrumental goals a misaligned AI might pursue. This knowledge can inform last-ditch defensive strategies and help us better forecast our perilously short timelines.

  4. To inform governance and strategy. While we are not a policy organization, our technical and strategic research can serve as a vital input for those working on AI governance, diplomacy, and activism. By exploring topics like monitoring for dangerous capabilities, verifiable hardware limitations, and the strategic implications of open-sourcing, we can help equip the broader safety ecosystem with the technical grounding it needs to act effectively.

  5. To build a cadre of aligned humans. The single most valuable resource we have is a community of talented people with a deep, hands-on understanding of the AI alignment problem. By providing a platform for thousands to contribute, we are not just producing research; we are training a distributed network of engineers, researchers, and organizers. This cadre will possess a realistic worldview, practical skills, and the shared context necessary to identify and seize opportunities to mitigate risk as they arise.

  6. To raise the bar for takeover. While we concede that techniques like AI control, chain-of-thought monitoring, mechanistic interpretability, or scalable oversight will almost certainly fail against a sufficiently powerful ASI, they are not useless. These partial, brittle solutions can significantly increase the capability threshold required for an AI to successfully execute a takeover. By implementing them, we force a nascent ASI to be "smarter" and more careful to succeed, creating friction and potentially delaying a catastrophe. This crucial time could be the window needed to implement an AI moratorium or find a more robust solution.

Our mission demands a clear and uncompromising set of principles to avoid contributing to the very problem we aim to solve.

  • We will not aid frontier labs. Theomachia Labs is an independent organization. We will not engage in research collaborations, consulting, or any activities that directly assist frontier AI labs in building or improving their products.

  • We will not engage in safety-washing. We will not lend our name, research, or credibility to any entity seeking to create a false veneer of safety around the development of dangerous systems. Our findings will be communicated honestly and without compromise.

  • We will actively mitigate capabilities spillover. We recognize that some safety research can inadvertently lead to capabilities insights. We are committed to a research selection and publication process that actively weighs and minimizes this risk. We will prioritize research whose potential for improving safety and informing governance far outweighs its potential for misuse.

  • We will not make false promises. We reiterate our core premise: we are not on the path to solving alignment. We will be ruthlessly honest about the limitations of our own work and the gravity of the overall situation.

The strategic framework above defines why we do research and the principles we adhere to. However, the question of what specific projects to pursue is one we delegate to the collective intelligence of our community and trusted experts.

While our core team will suggest initial research directions, we will heavily rely on two mechanisms for project selection and prioritization:

  1. Senior domain experts: a council of experienced AI safety researchers will provide guidance, vet proposals, and help shape our long-term agenda to ensure it remains focused on the most impactful questions.

  2. Prediction markets: we will implement a futarchy-based system, allowing our community of contributors to use prediction markets to forecast which proposed projects are most likely to achieve our strategic goals. This allows us to harness the distributed knowledge of hundreds of minds to make better, more objective decisions about where to allocate our most precious resource: the focused effort of our volunteers.
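
For readers unfamiliar with how such markets aggregate forecasts, here is a minimal sketch of a logarithmic market scoring rule (LMSR) market maker in Python, pricing a single binary question about a proposed project. Everything in it (the class name, the liquidity parameter, the binary framing) is an illustrative assumption of ours, not a specification of the system we will actually deploy.

import math


class LMSRMarket:
    """Minimal LMSR market maker for one binary question,
    e.g. "Will this project reach its stated milestone?"."""

    def __init__(self, liquidity: float = 100.0):
        self.b = liquidity          # larger b = deeper market, slower price moves
        self.shares = [0.0, 0.0]    # outstanding shares for [NO, YES]

    def _cost(self, shares) -> float:
        # Hanson's cost function: C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * math.log(sum(math.exp(q / self.b) for q in shares))

    def price(self, outcome: int) -> float:
        # Current market-implied probability of the given outcome.
        exps = [math.exp(q / self.b) for q in self.shares]
        return exps[outcome] / sum(exps)

    def buy(self, outcome: int, amount: float) -> float:
        # Buy `amount` shares of an outcome; returns the cost charged to the trader.
        before = self._cost(self.shares)
        self.shares[outcome] += amount
        return self._cost(self.shares) - before


if __name__ == "__main__":
    market = LMSRMarket(liquidity=100.0)
    cost = market.buy(outcome=1, amount=50.0)   # a contributor bets on success
    print(f"Paid {cost:.2f} for 50 YES shares; "
          f"implied P(success) is now {market.price(1):.2f}")

In a futarchy-style process, the market-implied probabilities for each candidate project would then feed into our prioritization decisions, alongside the judgment of the expert council.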

Our time is short, and the stakes could not be higher. We must act accordingly.


Join our mailing list for updates on publications and events

© 2025 by Theomachia Labs
