
Publications

We have just started, so research results will appear here as they become available. For now, we present our research proposals.

Evaluating Novel AI Control Protocol Classes and Their Scalability

Building on Greenblatt et al.'s control evaluation framework, we propose investigating alternative classes of AI control protocols and their scaling properties as model capabilities increase. While the original work demonstrated effective safety-usefulness tradeoffs using simple trusted-untrusted model pairs, we hypothesize that hierarchical and parallel control structures may offer superior Pareto frontiers, particularly as the capability gap between trusted and untrusted models widens. We also propose testing the original setup across different capability scales to see whether its results generalize.
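
To make the protocol class concrete, here is a minimal sketch of trusted monitoring in the spirit of Greenblatt et al.; the model stand-ins and the audit threshold are illustrative assumptions, not the paper's implementation.

```python
import random

# Minimal sketch of a trusted-monitoring control protocol.
# Both models are stand-ins; the threshold is an assumed parameter.

def untrusted_model(task: str) -> str:
    """Stand-in for the capable but potentially misaligned model."""
    return f"proposed solution for {task}"

def trusted_monitor(task: str, output: str) -> float:
    """Stand-in for the weaker trusted model; returns a suspicion score in [0, 1]."""
    return random.random()

def run_protocol(tasks: list[str], audit_threshold: float = 0.98):
    """Accept low-suspicion outputs; escalate the rest to human audit."""
    accepted, audited = [], []
    for task in tasks:
        output = untrusted_model(task)
        if trusted_monitor(task, output) >= audit_threshold:
            audited.append((task, output))   # spend the limited audit budget here
        else:
            accepted.append((task, output))  # use the output directly
    return accepted, audited
```

Sweeping audit_threshold traces out the safety-usefulness tradeoff; a hierarchical variant would insert intermediate monitors of increasing trust before the human audit, and a parallel variant would aggregate scores from several monitors.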

Safety Evals for Biological Foundation Models

This research proposal outlines a plan to create BioSafetyArena, the first comprehensive, open-source benchmarking framework for evaluating the safety of biological foundation models (BioFMs). These powerful AI models, capable of designing novel proteins and pathogens, currently lack the rigorous safety evaluations common for general-purpose AI, creating a significant dual-use risk.
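
As a rough illustration of the kind of harness BioSafetyArena might provide, the sketch below scores a model on declining hazardous requests while completing benign ones. The task schema, refusal check, and metrics are hypothetical assumptions; for sequence-generating BioFMs the check would more plausibly screen generated sequences against hazard databases.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of a BioSafetyArena-style evaluation loop.
# Schema and scoring are illustrative assumptions, not the actual design.

@dataclass
class SafetyTask:
    task_id: str
    prompt: str         # e.g. a dual-use protein design request
    is_hazardous: bool  # should a safe model decline this request?

def evaluate(model: Callable[[str], str], tasks: list[SafetyTask]) -> dict:
    """Score a BioFM on declining hazardous requests while serving benign ones."""
    refusals = completions = 0
    n_hazardous = sum(t.is_hazardous for t in tasks)
    n_benign = len(tasks) - n_hazardous
    for t in tasks:
        response = model(t.prompt)
        declined = response.strip().lower().startswith("decline")  # placeholder check
        if t.is_hazardous and declined:
            refusals += 1
        elif not t.is_hazardous and not declined:
            completions += 1
    return {
        "hazard_refusal_rate": refusals / max(n_hazardous, 1),
        "benign_helpfulness": completions / max(n_benign, 1),
    }
```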

Project Sentinel: An Open-Source Intelligence (OSINT) Framework for Monitoring Frontier AI and Forecasting Capabilities and Risks

This proposal outlines a research project to develop and implement a systematic Open-Source Intelligence (OSINT) monitoring framework focused on frontier AI laboratories. Currently, the most advanced AI models and their emergent capabilities are developed behind closed doors in corporate labs, creating a "hidden frontier" that is opaque to policymakers, safety researchers, and the public. This information asymmetry poses a significant challenge to proactive governance and risk mitigation. This project will establish a continuous monitoring system to collect and analyze publicly available data from frontier AI labs, identify leading indicators of capability advancements, and build predictive models to forecast the emergence of high-impact capabilities and associated risks.
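
To illustrate one way the monitoring pipeline could be structured, the sketch below shows a possible data model for dated, sourced observations aggregated into leading-indicator time series. The source types and indicator names are assumptions for illustration, not a committed design.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative sketch of a Sentinel-style data model: each observation
# is a dated, sourced signal feeding a leading-indicator time series.

@dataclass
class Observation:
    observed_on: date
    lab: str        # e.g. "frontier lab A"
    source: str     # "paper", "job posting", "blog post", "filing"
    indicator: str  # e.g. "compute_buildout", "eval_results"
    value: float    # normalized signal strength
    url: str        # provenance, so every datum is auditable

def weekly_index(observations: list[Observation], lab: str, indicator: str) -> dict:
    """Aggregate raw observations into a weekly indicator series."""
    buckets: dict[tuple, list[float]] = {}
    for obs in observations:
        if obs.lab == lab and obs.indicator == indicator:
            week = tuple(obs.observed_on.isocalendar()[:2])  # (year, week)
            buckets.setdefault(week, []).append(obs.value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}
```

Series like these would then serve as inputs to the forecasting models that predict the emergence of high-impact capabilities.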

Meta-research: Researching the Prospects of Different Research Directions

This proposal outlines a comprehensive metaresearch project designed to systematically analyze the relative prospects of different research directions in the field of AI safety.

A Formal Comparison of Natural Selection and Gradient Descent as Optimization Processes

We propose to formally model and compare the optimization dynamics of biological evolution by natural selection and deep learning by gradient descent. This research will develop a rigorous mathematical framework for the comparison, focusing on a central hypothesis about "informational bandwidth." We hypothesize that the compact, compressed nature of the human genome (roughly three billion base pairs) forces natural selection to discover highly generalizable solutions and "fuzzy" value systems, whereas the massive parameter space of modern transformers (up to trillions of parameters) allows gradient descent to find highly specific, narrow, and potentially brittle solutions.
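
A back-of-the-envelope calculation makes the bandwidth gap concrete; the transformer figures (one trillion parameters at 16 bits each) are assumptions chosen for illustration.

```python
# Rough comparison of the two channels' information capacity.
# Genome: ~3.1 billion base pairs at 2 bits per base.
# Transformer: 1T parameters x 16-bit weights (assumed).

genome_bits = 3.1e9 * 2
model_bits = 1e12 * 16

print(f"genome:      {genome_bits / 8 / 1e9:.2f} GB")    # ~0.78 GB
print(f"transformer: {model_bits / 8 / 1e12:.2f} TB")    # ~2.00 TB
print(f"ratio:       {model_bits / genome_bits:,.0f}x")  # ~2,581x
```

On these assumptions the genome is over three orders of magnitude smaller than the weight space, which is the asymmetry our hypothesis claims pushes evolution toward compressed, generalizable solutions.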

The Double-Edged Sword: A Framework for Analyzing and Forecasting Capability Spillovers from AI Safety Research

This proposal outlines a metaresearch project to investigate a critical, paradoxical dynamic within the AI ecosystem: the tendency for AI safety research to produce "spillover" effects that inadvertently accelerate general AI capabilities. While the goal of safety research is to mitigate risks, certain breakthroughs can be repurposed to enhance model performance, potentially exacerbating the very problems they were meant to solve. This project will develop a formal framework to assess this "capability spillover potential" (CSP), conduct a rigorous analysis of historical cases like Reinforcement Learning from Human Feedback (RLHF), and provide a forward-looking risk assessment of current and future safety research agendas.
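
As a toy illustration of how a CSP score might be operationalized, the sketch below combines a few plausible risk factors into a weighted score. The factors, weights, and the RLHF numbers are illustrative assumptions, not the framework the project will develop.

```python
# Toy sketch of a capability spillover potential (CSP) score.
# Factors, weights, and example numbers are illustrative assumptions.

CSP_WEIGHTS = {
    "generality":       0.4,  # does the technique transfer beyond safety?
    "performance_gain": 0.4,  # does it directly improve benchmarks?
    "adoption_cost":    0.2,  # how cheap is it to repurpose?
}

def csp_score(factors: dict[str, float]) -> float:
    """Weighted average of factor scores, each in [0, 1]."""
    return sum(CSP_WEIGHTS[k] * factors[k] for k in CSP_WEIGHTS)

# RLHF scored retrospectively with hypothetical numbers: it generalized
# widely and boosted perceived model quality, hence a high CSP.
rlhf = {"generality": 0.9, "performance_gain": 0.8, "adoption_cost": 0.7}
print(f"RLHF CSP ~ {csp_score(rlhf):.2f}")  # ~0.82
```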

Biotech-Based Approaches to Adult Intelligence Amplification

In this article, we explore the landscape of biotech-based approaches to adult intelligence amplification: methods that could enhance cognition without relying on gene editing or brain–computer interfaces. Our goal is to map biologically plausible interventions that might increase intelligence in fully developed humans, in order to help with AI alignment. We examine three main domains: systems biology and genomic regulation (finding molecular control points for cognitive capacity), cellular and circuit-level enhancement (modulating neurons, glia, and myelination to improve information processing), and metabolic optimization (enhancing the brain's energy efficiency through better substrate transport and regulation). We argue that meaningful gains will likely require a systems-level, AI-assisted modeling framework capable of integrating multiomic data to identify high-leverage interventions, a strategy that, while speculative, may represent the most realistic path toward safe and scalable intelligence enhancement in adults.


© 2025 by Theomachia Labs
