Table Top Agents
Luca De Leo
AI-powered framework that accelerates AI governance scenario exploration through autonomous agent tabletop exercises, compressing preparation cycles from years to minutes.
We help students in Buenos Aires go from curious to published. Join research sprints, workshops, and a community of aspiring AI safety researchers.
From first steps to published researcher — here's how it works
Work by researchers connected to BAISH
Joaquín Machulsky
Master's thesis exploring AI safety through debate mechanisms, studying asymmetric capabilities and weak judges in the MNIST environment. Features an interactive demo.
Austin Meek, Eitan Sprejer, Iván Arcuschin, Austin J. Brockmeier, Steven Basart
Investigating how well chain-of-thought reasoning can be monitored for safety through faithfulness and verbosity metrics.
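To give a loose sense of what such metrics can look like, here is a toy sketch in Python. The `verbosity` and `faithful_mention` proxies below are our own illustrative stand-ins, not the metrics used in the paper.

```python
def verbosity(chain_of_thought: str) -> int:
    """Toy verbosity proxy: whitespace-token length of the reasoning trace."""
    return len(chain_of_thought.split())

def faithful_mention(chain_of_thought: str, final_answer: str) -> bool:
    """Crude faithfulness proxy: is the final answer at least mentioned
    somewhere in the reasoning trace that supposedly produced it?"""
    return final_answer.lower() in chain_of_thought.lower()

# Example monitor check on a model transcript.
cot = "The invoice totals 40 + 2 = 42, so the answer is 42."
print(verbosity(cot), faithful_mention(cot, "42"))
```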
María Victoria Carro, Denise Mester, Facundo Nieto, Oscar Stanchi, Guido Bergman, Mario Leiva, Eitan Sprejer, Luca Forziati Gangi, et al.
Studying how AI systems' internal beliefs affect their persuasiveness in debate scenarios — implications for AI safety and deception.
Eitan Sprejer, Fernando Avalos, Augusto Mariano Bernardi, José Pedro Brito de Azevedo Faustino, Jacob Haimes, Narmeen Fatimah Oozeer
A multi-judge approach to better approximate human preferences in AI systems, improving alignment evaluation.
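As a rough sketch of the general idea (not the project's actual method), combining several judges can be as simple as a weighted mean of their scores. Everything below, including the judge names and weights, is illustrative.

```python
from statistics import mean

def aggregate_judges(judge_scores: dict[str, float],
                     weights: dict[str, float] | None = None) -> float:
    """Combine per-judge ratings of one response into a single score.

    Unweighted mean by default; with weights, a weighted mean. Learned
    or preference-calibrated weightings are one natural refinement.
    """
    if weights is None:
        return mean(judge_scores.values())
    total = sum(weights.get(j, 0.0) for j in judge_scores)
    return sum(s * weights.get(j, 0.0) for j, s in judge_scores.items()) / total

# Example: three judge models rating the same model response out of 10.
print(aggregate_judges({"judge_a": 7.0, "judge_b": 8.5, "judge_c": 6.0}))
```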
Luca De Leo, James Sykes, Balázs László, Ewura Ama Etruwaa Sam
A pipeline addressing CBRN evaluation vulnerabilities through consensus detection, verified cloze scoring, and statistical evaluation with bootstrap confidence intervals.
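For readers unfamiliar with the last step, here is a minimal sketch of a percentile bootstrap over per-item scores, assuming 0/1 correctness data. It illustrates the general technique only; the function and setup are ours, not the pipeline's.

```python
import numpy as np

def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean accuracy.

    `scores` is a 1-D array of per-item correctness (0/1) or graded scores.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores)
    # Resample items with replacement and recompute the mean each time.
    means = np.array([
        rng.choice(scores, size=len(scores), replace=True).mean()
        for _ in range(n_resamples)
    ])
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return scores.mean(), (lo, hi)

# Example: simulated 0/1 correctness on 200 benchmark items.
point, (lo, hi) = bootstrap_ci(np.random.default_rng(1).integers(0, 2, 200))
print(f"accuracy = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```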
Luca De Leo, Zoé Roy-Stang, Heramb Podar, Damin Curtis, Vishakha Agrawal, Ben Smyth
Stress-tested the ASI moratorium proposed in Phase 0 of A Narrow Path, identifying four circumvention routes and proposing ten mutually reinforcing policy amendments.
Our community continues to grow — we're building our publication track record through programs like AISAR and Apart Research sprints.
Current research directions in our community
Understanding how neural networks process information internally. What circuits implement specific behaviors? How can we reverse-engineer model cognition?
Building benchmarks and testing methodologies for frontier models. How do we measure alignment? What capabilities emerge at scale?
Fundamental questions about making AI systems beneficial. How do we specify human values? What oversight mechanisms work?
Get Involved
Want to contribute to AI safety research? Let us know your background and interests, and we'll connect you with relevant projects and collaborators.
Use our contact form to tell us about your background and research interests.
Contact Us
We review messages regularly and reach out when there's a good fit.
Book a call with one of our co-founders to discuss your interests and find the right path.