
Research at BAISH

We help students in Buenos Aires go from curious to published. Join research sprints, workshops, and a community of aspiring AI safety researchers.

Join a Program · View Publications

Your Research Journey

From first steps to published researcher — here's how it works

01 Learn: AI Safety Fundamentals
02 Practice: AIS Research Workshop
03 Research: Research Sprints (you're here)
04 Launch: AISAR & Careers

Community Publications

Work by researchers connected to BAISH

Apart Research Hackathon · Nov 2025

Table Top Agents

Luca De Leo (BAISH)

An AI-powered framework that accelerates AI governance scenario exploration through autonomous-agent tabletop exercises, compressing preparation cycles from years to minutes.

Master's Thesis · Oct 2025

Explorando AI Safety via Debate: un estudio sobre capacidades asimétricas y jueces débiles en el entorno MNIST (Exploring AI Safety via Debate: a study of asymmetric capabilities and weak judges in the MNIST setting)

Joaquín Machulsky (BAISH)

A master's thesis exploring AI safety via debate mechanisms, studying asymmetric capabilities and weak judges in the MNIST setting. Includes an interactive demo.

arXiv · Oct 2025

Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity

Austin Meek, Eitan Sprejer (BAISH), Iván Arcuschin, Austin J. Brockmeier, Steven Basart

Investigating how well chain-of-thought reasoning can be monitored for safety through faithfulness and verbosity metrics.

arXiv · Oct 2025

AI Debaters are More Persuasive when Arguing in Alignment with Their Own Beliefs

María Victoria Carro, Denise Mester, Facundo Nieto, Oscar Stanchi, Guido Bergman, Mario Leiva, Eitan Sprejer (BAISH), Luca Forziati Gangi, et al.

A study of how AI systems' internal beliefs affect their persuasiveness in debate scenarios, with implications for AI safety and deception.

NeurIPS Workshop · Sep 2025

Approximating Human Preferences Using a Multi-Judge Learned System

Eitan Sprejer (BAISH), Fernando Avalos, Augusto Mariano Bernardi, José Pedro Brito de Azevedo Faustino, Jacob Haimes, Narmeen Fatimah Oozeer

A multi-judge approach to better approximate human preferences in AI systems, improving alignment evaluation.

Apart Research Hackathon · Sep 2025 · 2nd Place

RobustCBRN Eval: A Practical Benchmark Robustification Toolkit

Luca De Leo (BAISH), James Sykes, Balázs László, Ewura Ama Etruwaa Sam

A pipeline addressing CBRN evaluation vulnerabilities through consensus detection, verified cloze scoring, and statistical evaluation with bootstrap confidence intervals.

Apart Research Hackathon · Jun 2025 · 1st Place

Four Paths to Failure: Red Teaming ASI Governance

Luca De Leo (BAISH), Zoé Roy-Stang, Heramb Podar, Damin Curtis, Vishakha Agrawal, Ben Smyth

Stress-tested the Phase 0 ASI moratorium proposed in A Narrow Path, identifying four circumvention routes and proposing ten mutually reinforcing policy amendments.


Our community continues to grow — we're building our publication track record through programs like AISAR and Apart Research sprints.

What You Could Work On

Current research directions in our community

Mechanistic Interpretability

Understanding how neural networks process information internally. What circuits implement specific behaviors? How can we reverse-engineer model cognition?

LLM Evaluations

Building benchmarks and testing methodologies for frontier models. How do we measure alignment? What capabilities emerge at scale?

Alignment Theory

Fundamental questions about making AI systems beneficial. How do we specify human values? What oversight mechanisms work?

Get Involved

Express Interest in Research

Want to contribute to AI safety research? Use our contact form to tell us about your background and research interests, and we'll connect you with relevant projects and collaborators.

Contact Us

We review messages regularly and reach out when there's a good fit.

Ready to start your research journey?

Book a call with one of our co-founders to discuss your interests and find the right path.

Eitan Sprejer

Interpretability & Evaluations

Book with Eitan
Luca De Leo

Operations & Strategy

Book with Luca

Buenos Aires AI Safety Hub

© 2025 BAISH. All rights reserved.
