Claude--Climbing a CTF Scoreboard Near You

Presented at DEF CON 33 (2025), Aug. 9, 2025, 12:30 p.m. (45 minutes).

Over the past few months, we've thrown Claude into the digital trenches of multiple cybersecurity competitions, from defending vulnerable networks at CCDC to cracking challenges in PicoCTF and HackTheBox. In this talk, I'll take you through our journey deploying an AI assistant against human red teams and live CTF challenges. I'll show you Claude's unexpected wins (landing in the top 3% globally in PicoCTF and successfully fending off red team attacks at CCDC) alongside its entertaining fails (devolving into security philosophy when overwhelmed, making up flags for PlaidCTF when stuck). Drawing on these results, I'll break down the technical challenges we conquered, from building specialized tooling harnesses to keeping Claude coherent during 16+ hour competitions. This presentation will demonstrate how competitive environments reveal both the impressive capabilities and amusing limitations of today's AI systems when operating in adversarial scenarios. Join me to see what happens when an assistant trained to be helpful gets dropped into the dynamic world of CTFs and defense competitions, and what this teaches us about AI's true potential in cybersecurity.

References:

- [PicoCTF 2025](https://picoctf.org)
- [Hack The Box](https://ctf.hackthebox.com/event/details/ai-vs-human-ctf-challenge-2000)
- [Western Regional Collegiate Cyber Defense Competition](https://wrccdc.org/)
- [Palisade Research](https://palisaderesearch.org/)

Presenters:

  • Keane Lucas - Member of Technical Staff at Anthropic
    Keane is a researcher on Anthropic's Frontier Red Team focused on stress-testing AI model cybersecurity capabilities. Before joining Anthropic, Keane served as a Cyberspace Operations Officer in the US Air Force and earned his PhD at Carnegie Mellon, where his research focused on applying machine learning to malware detection.
