These bots coordinated their efforts and generated new bots as needed. Remarkably, they exploited real-world “zero-day” vulnerabilities—previously unidentified security flaws—during their attacks.
A few months prior, a research group published a paper showing how GPT-4 could autonomously exploit known security vulnerabilities (so-called one-day or N-day vulnerabilities), flaws that have been publicly disclosed but not yet patched on the affected systems. Given the vulnerability descriptions from the Common Vulnerabilities and Exposures (CVE) list, GPT-4 independently exploited 87% of the most severe vulnerabilities tested.
In a follow-up study released this week, the researchers took their work further. They successfully hacked zero-day vulnerabilities using teams of autonomous, self-replicating Large Language Model (LLM) agents. This was achieved using a method called Hierarchical Planning with Task-Specific Agents (HPTSA). Instead of relying on a single LLM agent to manage multiple complex tasks, the HPTSA method employs a “planning agent” to oversee the process. This planning agent coordinates with a “managing agent,” which then delegates tasks to various “expert subagents.” Each subagent specializes in a particular task, reducing the workload on any single agent and enhancing efficiency.
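To make that division of labor concrete, here is a minimal, non-operational Python sketch of the coordination pattern described above. The class names, the fixed plan, and the three example specialties are illustrative assumptions for this article, not code from the paper; a real system would replace each `run` and `plan_and_execute` body with LLM calls and task-specific tooling.

```python
"""Illustrative sketch of a planner -> manager -> expert-subagent hierarchy.
All names, dispatch logic, and specialties are assumptions for illustration,
not the HPTSA authors' implementation."""

from dataclasses import dataclass


@dataclass
class Finding:
    """A single observation reported back up the hierarchy."""
    specialty: str
    note: str


class ExpertSubagent:
    """Handles exactly one narrow task type (e.g. one vulnerability class)."""

    def __init__(self, specialty: str):
        self.specialty = specialty

    def run(self, target: str) -> Finding:
        # A real expert agent would wrap an LLM plus task-specific tools here.
        return Finding(self.specialty, f"checked {target} for {self.specialty}")


class ManagingAgent:
    """Receives tasks from the planner and delegates each to the right expert."""

    def __init__(self, experts: list[ExpertSubagent]):
        self.experts = {e.specialty: e for e in experts}

    def delegate(self, specialty: str, target: str) -> Finding:
        return self.experts[specialty].run(target)


class PlanningAgent:
    """Oversees the process: decides which specialties to try, in what order."""

    def __init__(self, manager: ManagingAgent):
        self.manager = manager
        self.findings: list[Finding] = []

    def plan_and_execute(self, target: str, plan: list[str]) -> list[Finding]:
        # A real planner would choose the next step from LLM output and
        # earlier findings; here we simply walk a fixed plan.
        for specialty in plan:
            self.findings.append(self.manager.delegate(specialty, target))
        return self.findings


if __name__ == "__main__":
    experts = [ExpertSubagent(s) for s in ("SQL injection", "XSS", "CSRF")]
    planner = PlanningAgent(ManagingAgent(experts))
    for finding in planner.plan_and_execute("https://example.test", ["XSS", "SQL injection"]):
        print(finding)
```

The point of the structure is the one the researchers describe: no single agent has to hold the whole problem in context, because the planner only plans, the manager only routes, and each expert only knows its own narrow task.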
When tested against 15 real-world web vulnerabilities, the HPTSA method was 550% more efficient than a single LLM agent at exploiting them. It successfully exploited 8 of the 15 zero-day vulnerabilities, while the solo LLM agent breached only 3 of the 15.
There are understandable concerns that such AI agents could be misused to attack websites and networks. Daniel Kang, a researcher and co-author of the study, noted that GPT-4, when used in chatbot mode, is "insufficient for understanding LLM capabilities" and cannot hack anything on its own. That offers some reassurance: the attack capability demonstrated in the study is not something the general public can reproduce simply by prompting the chatbot.