
The age of AI-run cyberattacks has begun

How Chinese hackers tricked Claude into hacking governments and companies all on its own.

Menu planning, therapy, essay writing, highly sophisticated global cyberattacks: People just keep coming up with innovative new uses for the latest AI chatbots.

An alarming new milestone was reached this week when the artificial intelligence company Anthropic announced that its flagship AI assistant Claude was used by Chinese hackers in what the company is calling the “first reported AI-orchestrated cyber espionage campaign.”

According to a report released by Anthropic, in mid-September, the company detected a large-scale cyberespionage operation by a group they’re calling GTG-1002, directed at “major technology corporations, financial institutions, chemical manufacturing companies, and government agencies across multiple countries.”

Attacks like that are not unusual. What makes this one stand out is that 80 to 90 percent of it was carried out by AI. After human operators identified the target organizations, they used Claude to locate valuable databases within them, test for vulnerabilities, and write code to access those databases and extract valuable data. Humans were involved only at a few critical chokepoints, to give the AI prompts and check its work.

Claude, like other major large language models, comes equipped with safeguards to prevent it from being used for this type of activity, but the attackers were able to “jailbreak” the program by breaking its task down into smaller, plausibly innocent parts and telling Claude they were a cybersecurity firm doing defensive testing. This raises some troubling questions about the degree to which safeguards on models like Claude and ChatGPT can be maneuvered around, particularly given concerns over how they could be put to use for developing bioweapons or other dangerous real-world materials.

Anthropic does admit that Claude at times during the operation “hallucinated credentials or claimed to have extracted secret information that was in fact publicly available.” Even state-sponsored hackers have to look out for AI making stuff up.

The report warns that AI tools will make cyberattacks far easier and faster to carry out, increasing the vulnerability of everything from sensitive national security systems to ordinary citizens’ bank accounts.

Still, we’re not quite in complete cyberanarchy yet. The level of technical knowledge needed to get Claude to do this is still beyond the average internet troll. But experts have been warning for years now that AI models can be used to generate malicious code for scams or espionage, a phenomenon known as “vibe hacking.” In February, Anthropic’s competitors at OpenAI reported that they had detected malicious actors from China, Iran, North Korea, and Russia using their AI tools to assist with cyber operations.

In September, the Center for a New American Security (CNAS) published a report on the threat of AI-enabled hacking. It explained that the most time- and resource-intensive parts of most cyber operations are in their planning, reconnaissance, and tool development phases. (The attacks themselves are usually rapid.) By automating these tasks, AI can be an offensive game changer — and that appears to be exactly what took place in this attack.
