AI Weaponization: State Hackers Using Google Gemini for Espionage and Malware Generation

Peter Chofield

What Happened

Google’s Threat Intelligence Group (GTIG) has confirmed that multiple state-sponsored hacking groups are actively using its Gemini large language model (LLM) to enhance their cyber espionage and attack capabilities. The activity spans reconnaissance, social engineering, vulnerability analysis, and the dynamic generation of malicious code. North Korean (UNC2970/Lazarus Group), Chinese (Mustang Panda, APT31, APT41), and Iranian (APT42) actors have been observed using the AI model to accelerate various phases of the attack lifecycle. Beyond simple queries, Google also detailed a novel malware family, HONESTCUE, which uses the Gemini API to generate its second-stage payload in real time, and an AI-generated phishing kit named COINBAIT.

Attack Chain and Technical Mechanics

Adversaries are integrating generative AI across the entire attack chain, from initial preparation to execution. The mechanics vary by actor and objective:

  • Reconnaissance and Social Engineering: North Korea’s UNC2970, known for its “Operation Dream Job” campaigns, uses Gemini to synthesize open-source intelligence (OSINT), profile employees at defense and cybersecurity firms, and map out job roles and salary data. This allows for the creation of highly convincing, tailored phishing personas and identification of high-value targets.
  • Vulnerability Analysis and Exploit Development: China’s APT31 automates the analysis of vulnerabilities, while APT41 uses the model to understand and debug open-source exploit code. This significantly reduces the time from vulnerability disclosure to active exploitation.
  • Dynamic Malware Generation (HONESTCUE): A newly identified malware framework named HONESTCUE functions as a downloader that, instead of containing a hardcoded payload, sends a prompt to the Gemini API. It receives C# source code as a response, which constitutes its “stage two” functionality. This code is then compiled and executed directly in memory using the legitimate .NET CSharpCodeProvider class. This fileless technique is highly evasive, as the malicious payload never touches the disk, frustrating traditional antivirus and forensic analysis.
  • AI-Generated Phishing Kits (COINBAIT): Financially motivated actors (UNC5356) are using AI tools to build sophisticated phishing kits like COINBAIT, which impersonates cryptocurrency exchanges to harvest credentials.
  • Model Extraction Attacks: Google also disrupted attacks aimed at stealing the Gemini model itself. In these attacks, adversaries send tens of thousands of systematic queries to replicate the proprietary model’s behavior and reasoning, effectively creating their own powerful, private copy. A rough detection sketch for this pattern follows this list.
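
The extraction pattern described above is detectable in principle because it requires an unusually large volume of structurally repetitive queries from a single client. The sketch below is a rough illustration of that idea, not something taken from Google’s report: it assumes a hypothetical request log with client_id and prompt fields, and flags clients whose daily query volume and prompt uniformity both exceed simple thresholds. The field names and threshold values are assumptions for illustration.

```python
# Illustrative sketch, not drawn from Google's report: flag clients whose query
# volume and prompt uniformity suggest a possible model-extraction attempt.
# Field names and thresholds below are hypothetical assumptions.

from collections import defaultdict
from difflib import SequenceMatcher

VOLUME_THRESHOLD = 5_000      # queries per client per day; tune to your own baseline
UNIFORMITY_THRESHOLD = 0.8    # average pairwise prompt similarity considered "templated"


def prompt_uniformity(prompts, sample_size=50):
    """Average pairwise similarity of a sample of prompts (1.0 means identical)."""
    sample = prompts[:sample_size]
    if len(sample) < 2:
        return 0.0
    scores = [
        SequenceMatcher(None, a, b).ratio()
        for i, a in enumerate(sample)
        for b in sample[i + 1:]
    ]
    return sum(scores) / len(scores)


def flag_extraction_candidates(requests):
    """requests: iterable of dicts with hypothetical 'client_id' and 'prompt' keys,
    assumed to cover a single day of API traffic."""
    prompts_by_client = defaultdict(list)
    for req in requests:
        prompts_by_client[req["client_id"]].append(req["prompt"])

    return [
        client
        for client, prompts in prompts_by_client.items()
        if len(prompts) >= VOLUME_THRESHOLD
        and prompt_uniformity(prompts) >= UNIFORMITY_THRESHOLD
    ]
```

Fixed thresholds are only a starting point; production systems would baseline per-tenant behavior instead. The underlying signal, high volume combined with templated prompts, is the same class of signal that query throttling and anomaly detection aim to capture.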

Threat Actor Behavior and Intent

The use of generative AI is not opportunistic; it is a strategic integration into the tradecraft of established nation-state threat actors. The primary intent is to increase operational efficiency, scale, and sophistication.

  • UNC2970 (North Korea/Lazarus Group): Continues its focus on espionage against the defense industrial base, using AI to refine its long-running social engineering campaigns.
  • Mustang Panda, APT31, APT41 (China): Leverage AI for intelligence gathering, vulnerability research, and code generation, supporting broad espionage and intrusion objectives.
  • APT42 (Iran): Uses AI to craft more effective social engineering lures and to accelerate the development of custom tools, such as scrapers and system management utilities.

The overarching behavior demonstrates a clear trend: adversaries are offloading cognitive and technical labor to AI models to focus on high-level strategy and execution. This blurs the line between legitimate research and malicious reconnaissance, making early-stage detection more challenging.

Strategic and Defensive Implications

The weaponization of public AI models represents a significant strategic shift in the cyber threat landscape. It lowers the barrier to entry for developing sophisticated tools and tactics, techniques, and procedures (TTPs), effectively democratizing advanced capabilities that once required significant resources. For defenders, this means the speed and complexity of threats are likely to increase substantially. The HONESTCUE malware marks a critical inflection point, demonstrating a move toward AI-as-a-service for malware command-and-control (C2) and payload delivery. This paradigm can bypass network signatures that look for known malicious downloads, instead hiding within legitimate, encrypted API traffic to services like Google’s.

What We Know — and What We Don’t

What is confirmed:

  • Multiple, specific state-sponsored groups from North Korea, China, and Iran are using Gemini for malicious purposes.
  • Observed use cases include OSINT synthesis, persona creation, vulnerability analysis, and code debugging.
  • A novel malware, HONESTCUE, uses the Gemini API to dynamically generate its in-memory payload.
  • Actors are using AI to build more effective phishing kits (COINBAIT).
  • “Model extraction” attacks are a viable and observed threat against proprietary AI systems.

What is unknown or unverified:

  • The specific prompts used by HONESTCUE to generate its second-stage payload.
  • The overall success rate of these AI-enhanced attacks compared to traditional methods.
  • Whether the code generated by the Gemini API for HONESTCUE is polymorphic or consistently similar.
  • The full extent of this activity across other LLMs not covered in Google’s report.

What Defenders Should Take Away

Defensive postures must evolve to account for this new reality. Checklist-based security is insufficient.

  1. Monitor API Traffic: Network monitoring must go beyond blocking known-bad domains and include behavioral analysis of API traffic to generative AI services. Anomalous or high-frequency queries from servers or non-developer endpoints should be a significant red flag.
  2. Focus on Endpoint Behavior: With payloads being generated in memory, endpoint detection and response (EDR) focused on process behavior (e.g., unexpected use of .NET compilers like CSharpCodeProvider) becomes more critical than file-based signatures; a rough detection sketch appears after this list.
  3. Assume Sophisticated Social Engineering: Security awareness training must prepare users for phishing and reconnaissance lures that are increasingly personalized, contextual, and grammatically flawless, and therefore harder to spot.
  4. Protect Proprietary Models: Organizations developing their own AI models must implement robust monitoring, query throttling, and anomaly detection to identify and block suspected model extraction attempts.
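
As a rough illustration of point 2, the sketch below scans a hypothetical feed of EDR process-creation events for signs of on-the-fly C# compilation, such as the .NET command-line compiler (csc.exe) being spawned by a process that is not a known development tool. CodeDom-based compilation via CSharpCodeProvider typically shells out to csc.exe even when the resulting assembly is loaded only in memory, which makes this a useful behavioral signal. The event schema, process names, and allow-list are assumptions for illustration; real EDR platforms expose equivalent telemetry through their own query languages.

```python
# Illustrative sketch, not a vendor detection rule: flag processes that trigger
# on-the-fly C# compilation outside of known development tooling.
# The event schema, process names, and allow-list below are hypothetical assumptions.

COMPILER_PROCESSES = {"csc.exe", "vbc.exe"}          # .NET command-line compilers
DEVELOPER_PARENTS = {"devenv.exe", "msbuild.exe", "dotnet.exe", "code.exe"}


def flag_inline_compilation(process_events):
    """process_events: iterable of dicts with hypothetical
    'host', 'parent_image', and 'child_image' keys."""
    alerts = []
    for event in process_events:
        parent = event["parent_image"].lower()
        child = event["child_image"].lower()
        # A compiler spawned by a non-developer parent suggests dynamic
        # compile-and-execute behavior rather than normal software development.
        if child in COMPILER_PROCESSES and parent not in DEVELOPER_PARENTS:
            alerts.append(
                f"[{event['host']}] {parent} spawned {child}: "
                "possible in-memory compile-and-execute behavior"
            )
    return alerts


if __name__ == "__main__":
    sample = [
        {"host": "hr-laptop-07", "parent_image": "update_helper.exe", "child_image": "csc.exe"},
        {"host": "build-01", "parent_image": "msbuild.exe", "child_image": "csc.exe"},
    ]
    for alert in flag_inline_compilation(sample):
        print(alert)
```

Paired with the API-traffic monitoring in point 1, flagging sustained generative-AI API calls from servers or non-developer endpoints, this covers both the delivery channel and the execution behavior that HONESTCUE-style tooling relies on.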