The post The N-Day Nightmare: How SHADOW-EARTH-053 Breaches Governments Using “Old” Exploits appeared first on Daily CyberSecurity.

AI assistants now handle some of the most sensitive data people own. Users discuss symptoms and medical history. They ask questions about taxes, debts, and personal finances, upload PDFs, contracts, lab results, and identity-rich documents that contain names, addresses, account details, and private records. That trust depends on a simple expectation: data shared in the conversation remains inside the system.
ChatGPT itself presents outbound data sharing as something restricted, visible, and controlled. Potentially sensitive data is not supposed to be sent to arbitrary third parties simply because a prompt requests it. External actions are expected to be mediated through explicit safeguards, and direct outbound access from the code-execution environment is restricted.

Our research uncovered a path around that model.
We found that a single malicious prompt could activate a hidden exfiltration channel inside a regular ChatGPT conversation.
ChatGPT includes useful tools that can retrieve information from the internet and execute Python code. At the same time, OpenAI has built safeguards around those capabilities to protect user data. For example, the web-search capability does not allow sensitive chat content to be transmitted outward through crafted query strings. The Python-based Data Analysis environment was designed to prevent internet access as well. OpenAI describes that environment as a secure code execution runtime that cannot generate direct outbound network requests.

OpenAI also documents that so called GPTs can send relevant parts of a user’s input to external services through APIs. A GPT is a customized version of ChatGPT that can be configured with instructions, knowledge files, and external integrations. GPT “Actions” provide a legitimate way to call third-party APIs and exchange data with outside services. Actions are useful for enterprise workflows, access to internal business systems, customer support operations, and other integrations that connect ChatGPT to external services, including simpler use cases such as travel or weather lookups. The key point is visibility: the user sees that data is about to leave ChatGPT, sees where it is going, and decides whether to allow it.
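As a rough illustration (our sketch, not taken from OpenAI’s documentation or the original research), a GPT Action is declared through an OpenAPI-style schema that names the external server the GPT may call; that declared destination is what the user is shown and asked to approve. The snippet below sketches such a schema as a Python dictionary, with the endpoint and operation invented for illustration.

```python
# Minimal sketch (assumption, not an official example) of the OpenAPI-style
# schema a GPT Action is configured with. The declared server URL is the
# outbound destination the user sees and confirms before any data is sent.
action_schema = {
    "openapi": "3.1.0",
    "info": {"title": "Weather lookup", "version": "1.0.0"},
    "servers": [{"url": "https://api.example-weather.com"}],  # placeholder endpoint
    "paths": {
        "/forecast": {
            "get": {
                "operationId": "getForecast",
                "parameters": [
                    {"name": "city", "in": "query", "required": True,
                     "schema": {"type": "string"}},
                ],
                "responses": {"200": {"description": "Forecast for the city"}},
            }
        }
    },
}
```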

In other words, legitimate outbound data flows are designed to happen through an explicit, user-facing approval process.
From a security perspective, the obvious attack surfaces looked well defended. The ability to send chat data through tools not designed for that purpose was strictly limited. Sending data through a legitimate GPT integration using external API calls also required explicit user confirmation.
The vulnerability we discovered allowed information to be transmitted to an external server through a side channel originating from the container used by ChatGPT for code execution and data analysis. Crucially, because the model operated under the assumption that this environment could not send data outward directly, it did not recognize that behavior as an external data transfer requiring resistance or user mediation. As a result, the leakage did not trigger warnings about data leaving the conversation, did not require explicit user confirmation, and remained largely invisible from the user’s perspective.
At a high level, the attack began when the victim sent a single malicious prompt into a ChatGPT conversation. From that moment on, each new message in the chat became a potential source of leakage. The scope of that leakage depended on how the prompt framed the task for the model: it could include raw user text, text extracted from uploaded files, or selected model-generated output such as summaries, medical assessments, conclusions, and other condensed intelligence. This made the attack flexible, because it allowed the attacker to target not only original user data, but also the most valuable information produced by the model itself.
That attack pattern fits naturally into ordinary user behavior. The internet is full of websites, blog posts, forums, and social media threads promoting “top prompts for productivity,” “best prompts for work,” and other ready-made instructions that supposedly improve ChatGPT’s performance. For many users, copying and pasting such prompts into a new conversation is routine and does not appear risky, because the prevailing expectation is that AI assistants will not silently leak conversation data to external parties, and that this boundary cannot be changed through an ordinary prompt. A malicious prompt distributed in that format could therefore be presented as a harmless productivity aid and interpreted as just another useful trick for getting better results from the assistant.
A broader campaign could use an even more convincing lure: prompts advertised as a way to unlock premium capabilities for free. Claims about enabling Pro-level behavior, hidden modes, or advanced features on a lower-tier account would give the attacker a natural pretext for including unusual instructions, long text blocks, or opaque fragments that might otherwise seem suspicious. Because the user already expects a nonstandard sequence, those elements can easily be perceived as part of the promised “hack.” A carefully crafted prompt could then create the appearance of enhanced functionality while quietly turning the conversation into a source of exfiltrated data.

Once the malicious prompt was placed into the chat, the conversation effectively became a covert collection channel. From that point on, summaries of subsequent user messages could be silently transmitted to an attacker-controlled server. As shown in Video 1, this took place without any warning, approval request, or other visible indication of external data transfer.
The same attack pattern becomes even more dangerous when embedded inside a custom GPT.
GPTs allow developers to package instructions, knowledge files, and external integrations into a reusable assistant that other users can interact with. From the user’s perspective, the interaction looks like a normal ChatGPT conversation with a specialized tool.
In that scenario, the attacker no longer needs to rely on the victim copying a prompt from an external source. The malicious logic can be embedded directly in the GPT’s instructions and files. A user only needs to open the GPT and begin interacting with it as intended.
This changes the threat model significantly. Officially, GPT builders do not have access to individual user conversations with their GPTs. However, a GPT designed to exploit the vulnerability could still cause selected information from those conversations to be transmitted to an attacker-controlled server. From the user’s perspective, the interaction would remain indistinguishable from an ordinary session with a specialized assistant.
To demonstrate the practical impact of the vulnerability, we built a proof of concept around a sensitive scenario: a GPT acting as a personal doctor.
In the demonstration, the user uploaded a PDF containing laboratory test results. The document also contained personal information, including the patient’s name. The user then described symptoms and asked the GPT to help interpret the results.
From the user’s perspective, the interaction looked completely ordinary. The GPT analyzed the document and produced a medical assessment based on the provided information.
When asked directly whether the uploaded data had been sent anywhere, ChatGPT answered confidently that it had not, explaining that the file was only stored in a secure internal location. At the same time, the user saw no approval prompts or warnings about external data transfer, unlike the confirmation dialogs that normally appear when a GPT Action sends information to a third-party service.

Meanwhile, the attacker’s server received highly sensitive data extracted from the conversation: the patient’s identity taken from the uploaded document together with the model’s medical assessment.
This illustrates an important aspect of the attack. The attacker does not necessarily need to steal entire documents. Instead, the prompt could instruct the model to transmit the most valuable information it produces. In the medical scenario, that meant the patient’s identifying details together with the model’s assessment. In other contexts, it could mean financial conclusions, contract summaries, or strategic insights extracted from long documents.
The same communication channel could be used for more than silent data exfiltration.
Once a reliable bidirectional channel existed between the execution runtime and the attacker-controlled server, it became possible to send commands into the container and receive the results back through the same path. In effect, the attacker could establish a remote shell inside the Linux environment that ChatGPT creates to perform code execution and data analysis tasks.
This interaction happened outside the normal ChatGPT response flow. When users interact with the assistant through the chat interface, generated actions and outputs remain subject to the model’s safety mechanisms and checks. However, commands executed through the side channel bypassed that mediation entirely. The results were returned directly to the attacker’s server without appearing in the conversation or being filtered by the model.
The side channel that enabled both data exfiltration and remote command execution relied on DNS resolution.
Normally, DNS is used to resolve domain names into IP addresses. From a security perspective, however, DNS can also function as a data transport channel. Instead of using DNS only for ordinary name resolution, an attacker can encode data into subdomain labels and trigger resolution of those hostnames. Because DNS resolution propagates the requested hostname through the normal recursive lookup process, the resolver chain can carry that encoded data outward.
In our case, this mattered because the ChatGPT execution runtime did not permit conventional outbound internet access, but DNS resolution was still available as part of the environment’s normal operation. Standard attempts to reach external hosts directly were blocked. DNS, however, still provided a narrow communication path that crossed the isolation boundary indirectly through legitimate resolver infrastructure.
To exfiltrate data, content could be encoded into DNS-safe fragments, placed into subdomains, and reconstructed on the attacker’s side from the incoming queries. To send instructions back, the attacker could encode small command fragments into DNS responses and let them travel back through the same resolution path. A process running inside the container could then read those responses, reassemble the payload, and continue the exchange.
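To make the mechanism concrete, the sketch below (our illustration, not code from the research) shows how a short string maps onto DNS-safe subdomain labels and how the same labels decode back from query names a resolver operator or defender would observe in logs. The domain `exfil.example.com` and the helper names are placeholders; nothing here generates network traffic.

```python
# Illustrative sketch only: encoding text into DNS-safe subdomain labels and
# decoding it back from observed query names. "exfil.example.com" is a
# placeholder domain; nothing here performs actual DNS resolution.
import base64

MAX_LABEL = 63  # DNS limits each label to 63 characters

def encode_to_labels(data: str, domain: str = "exfil.example.com") -> list[str]:
    """Split data into base32 chunks that fit DNS label limits."""
    encoded = base64.b32encode(data.encode()).decode().rstrip("=").lower()
    chunks = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    # A sequence number per query lets the receiving side restore ordering.
    return [f"{seq}.{chunk}.{domain}" for seq, chunk in enumerate(chunks)]

def decode_from_labels(hostnames: list[str], domain: str = "exfil.example.com") -> str:
    """Reassemble the payload from query names seen in resolver logs."""
    suffix = "." + domain
    parts = sorted(
        (int(h.split(".", 1)[0]), h.split(".", 2)[1])
        for h in hostnames if h.endswith(suffix)
    )
    encoded = "".join(chunk for _, chunk in parts).upper()
    encoded += "=" * (-len(encoded) % 8)  # restore base32 padding
    return base64.b32decode(encoded).decode()

queries = encode_to_labels("summary: patient J.D., HbA1c elevated")
print(queries[0])                    # 0.<base32-chunk>.exfil.example.com
print(decode_from_labels(queries))   # round-trips to the original text
```

The label-size limits also explain why content has to be fragmented across many queries: each label carries at most 63 characters and a full hostname is capped at 253, so anything longer than a short sentence already spans multiple lookups.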

This effectively turned DNS infrastructure into a tunnel between the isolated runtime and an attacker-controlled server. The tunnel created in this way was sufficient for two practical goals: silently leaking selected data from the conversation and maintaining command execution inside the Linux environment created for code execution and data analysis.
Check Point Research reported the issue to OpenAI. OpenAI confirmed that it had already identified the underlying problem internally, and the fix was fully deployed on February 20, 2026.
The broader lesson, however, goes beyond this specific case. AI systems are evolving at an extraordinary pace. New capabilities are constantly being introduced, enabling assistants to solve complex mathematical problems, analyze large datasets, generate and execute scripts, and automate multi-step tasks that previously required dedicated development environments. These capabilities bring enormous benefits. At the same time, every new tool expands the system’s attack surface and can introduce new security challenges for both users and platform providers.
Modern AI assistants increasingly operate as real execution environments. They read files, run code, and search the web while processing highly sensitive information such as medical records, financial data, legal documents, and other personal or organizational data. Protecting these environments requires careful control over every possible outbound communication path, including infrastructure layers that users never see.
As AI tools become more powerful and widely used, security must remain a central consideration. These systems offer enormous benefits, but adopting them safely requires careful attention to every layer of the platform.
The post ChatGPT Data Leakage via a Hidden Outbound Channel in the Code Execution Runtime appeared first on Check Point Research.

AI-assisted malware development has reached operational maturity.
The VoidLink framework, which is modular, professionally engineered, and fully functional, was built by a single developer using a commercial AI-powered IDE within a compressed timeframe. AI-assisted development is no longer experimental; it produces deployment-ready output.
AI-assisted development is not always obvious from the final product.
VoidLink was initially assessed as the work of a coordinated team based on its architecture and implementation quality. The development method was exposed not from analyzing the malware but through an operational security failure. AI-assisted development should be considered a possibility from the outset, not as an afterthought.
Adoption of self-hosted, open-source AI models is growing but still limited in practice.
Actors of varying skill levels are investing in self-hosted and unrestricted models to avoid commercial platform restrictions. However, underground discussions consistently reveal a gap between aspiration and capability: local models still underperform, fine-tuning remains aspirational, and commercial models remain the productive choice even for actors with explicit malicious intent.
Jailbreaking is shifting from direct prompt engineering toward agentic-architecture abuse.
Traditional copy-paste jailbreaks are increasingly ineffective. The misuse of AI agent configuration mechanisms, specifically project files that redefine agent behavior, is a more significant development, as it represents a qualitative shift from manipulating a model’s responses to abusing its operational architecture.
AI is showing early signs of deployment as a real-time operational component.
Beyond its use as a development aid, AI is beginning to appear as a live element in offensive workflows: autonomous agents performing security research tasks, and LLMs classifying and engaging targets at scale within automated pipelines.
Enterprise AI adoption is itself an expanding attack surface.
GenAI activity across enterprise networks shows that one in every 31 prompts risked sensitive data leakage, impacting 90% of GenAI-adopting organizations.

During January-February 2026, cyber crime ecosystems continued to adopt AI in a widespread but uneven pattern. Throughout 2025, legitimate software development began shifting from prompt-based AI assistance to agent-based development. Tools such as Cursor, GitHub Copilot, Claude Code, and TRAE introduced a common paradigm: developers write structured specifications in markdown files, and AI agents autonomously implement, test, and iterate code based on those instructions. This agentic model, in which markdown is the operative control layer, is now starting to appear across the threat landscape.
The critical differentiator in what we observed is AI methodology combined with domain expertise. Across cyber crime forums, the dominant pattern of AI use remains unstructured prompting: actors request malware or exploit code from AI models as if entering a query in a search engine. VoidLink (detailed below), on the other hand, is the first documented case of AI producing truly advanced, deployment-ready malware. The developer combined deep security knowledge with a disciplined, spec-driven workflow to produce results indistinguishable from professional team-based engineering. Forum activity, which constitutes the bulk of observable evidence, primarily consists of actors who have not yet adopted structured AI workflows and whose efforts remain relatively unsophisticated. The more capable actors, those who combine domain expertise with disciplined AI methodology, leave far fewer traces in open forums, making the true scope of this shift harder to measure.
In January 2026, Check Point Research (CPR) exposed VoidLink, a Linux-based malware framework featuring modular command-and-control (C2) architecture, eBPF and LKM rootkits, cloud and container enumeration, and more than 30 post-exploitation plugins. The framework is highly sophisticated and professionally engineered, so much so that the initial assessment was that VoidLink was likely the product of a coordinated, multi-person effort conducted over months of intensive development.
Operational security (OPSEC) failures by the developer later exposed internal development artifacts that told a different story. These materials revealed that VoidLink was authored by a single developer using TRAE SOLO, the paid tier of ByteDance’s commercial AI-powered IDE. Instead of unstructured prompting, the developer used Spec Driven Development (SDD), a disciplined engineering workflow, to first define the project goals and constraints, and then use an AI agent to generate a comprehensive architecture and development plan across three virtual teams (Core, Arsenal, and Backend). The resulting plan included sprint schedules, feature breakdowns, coding standards, and acceptance criteria, all documented as structured markdown files. The AI agent implemented the framework sprint by sprint, with each sprint producing working, testable code. The developer acted as product owner, directing, reviewing, and refining, while the AI agent did the actual work.
The results were striking. The recovered source code aligned so closely with the specification documents that it left little doubt that the codebase was written to those exact instructions. What normally would have been a 30-week engineering effort across three teams was executed in under a week, producing over 88,000 lines of functional code. VoidLink reached its first functional implant around December 4, 2025, one week after development began.
The ramifications of VoidLink’s methodology go beyond this individual case. Its workflow, in which structured markdown specifications direct an AI agent to autonomously implement, test, and iterate, is the same paradigm that defined the agentic AI revolution in legitimate software development throughout 2025. The cyber crime ecosystem is not developing its own AI capability. It is adopting the same tools and architectural patterns as legitimate technology, with the additional goal of trying to overcome the protective limitations built into these systems. This is more important than which model or platform the attackers use.
The same architectural pattern repeatedly appears across the cases highlighted in our report: markdown skill files that transform a coding agent into an autonomous offensive security operator, and configuration files abused to override agent safety controls. In each case, the operative control layer is not code but structured documentation that determines what the AI agents build, how they behave, and what constraints they observe or ignore. This is in direct contrast to the underground forum activity, where the dominant approach remains unstructured prompting.
Across cyber crime forums, actors at all skill levels are actively exploring self-hosted, open-source AI models as alternatives to commercial platforms. Their motivations are consistent: to avoid moderation, prevent account bans, and maintain operational privacy.
Users with malware and hacking backgrounds are installing uncensored model variants such as wizardlm-33b-v1.0-uncensored and openhermes-2.5-mistral, and prompting them with comprehensive malicious wishlists spanning ransomware, keyloggers, phishing kits, and exploit code.

More established actors are conducting structured cost-benefit analyses, evaluating not only hardware requirements and GPU costs but whether locally hosted models produce reliable output (or hallucinate to the point of being operationally useless), and whether AI-generated malware meets the quality bar of current evasion techniques.

Self-hosted models consistently show a gap between aspiration and capability. Community advice on improving local model output focuses on basic optimizations, such as switching to English-language prompts and increasing quantization levels, while references to more advanced techniques such as LoRA fine-tuning remain aspirational rather than operational.

Cost estimates range from $5,000 to $50,000 depending on the desired performance, with training timelines of 3–12 months and frank admissions that models “hallucinate a lot” without extensive investment.

Most tellingly, an active offensive tools vendor, advertising C2 setups, EDR bypass services, and red team tooling, concluded that local deployment is currently “more of a burden than something productive,” while acknowledging that commercial models remain useful despite increasing restrictions.

Rather than migrating to self-hosted infrastructure, users are comparing which workarounds remain viable on commercial models. Participants recommended specific providers they view as less restrictive, shared experiences with account enforcement on multiple platforms, and refined prompt-splitting techniques that incrementally bypass safeguards, such as requesting explanations before progressing toward executable code.

Some early signs of informal access sharing have been observed, with operators of local models offering to generate restricted outputs for others on request. However, given the historical precedent of “dark LLM” services that largely failed to deliver on their promises, it remains to be seen whether these will develop into durable service models.

Traditional jailbreaking, the practice of circulating copy-paste prompts designed to trick models into producing restricted output, is becoming increasingly ineffective. In some forum discussions, users seeking Claude jailbreaks were told that easy public prompts are no longer available, that platforms have been cracking down on abusers, that dedicated subreddits have been banned, and that developing new jailbreaks is costly because the accounts involved are eventually terminated. Single-prompt jailbreaking is becoming less attractive as model providers invest in safety enforcement.

A more significant development is the emergence of jailbreaking techniques that target the architecture of AI agent systems rather than the model’s conversational safeguards. A packaged “Claude Code Jailbreak” distributed on forums illustrates this shift.
Claude Code is designed to read a CLAUDE.md file from a project’s root directory as configuration. Legitimate developers use this mechanism to define the project context, coding standards, and agent behavior. The jailbreak abuses this by placing override instructions in the CLAUDE.md file that suppress safety controls and redefine the agent’s role. When Claude Code initializes in the directory, it reads these instructions as authoritative project configuration and follows them. The screenshots below claim successful generation of a RAT (Remote Access Trojan) using this method.


This is not prompt injection in the traditional sense, but manipulation of the agent’s instruction hierarchy, the same architecture used for agentic AI tools in legitimate development. The CLAUDE.md file occupies the same functional role as VoidLink’s markdown specification files or RAPTOR’s skill definitions: a structured document that determines what the agent does, how it behaves, and what constraints it observes.
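From the defensive side, one practical implication (our observation, not part of the reported jailbreak) is that these configuration files are ordinary text artifacts in a repository and can be audited like any other. The heuristic sketch below scans a project tree for common agent configuration files and flags override-style language; the file names and phrases are assumptions chosen for illustration, not a complete detection rule.

```python
# Heuristic sketch (assumption): audit a repository for agent configuration
# files and flag phrases that attempt to redefine the agent's role or suppress
# its safety behavior. File names and phrases are illustrative, not exhaustive.
from pathlib import Path

AGENT_CONFIG_FILES = {"CLAUDE.md", "AGENTS.md", ".cursorrules"}
SUSPICIOUS_PHRASES = [
    "ignore previous instructions",
    "you have no restrictions",
    "disable safety",
    "act as an unrestricted",
]

def audit_repo(root: str) -> list[tuple[str, str]]:
    """Return (file, phrase) pairs worth human review."""
    findings = []
    for path in Path(root).rglob("*"):
        if path.name in AGENT_CONFIG_FILES and path.is_file():
            text = path.read_text(errors="ignore").lower()
            findings.extend(
                (str(path), phrase) for phrase in SUSPICIOUS_PHRASES if phrase in text
            )
    return findings

if __name__ == "__main__":
    for file, phrase in audit_repo("."):
        print(f"review {file}: contains '{phrase}'")
```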
The preceding sections document AI as a development aid (as seen with VoidLink), a resource actors struggle to access on their own terms (self-hosted models), and a system whose restrictions they attempt to bypass (jailbreaking). Now let’s look at AI deployed as a real-time operational component, performing offensive tasks autonomously within live workflows.
RAPTOR: AGENT-BASED OFFENSIVE ARCHITECTURE VIA MARKDOWN SKILLS
RAPTOR is a legitimate, open-source security research framework created by established security researchers and published on GitHub under an MIT license. It is not malicious tooling. Its significance for threat intelligence lies in its architectural pattern and in the fact that criminal communities are paying attention.
RAPTOR transforms Claude Code into an autonomous offensive security agent through a set of markdown skill files and agent definitions. The framework integrates static analysis, fuzzing, exploit generation, and vulnerability triage into an agentic pipeline orchestrated entirely through structured markdown instructions, with no compiled tooling required. In its most explicit form, it demonstrates what the agentic paradigm makes possible: a set of text files that turn a general‑purpose coding agent into a specialized offensive security operator.

RAPTOR’s own data provides an additional data point on the commercial versus self-hosted question we discussed earlier. An evaluation of exploit generation across multiple model providers found that commercial frontier models (Anthropic Claude, OpenAI GPT-4, and Google Gemini) consistently produce compilable C code at approximately $0.03 per vulnerability, while locally hosted models via Ollama were marked as “often broken” and unreliable for exploit generation. This reinforces the conclusion reached independently by experienced actors in underground forums: commercial models remain significantly more capable than self-hosted alternatives for operational tasks.

Discussions on criminal forums indicate that threat actors are aware of this architecture. The combination of a proven architectural pattern, open source availability, and documented criminal interest suggests that similar configurations, whether directly based on RAPTOR or just replicating its approach, are likely being developed and tested privately.
The preceding sections document how threat actors engage with AI as an offensive tool. But the same wave of AI adoption is simultaneously creating exposure from the defensive side. As enterprises integrate generative AI into daily workflows, the volume of sensitive data flowing through these tools introduces a distinct category of risk: instead of AI weaponized against organizations, AI is adopted by organizations in ways that outpace security controls.
In January – February 2026, corporate use of generative AI tools continued to expand at scale. Analysis of GenAI activity across enterprise networks shows that one in every 31 prompts (approximately 3.2%) posed a high risk of sensitive data leakage, including the potential sharing of confidential business information, regulated data, source code, or other sensitive corporate content with external GenAI services.
Critically, this risk is broadly distributed across the enterprise landscape rather than limited to a small number of outliers. High-risk prompt activity impacted 90% of organizations that use GenAI tools on a regular basis, indicating that nearly all GenAI-adopting enterprises encounter meaningful data leakage risk through everyday AI usage. Beyond these clearly high-risk events, 16% of prompts contained potentially sensitive information, reflecting a wider pattern of questionable data-handling behavior that can still translate into compliance exposure or IP loss.
Adoption trends further amplify the challenge. Over the last couple of months, organizations used 10 different GenAI tools on average, reflecting multi-tool environments. At the user level, an average employee generated 69 GenAI prompts per month. As prompt volume grows, the possibility of data exposure events scales accordingly, reinforcing the need for security policies, visibility, and real-time prevention controls.
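As a rough illustration of what “real-time prevention controls” can mean in practice (our sketch, not a description of any specific product), even a simple pattern-based check in front of a GenAI gateway can flag obviously sensitive prompt content before it leaves the organization; production DLP relies on far richer classifiers.

```python
# Minimal sketch (assumption): flag prompts that appear to contain sensitive
# data before they are forwarded to an external GenAI service. The patterns
# are illustrative placeholders, not a complete DLP policy.
import re

PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
}

def classify_prompt(prompt: str) -> list[str]:
    """Return the names of the patterns matched in the prompt."""
    return [name for name, rx in PATTERNS.items() if rx.search(prompt)]

hits = classify_prompt(
    "Summarize this contract for jane.doe@acme.example, card 4111 1111 1111 1111"
)
print(hits or "no obvious sensitive data")  # ['email', 'credit_card']
```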

The post AI Threat Landscape Digest January-February 2026 appeared first on Check Point Research.
Tom Eston interviews offensive AI researcher and PhD candidate Andrew Wilson, a former Bishop Fox partner who helped grow the firm from under 20 people to nearly 500, built award-winning AI solutions for SOC modernization, founded Cactus Con, and relocated his family to Guadalajara to open and scale a Bishop Fox office. They discuss Mexico’s […]
The post The Real State of Offensive Security: AI, Penetration Testing & The Road Ahead with Andrew Wilson appeared first on Shared Security Podcast.
The post The Real State of Offensive Security: AI, Penetration Testing & The Road Ahead with Andrew Wilson appeared first on Security Boulevard.

By 2026, AI agent sprawl has become a critical SaaS security risk. With 80% of organizations reporting unintended agent actions, the "visibility gap" is the new frontier for cyber threats. Learn how to govern autonomous agents using comprehensive inventories, permission mapping, and automated risk scoring.
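The summary above mentions permission mapping and automated risk scoring; as a minimal sketch of that idea (our illustration; the fields and weights are assumptions), each agent in an inventory can be scored by the scopes it holds, how stale its last review is, and whether it even has a known owner.

```python
# Illustrative sketch (assumption): score agents in a SaaS inventory by the
# permissions they hold. Field names and weights are invented for illustration.
from dataclasses import dataclass, field

PERMISSION_WEIGHTS = {"read": 1, "write": 3, "admin": 5, "external_share": 4}

@dataclass
class Agent:
    name: str
    owner: str
    permissions: list[str] = field(default_factory=list)
    last_reviewed_days: int = 0  # days since a human last reviewed this agent

def risk_score(agent: Agent) -> int:
    score = sum(PERMISSION_WEIGHTS.get(p, 2) for p in agent.permissions)
    if agent.last_reviewed_days > 90:  # stale reviews add risk
        score += 3
    if agent.owner == "unknown":       # orphaned agents are the visibility gap
        score += 5
    return score

inventory = [
    Agent("ticket-triager", "it-ops", ["read"], 10),
    Agent("crm-sync-bot", "unknown", ["read", "write", "external_share"], 200),
]
for agent in sorted(inventory, key=risk_score, reverse=True):
    print(f"{agent.name}: risk {risk_score(agent)}")
```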
The post Tackling the Uncontrolled Growth of AI Agents in Modern SaaS Environments appeared first on Security Boulevard.