Agentic AI in Cybersecurity: New Battlefield, New Risks, New Rules

Isla S.

Aug 7, 2025

•

0 minute read

Table of Contents

Heading H2

About the Author

Isla S.

Isla Sibanda is an ethical hacker and cybersecurity specialist based out of Pretoria. For over twelve years, she's worked as a cybersecurity analyst and penetration testing specialist for several reputable companies - including Standard Bank Group, CipherWave, and Axxess.

AI has already had a huge impact on cybersecurity for both defenders and attackers. But Agentic AI promises even greater upheavals.

Agentic AI systems perceive their environment, reason about what they see, and then act on that reasoning, often in milliseconds and without a human in the loop. The potential for cybersecurity is huge: autonomous AI systems that can interact with every part of a network to identify and close vulnerabilities, spot anomalies, and close breaches with incredible speed and accuracy.

In 2024 and the first half of 2025, these autonomous agents have begun to show up inside real-world security operations centers, prioritising alerts, hunting intruders, and even writing remediation code on the fly. Yet the same autonomy that lets defenders move at machine speed also multiplies the blast radius when something goes wrong.

This article explores the upside, the downside, and the emerging street fight between defender agents and criminal agents.

Agentic AI in Cybersecurity Defence

Autonomous triage and investigation

IBM’s newly launched Autonomous Threat Operations Machine (ATOM) digests SIEM and EDR feeds, deduplicates alerts, and spins up its own investigative playbooks. Microsoft is traveling the same road. The phishing-triage agent inside Microsoft Security Copilot, unveiled in March 2025, automatically clusters related alerts, pulls context from threat-intelligence graphs, and either resolves the issue or hands over a prepared case to an analyst.

Adaptive threat-hunting and incident response

Because agentic AI holds memory and a goal, it can shift from passive detection to active pursuit. The University of Kansas’ Health System uses an agent that continuously queries log stores against MITRE ATT&CK patterns, which boosted visibility by 98% and now quarantines most malicious hosts before a human ever opens the ticket.

Recent reviews of production deployments found average response-time reductions of 52% when agents were allowed to push firewall rules and IAM policies directly.

While speedy response times and identifying threats quicker are great for security, the real pay-off is cognitive bandwidth: analysts spend their time on root-cause analysis instead of log surfing.

Vulnerability management and automated pentesting

Google’s Project Zero and DeepMind built an autonomous fuzzing agent that discovered a SQLite stack buffer underflow eight months before the bug landed in an official release.

Because the agent can generate exploit code as well as inputs, it doubles as an internal red-team tool, helping product security teams patch weaknesses before they appear in the wild.

Augmented cybersecurity workforce

This year, the global shortfall of cybersecurity professionals is projected to hit 3.5 million. Agentic should copilots offset this gap: Microsoft says Security Copilot’s assistants cut mean time to resolution by 30%.

A raft of start-ups position agents as tier-one analysts that never burn out, freeing humans for threat hunting and policy work and offsetting the talent gap until it can be closed.

Automated policy enforcement

Beyond triage, agentic AI is now acting as a real-time policy cop. Defenders can write a single intent that the agent pushes across every firewall, closing segmentation gaps the instant they appear. Agents can play the same trick for identity by continuously proposing RBAC tweaks whenever roles, devices or risk scores shift. Out at the network edge, AI can prune redundant rules and surface zero-trust violations, turning days of firewall audits into minutes of machine-speed hygiene.

New Vulnerabilities Introduced by Agentic AI

The same autonomy that accelerates defense can also upend traditional assurance models.

There are four distinct layers to agentic AI: perception, reasoning, action, and memory. Each has its own attack surface that introduces new vulnerabilities that need to be taken into account.

Perception: poisoned sight

When an agent ingests poisoned data, every downstream decision is skewed. Researchers found more than one hundred malicious models on Hugging Face, some booby-trapped with pickle payloads that can poison agents on being imported into them.

Reasoning: compromised logic

Bugs in the frameworks that host models can leak secrets or hand attackers remote code execution. For example, misconfigured PyTorch-Serve can instance and open S3 buckets for model artefacts to create direct paths into the trust boundary.

Model-extraction research has shown that black-box query attacks can replicate the behaviour of frontier LLMs with ordinary API access, which is a major vulnerability in terms of intellectual property and safety guarantees.

Action: hijacked execution

A process known as an 'Imprompter' attack demonstrated an 80% success rate at tricking live agents into exfiltrating personally identifiable information via an obfuscated markdown command, all without leaving any jailbreak text visible to the user. Similarly, invisible prompt injection using zero-width Unicode delivers control while hiding the payload from logs and reviewers.

Agentic AI’s sophistication means it can access even the most secure parts of a network. Last year, Google disclosed a Vertex AI API flaw that let attackers bypass VPC Service Controls and pull data out of supposedly isolated projects.

Memory: corrupted context

Because agentic systems store conversation and state, a single successful injection can persist indefinitely, quietly rewriting the agent’s worldview or leaking private chat history over time. OWASP now ranks prompt injection as its top Gen-AI risk for 2025.

The cumulative effect is a dynamic, hard-to-predict attack surface that shatters the static assumptions behind most security controls.

The Adversarial Advantage: How Criminals Weaponise Agentic AI

With all the potential power of agentic AI, criminal crews aren’t waiting around. The flood of open-source models has already created a new kind of hacker enabled by plug-and-play, AI-driven cybercrime kits.

Harvard researchers found that GPT-generated phishing emails outperformed human-written lures in click-through tests. Agentic AI promises even greater sophistication when it comes to phishing attacks.

Attackers can now chain specialized agents: one can scrape social media outputs to gather social engineering information, while another drafts multilingual spear-phishing messages, and a third spins up polymorphic malware that mutates after each execution. They can run extensive penetration tests across an entire supply chain to find vulnerabilities. The interconnectedness of agentic AI means a vulnerability at any point in the supply chain can enable cascading attacks across huge numbers of orgnaizations.

Agentic AI also promises to enhance the power of emerging phishing tactics, like deepfakes and vishing. There are reports of deepfake 'repeaters' that probe KYC systems with near-identical faces until a match slips through. Voice is joining video: we’ve seen a 1,300% surge in synthetic-voice fraud attempts since the advent of agentic AI.

The barrier to entry keeps falling. Agentic AI-enabled cybercrime-as-a-service means that there’s very little technical knowledge needed to enact these attacks: all you need is a healthy enough crypto wallet to pay for these services.

Fighting back: Defending the new perimeter

The traditional perimeter—routers, firewalls, and VPN choke points—is fading into the background noise of cloud sprawl. With agentic AI, the battleground shifts inward: the critical trust boundary now lives between an autonomous agent and every API, datastore, or SaaS tile it can touch. Securing that mesh requires an identity-first mindset, supply-chain rigor, and continuous adversarial pressure. Below is a pragmatic blueprint, built for CISOs who must lock things down without stalling innovation.

Treat agents as privileged identities: assign every agent a fully governed identity, enforce least-privilege RBAC, issue just-in-time credentials, and monitor continuous posture—so when (not if) an agent is compromised, the blast radius stays small..
Sandbox and observe: Never let a fresh agent march into production. Keep every new agent in shadow-mode for 30 days, graduate it only after it beats human benchmarks, and wire in a one-click kill-switch to drop it the instant it misbehaves.
Secure the supply chain: Require signed models and SBOMs, hash-pin every dependency, and detonate third-party weights in an isolated lab before they ever touch production.
Fortify the prompt pipeline: Scrub zero-width Unicode, route requests through a prompt firewall that whitelists approved tools and strips PII, and plant canary commands to expose hidden injections.
Hard-seal agent memory: Set TTLs on context, encrypt state at rest, and checksum conversation blocks so silent edits trigger alerts before they spread.
Red-team in LLM-speak: Run MITRE ATLAS playbooks, host quarterly jailbreak bounties, and pour chaos prompts into agents until the guardrails bend—but never break.
Strengthen authentication against deepfakes: Combine voice liveness with behavioral biometrics, add real-time challenge-response video, and cross-check every request against device and location telemetry.

Getting started: from pilot to production

Successfully rolling out agentic AI means balancing speed with caution. Security leaders must demonstrate clear results while limiting potential harm. Here’s a step-by-step roadmap for taking your agent from controlled pilot to trusted production assistant:

Form your “agent guild”:

Create a dedicated cross-functional team—pairing SOC analysts, MLOps specialists, and risk management leads—to oversee every aspect of agent deployment, governance, and operations.

Implement an AI risk gate:

Before production access, subject each agent to a formal risk assessment using standards like the NIST AI RMF. Ensure clear documentation on model provenance, data lineage, and threat modeling outcomes.

Begin with low-risk pilots:

Start your agent’s journey with a clearly defined, low-blast-radius workflow such as phishing-triage or incident-ticket enrichment. Run it in shadow (read-only) mode for 30 days to benchmark performance against human analysts.

Promote gradually and deliberately:

Adopt a staged release strategy:

Shadow (read-only)—observe recommendations.
Supervised write—agent proposes actions, humans confirm execution.
Conditional autonomy—agent auto-executes with real-time oversight and rollback capability.
Full autonomy—only after meeting established KPIs consistently across multiple cycles.

Define clear success metrics:

Set concrete targets for the pilot, such as ≥30% reduction in MTTR (Security Copilot customers already hit this mark), ≥90% alert deduplication, and increasing analyst-trust scores. Use these numbers to demonstrate measurable ROI to your leadership.

Prioritize analyst training and engagement:

Integrate continuous up-skilling into analysts' workflows—including regular prompt red-teaming, adversarial simulations, and RLHF (Reinforcement Learning with Human Feedback)—to maintain analyst engagement and ensure agent effectiveness.

Version everything, automate rollback:

Treat agent deployments like code, storing prompts, policies, and configurations as versioned infrastructure-as-code (IaC). If monitoring detects unexpected behavior, automatically revert to a stable build.

Embrace agentic AI without losing control

Agentic AI is transforming cybersecurity operations by moving detection and response from human speeds to machine-driven automation. Handled strategically, these autonomous agents provide IT and security teams with desperately needed efficiency, allowing analysts to finally shift focus from endless alerts to proactive defense.

Yet autonomy comes with serious risks. Without proper governance, these powerful tools can quickly become the insider threats you’ve always feared, amplifying rather than mitigating your cyber risks. Your best defense? A structured, identity-first approach. Treat every agent as a privileged identity, rigorously secure your AI supply chain, and proactively stress-test your agents through continuous red-teaming and monitoring.

Ultimately, the difference between harnessing AI and losing control comes down to preparation. Start now by assembling your agent guild, running low-risk pilots, and embedding robust governance practices into your workflows.

The race is on, and the winners will be the teams whose agents move faster, learn safely, and remain accountable in a landscape that now thinks for itself.

‍

Frequently asked questions

What is the technology strategy framework?

A technology strategy framework is essential for businesses to effectively leverage technology to enhance operational efficiency, customer experience, and foster innovation while managing risks. This framework is often referred to as IT strategy or digital strategy.

What is an IT strategy framework?

An IT strategy framework is essential for aligning technology initiatives with business objectives, providing a clear structure to achieve strategic goals. By implementing this framework, organizations can ensure that their IT investments effectively support their overall business strategy.

Why is aligning IT goals with business objectives important?

Aligning IT goals with business objectives is crucial because it ensures that IT initiatives directly support the overall business strategy, driving growth and efficiency. This alignment facilitates better resource allocation and maximizes the impact of technology on business performance.

How can emerging technologies be leveraged in an IT strategy?

Leveraging emerging technologies in your IT strategy can drive innovation and create competitive advantages through the development of new business models and increased market value. Embracing these technologies ensures your organization stays ahead in a rapidly evolving landscape.

What are some common challenges in IT strategy implementation?

Common challenges in IT strategy implementation include a lack of alignment with organizational goals, resistance to change from stakeholders, and the tendency to adopt new technologies without clear value, often referred to as "shiny object syndrome." Addressing these challenges is crucial for successful execution.

On the same issue

Discover

Prey's Powerful Features

Protect your devices with Prey's comprehensive security suite.

Learn more Get started