Why Most Agentic AI Deployments Fail (and How to Get It Right)

Fang Yu

Financial crime is already agentic.

That's not a forecast. It's the current state. Today, AI systems are profiling victims at scale, building synthetic identities autonomously, and recalibrating attacks in real time when defenses push back. According to Interpol, agentic AI is 4.5 times more profitable than traditional fraud — which explains why adoption on the criminal side isn't slowing down.

The implication for fraud and AML teams is uncomfortable: the operating model that worked five years ago — reactive responses, manual rule updates, investigation queues measured in days — was built for a different era of threat. It doesn't match the speed, scale, or adaptability of what teams are up against now.

The answer the industry is converging on is agentic AI. But convergence on a category doesn't mean convergence on what it should actually do, how it should be deployed, or what separates a system that works in a regulated environment from one that creates new risks while trying to reduce them.

This piece covers four things: what agentic AI actually means in the context of financial crime, why it improves both detection and operations, why so many early deployments are getting it wrong, and why DataVisor’s conversational AI agent is the new gold standard.

What Agentic AI Actually Means (and What It Doesn't)

The term "agentic AI" is being applied to almost everything right now. Chatbots, copilots, recommendation engines, summarization tools: if it involves a language model or an automation, someone is calling it an agent.

What is AI Chat?

AI chat refers to systems that generate responses to user prompts using natural language processing.

  • Designed for information retrieval, summarization, or content generation
  • Responds to prompts but does not take action independently

Example: Asking a chatbot to summarize suspicious transaction activity or explain a fraud typology.

Key limitation: AI chat can inform decisions, but it cannot execute them.

What are AI Agents?

AI agents are systems that can autonomously take actions to achieve a defined goal.

  • Operate using rules, models, and decision frameworks
  • Can analyze data, make decisions, and trigger workflows
  • Designed to complete tasks, not just provide answers

Example: An AI agent that detects anomalous behavior, adjusts detection thresholds, and flags accounts for review.

Key advantage: AI agents move from insight to action, reducing manual work.

What are Conversational AI Agents?

Conversational AI agents combine the natural language interface of AI chat with the execution capabilities of AI agents.

  • Users interact through plain-language prompts
  • The system can interpret intent and carry out complex tasks

Example:
“Build a rule to detect high-risk ACH credits, test it on last month’s data, and deploy if performance improves.”

The conversational AI agent will:

  • Create the rule
  • Run backtesting
  • Evaluate performance
  • Deploy or recommend changes

Key distinction: Conversational AI agents enable real-time collaboration between humans and AI, where the AI doesn’t just respond—it executes within the workflow.
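To make the create, test, evaluate, deploy loop above concrete, here is a minimal sketch in Python. All names and the deployment criterion are illustrative assumptions, not actual DataVisor APIs; the point is only the shape of the workflow.

```python
# Hypothetical sketch of the create -> backtest -> evaluate -> deploy loop.
# Function names, fields, and the precision criterion are illustrative only.

def propose_and_deploy(rule, transactions, baseline_precision):
    """Backtest a candidate rule and deploy only if it beats the baseline."""
    hits = [t for t in transactions if rule(t)]
    true_hits = [t for t in hits if t["label"] == "fraud"]
    precision = len(true_hits) / len(hits) if hits else 0.0
    if precision > baseline_precision:
        return {"action": "deploy", "precision": precision}
    return {"action": "recommend_changes", "precision": precision}

# Candidate rule: flag high-risk ACH credits over $10,000
rule = lambda t: t["type"] == "ach_credit" and t["amount"] > 10_000

# Last month's (made-up) labeled data
last_month = [
    {"type": "ach_credit", "amount": 15_000, "label": "fraud"},
    {"type": "ach_credit", "amount": 12_000, "label": "legit"},
    {"type": "card", "amount": 50, "label": "legit"},
]
result = propose_and_deploy(rule, last_month, baseline_precision=0.4)
```

In a real deployment the "deploy" branch would stage the rule for human review rather than push it live, consistent with the human-in-the-loop controls discussed later in this piece.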

Type | Capabilities | Key Limitation / Advantage
AI Chat | Generates responses to prompts; retrieves, summarizes, and explains information | Informs decisions but cannot execute them
AI Agents | Analyzes data, makes decisions, and triggers workflows autonomously | Moves from insight to action, reducing manual work
Conversational AI Agents | Interprets plain-language intent and carries out complex tasks end to end | Enables real-time human-AI collaboration within the workflow

Are AI Agents Pre-Built or Custom?

Not all AI agents are created the same. In practice, they fall into two categories:

1. Build-Your-Own Agents
These require teams to design workflows, define logic, and configure how the agent operates.

  • High flexibility
  • Requires time, technical expertise, and ongoing maintenance
  • Often slows down time to value

2. Pre-Built, Domain-Trained Agents
These come ready with embedded logic, workflows, and training for specific use cases.

  • Faster deployment and adoption
  • Designed around real-world fraud and AML workflows
  • Continuously improved based on new patterns and threats

The key difference: whether your team is building the system or using it to solve problems immediately.

Why This Matters for Fraud and AML

For fraud and AML teams, the difference between AI chat, AI agents, and conversational AI agents directly impacts how work gets done every day.

An AI chat tool, including most tools currently marketed as "AI assistants" for fraud and AML, receives a prompt and returns a response. It might summarize a case, explain a rule, draft a narrative, or surface a data point. The output is text or code. What happens next depends entirely on the analyst reading it.

An AI agent does something different. It receives an instruction, determines what steps are required to complete it, executes those steps across your systems, and returns a result. When you ask an agent to build a rule for new users transacting more than $10,000 in ACH volume within a day, it doesn't describe what that rule should look like. It searches your existing feature library for reusable signals, identifies what's missing, builds the missing feature, constructs the rule logic, and presents both for your review before anything touches production.
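The rule described above, flagging new users who move more than $10,000 in ACH volume within a day, decomposes into a feature (daily ACH sum per user) plus rule logic on top of it. A hedged sketch, with field names and the 30-day "new user" cutoff as assumptions:

```python
# Illustrative decomposition of the ACH rule described above.
# Field names ("channel", "user_id") and the new-user cutoff are assumptions.
from collections import defaultdict

def daily_ach_volume(transactions):
    """The 'missing feature': aggregate ACH volume per (user, day)."""
    volume = defaultdict(float)
    for t in transactions:
        if t["channel"] == "ach":
            volume[(t["user_id"], t["date"])] += t["amount"]
    return volume

def high_ach_new_user_rule(user, volume, date, account_age_days, threshold=10_000):
    """Fire when a new account moves more than the threshold in one day."""
    return account_age_days <= 30 and volume[(user, date)] > threshold

# Made-up sample data: u1 crosses the threshold across two transfers
txns = [
    {"user_id": "u1", "date": "2024-05-01", "channel": "ach", "amount": 6_000},
    {"user_id": "u1", "date": "2024-05-01", "channel": "ach", "amount": 5_500},
    {"user_id": "u2", "date": "2024-05-01", "channel": "ach", "amount": 2_000},
]
vol = daily_ach_volume(txns)
flagged = high_ach_new_user_rule("u1", vol, "2024-05-01", account_age_days=3)
```

Splitting feature from rule is what makes the agent's "search the feature library first" step possible: the aggregation can be reused by many rules.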

So here is the distinction that matters:

AI chat answers questions. AI agents take action. Conversational AI agents do both.

Most financial crime operations are still constrained by manual workflows, fragmented tools, and rules that require constant tuning. Analysts spend hours reviewing alerts, writing queries, testing thresholds, and coordinating across systems just to move a single case forward. Traditional AI chat can help summarize information or surface insights, but it stops short of changing that workflow. It informs the analyst, but the analyst still has to do the work.

AI agents begin to shift that dynamic by automating specific tasks, such as detecting anomalies or triggering alerts. But when those capabilities are disconnected from how teams actually operate, they often introduce another layer of tooling rather than reducing effort.

Conversational AI agents change the model more fundamentally. By combining natural language interaction with the ability to execute tasks, they allow teams to move from intent to outcome in a single step. Instead of navigating multiple systems or manually configuring logic, analysts can describe what they need and have the system carry it through from analysis to testing to deployment within the same workflow.

Why Most Deployments Get It Wrong

Understanding what agentic AI is doesn't automatically translate into deploying it well. The financial crime industry is learning this the hard way.

Most early deployments fail not because the underlying technology is weak, but because the implementation ignores the specific demands of regulated, high-stakes environments. Three problems show up consistently.

The first is giving agents too much latitude. The instinct is understandable — the more context an agent has, the smarter it should be. In practice, broad, open-ended prompts in fincrime environments increase the likelihood that a model fills evidential gaps with plausible-sounding assumptions. An agent asked to "review this customer's entire profile and determine if it's suspicious" is being handed an interpretive task with no clear boundaries. In fraud and AML, that produces confident-sounding narratives that may have no reliable evidentiary basis — and compliance teams end up spending more time validating AI output than they saved on the underlying work.

The second is building for autonomy instead of for trust. There is a version of agentic AI that prioritizes minimal human involvement as a feature — faster throughput, less friction, more automation. In most industries, that's a reasonable design goal. In financial crime, it's a governance problem. Every consequential action — a rule deployed, an investigation closed, a SAR filed — needs a human accountable for it. Systems that route around that accountability don't reduce operational risk. They redistribute it somewhere harder to see.

The third is treating explainability as an output rather than a process. Many platforms generate explanations after a decision has been made — a summary appended to an alert, a rationale attached to a recommendation. That's not the same as a system where the reasoning is visible at each step, where the analyst can see what signals the agent used, what it didn't, and why it reached the conclusion it did. The difference matters when a regulator asks how a case was worked, or when a pattern of decisions needs to be reviewed for consistency.

None of these failure modes require bad technology to occur. They require good technology deployed without sufficient consideration of what financial crime teams actually need from it — which is not just speed or scale, but systems they can stand behind when the decisions get scrutinized.

The Trust Question: Why Most Teams Are Right to Be Skeptical

Skepticism about AI in financial crime is not irrational. It's appropriate.

These are high-stakes environments. A rule that fires incorrectly affects real customers. An investigation that misses a key signal creates regulatory exposure. A SAR narrative that is inaccurate or inconsistent undermines the quality of the filing and the credibility of the program.

Any AI that operates in this context without answering the trust question clearly is a tool teams should think carefully about before adopting.

There are three dimensions where trust either holds or breaks down in agentic AI for financial crime:

1. Explainability — Does the agent show its work?

An agent that takes action without explaining its reasoning is a black box — and black boxes are not appropriate in environments where every decision needs to be defensible.

Trustworthy AI agents don't just act. They explain. When a rule is generated, the logic is surfaced for review — thresholds, conditions, triggering criteria — before anything is deployed. When an alert is summarized, the agent explains why it triggered in plain language, not just that it did. When an investigation checklist is recommended, the reasoning behind each step is visible.

Explainability isn't a feature. In regulated industries, it's a baseline requirement.

2. Human Control — Does the analyst stay in the loop?

Automation without oversight is not appropriate in fraud or AML operations. The analyst is accountable for what happens in their queue. The technology should support that accountability, not route around it.

Trustworthy AI agents build approval into every action. A rule is generated, then reviewed, then deployed — not generated and deployed. An investigation checklist recommends an outcome, but the analyst accepts, modifies, or overrides it. A SAR narrative is drafted, then refined and approved — not auto-submitted.

Human-in-the-loop is not a limitation of agentic AI. It's what makes it safe to use in production environments where the stakes are real.

3. Auditability — Is every action recorded?

Compliance programs live and die on their ability to reconstruct decisions. Who worked the case. What signals were reviewed. What decision was made and why. When it was made and by whom.

Trustworthy AI agents don't just support this — they automate it. Every agent action, every analyst approval, every override and associated note is logged automatically, without the analyst having to document anything separately. The audit trail is complete by default, not constructed after the fact.

For AML programs in particular, this is not a nice-to-have. Regulators expect to see it.
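The "complete by default" audit trail described above amounts to appending a structured entry for every agent action and analyst decision. A minimal sketch, where the entry schema is an illustrative assumption rather than DataVisor's actual format:

```python
# Minimal sketch of an automatic audit trail: every agent action and analyst
# decision is appended with a timestamp and actor, with no separate
# documentation step. The entry schema here is illustrative only.
from datetime import datetime, timezone

audit_log = []

def record(actor, action, detail, note=None):
    """Append an audit entry capturing who did what, to what, and when."""
    audit_log.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "detail": detail,
        "note": note,
    })

record("agent", "rule_generated", "ach_velocity_v2")
record("analyst:jlee", "rule_approved", "ach_velocity_v2")
record("analyst:jlee", "alert_override", "case-1042",
       note="confirmed legitimate payroll")
```

Because every mutation flows through one function, reconstructing who worked a case and why becomes a query over the log rather than an after-the-fact exercise.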

Where AI Agents Have the Most Impact

Across fraud and AML, agentic AI has meaningful impact at five points in the workflow:

Rule and feature creation. Turning a pattern observation into a live production control — without engineering involvement, without deployment cycles. Analysts describe what they see; agents build what's needed.

Testing and optimization. Simulating rule performance against real data before deployment, with agent-recommended threshold adjustments and projected impact shown before any change is applied.

Alert triage. Generating structured context automatically when an alert is opened — triggering logic, key risk indicators, relevant transaction history — so investigators arrive at the case already oriented.

Investigation guidance. Interactive, adaptive checklists that guide analysts step by step through each case, recommend an outcome with reasoning, and log every action automatically for a complete audit record.

SAR narrative generation (AML-specific). Producing regulator-ready SAR narratives drawn directly from case data — transaction details, timelines, key findings — so investigators review and approve rather than write from scratch.

These aren't isolated features. Together, they form a continuous workflow where detection, strategy, investigation, and reporting happen in one place — without handoffs, without delays, and without sacrificing the control that regulated environments require.

Conversational Agents: Where the Interface Meets Execution

The most effective implementation of AI agents in financial crime isn't a separate tool you switch to. It's a conversational layer embedded directly in the platform where work already happens.

Conversational agents, like DataVisor's Vera, combine a natural language interface with the ability to execute across the full workflow. An analyst types what they need, in plain language, and the agent handles the translation into production-ready action.

This matters for a few reasons.

First, it removes the technical barrier between an idea and a live control. An analyst who spots an emerging pattern shouldn't have to file a ticket and wait for engineering. They should be able to describe what they see — unusual velocity across P2P transfers, structuring behavior across branches, a new device fingerprint appearing across flagged accounts — and have a production-ready rule built and ready for review within the same conversation.

Second, it keeps the analyst in the workflow rather than managing around it. The best AI implementations don't create parallel processes — they accelerate the one that already exists. A conversational agent that lives inside the investigation interface means context doesn't get lost, handoffs don't introduce delay, and the gap between insight and action closes to near zero.

Third, it makes the technology accessible across experience levels. A senior analyst and a newer investigator can both interact with the system in the same way — the agent adapts to what's being asked, not who's asking.

Conversational agents are not a simpler version of agentic AI. They're the interface through which agentic AI becomes practical at scale.

The DataVisor Distinction: Moving Beyond "Agent Squads" to Unified Execution

In the current market, "Agentic AI" is often deployed as a collection of fragmented tools—one agent to summarize a case, another to suggest a rule, and a third to draft a narrative. While this looks like progress, it often creates a new "integration tax" where analysts have to manage the hand-offs between different AI bots.

DataVisor’s approach with Vera is fundamentally different. It doesn't just provide a squad of assistants; it provides a singular, unified intelligence that handles the entire lifecycle of a threat.

1. Detection: The Unsupervised ML (UML) Advantage

The most effective defense begins before a loss is ever reported. DataVisor's agents are powered by a patented detection engine that identifies new patterns without labels.

  • Detecting "Unknown-Unknowns": By leveraging Unsupervised Machine Learning, the agent can surface coordinated fraud attacks that have no prior labels or reported losses.
  • Competitive Edge: This differentiates DataVisor from competitors who only automate known logic, allowing for the detection of emerging and synthetic threats.

This foundation allows the agent to surface coordinated fraud attacks and synthetic identities that have no prior history or reported losses. By leveraging UML, Vera identifies suspicious clusters in real time, allowing teams to move from reactive defense to proactive neutralization of emerging and evolving threats.

2. Technical Hygiene: Preventing Redundancy and Debt

As a team moves into the strategy phase, maintaining the health of the detection library is critical. A common pitfall in large organizations is "signal sprawl," where different analysts unknowingly create slightly different versions of the same rule.

  • The "Search-First" Advantage: Before building from scratch, Vera searches existing libraries to see if a similar feature or rule already exists.
  • Prevention of Signal Duplication: By searching first, the agent prevents different analysts from creating redundant versions of the same signal, keeping the system clean.

Vera addresses this through a "Search-First" protocol. Before building a new control from scratch, Vera automatically searches existing feature and rule libraries to identify if similar logic already exists. This prevents redundancy, ensures the system remains clean and auditable, and stops the accumulation of technical debt that typically slows down legacy platforms.
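A search-first protocol can be sketched as matching a candidate feature's specification against the existing library before anything new is registered. The spec-equality matching below is an assumption about how such deduplication might work, not a description of Vera's internals:

```python
# Hedged sketch of a "search-first" check: reuse an existing feature whose
# definition matches before creating a new one. Spec-based matching is an
# illustrative assumption.

feature_library = {
    "ach_sum_24h": {"agg": "sum", "channel": "ach", "window_hours": 24},
    "card_count_1h": {"agg": "count", "channel": "card", "window_hours": 1},
}

def find_or_create(name, spec):
    """Return a matching existing feature, or register the new one."""
    for existing_name, existing_spec in feature_library.items():
        if existing_spec == spec:
            return existing_name, False  # reused, nothing created
    feature_library[name] = spec
    return name, True  # genuinely new, registered

# An analyst asks for a daily ACH sum that already exists under another name
name, created = find_or_create(
    "daily_ach_total", {"agg": "sum", "channel": "ach", "window_hours": 24}
)
```

The payoff is exactly the signal-sprawl prevention described above: two analysts asking for the same aggregation in different words converge on one feature.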

3. Strategy Agility: Automated Feature & Rule Creation

The gap between identifying a new threat and deploying a live control is often measured in days or weeks due to engineering bottlenecks. Vera closes this gap to minutes by allowing analysts to move from intent to execution through plain-language prompts.

  • Instant Feature Building: When a gap is identified, Vera handles the "dirty work" of building missing velocity features—like ACH sum accumulations—directly within the conversation.
  • Rapid Execution: The agent moves from creation to backtesting and technical deployment in minutes, replacing hours of manual engineering work.

During a live encounter, an analyst can describe a new pattern—such as a sudden spike in ACH volume for a new user—and Vera handles the "dirty work" of building the necessary features and logic. The agent automates the creation of complex velocity features, such as transaction sum accumulations, directly within the conversational interface, effectively acting as an on-demand engineering resource.
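A velocity feature like a transaction sum accumulation is typically a sliding-window aggregate. One common implementation pattern, shown here as a sketch and not a statement about DataVisor internals, keeps recent events in a deque and evicts anything older than the window:

```python
# One possible implementation of a velocity feature: a rolling sum of ACH
# amounts over a 24-hour window per user. The deque-based eviction pattern
# is illustrative, not DataVisor's actual engine.
from collections import deque

class RollingSum:
    """Sliding-window sum: add (timestamp, amount) events, evict stale ones."""
    def __init__(self, window_seconds=86_400):
        self.window = window_seconds
        self.events = deque()
        self.total = 0.0

    def add(self, ts, amount):
        self.events.append((ts, amount))
        self.total += amount
        # Drop events that have fallen out of the window
        while self.events and self.events[0][0] <= ts - self.window:
            _, old = self.events.popleft()
            self.total -= old
        return self.total

acc = RollingSum()
acc.add(0, 4_000)
same_day_total = acc.add(3_600, 7_000)       # one hour later: 11,000
total_next_day = acc.add(100_000, 500)       # first two events have expired
```

The same structure generalizes to counts, distinct-device counts, or per-merchant sums by swapping the aggregation.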

4. Strategy Optimization: Interactive Rule Tuning

Effective strategy isn't just about building new rules; it’s about optimizing existing ones to reduce friction for legitimate customers. Vera doesn't just suggest generic changes; it provides a collaborative environment for precision-focused optimization.

  • Natural Language Explanations: Vera provides reasoning for why a specific parameter is suggested, ensuring every decision is defensible and understood.
  • Collaborative "Steering": If you dislike the AI's direction, you can provide manual input to steer the optimization, ensuring you remain in control of the strategy.
  • Validated Performance: Vera tests parameters against real data to provide precise metrics—such as precision, recall, and F1 scores—before you commit.

Vera provides clear natural language explanations for why specific parameters are suggested, ensuring every decision is defensible and grounded in data. This is a two-way street: if an analyst dislikes the AI's direction, they can provide manual input to steer the agent. Before any change is committed, Vera tests the parameters against real historical data to provide precise metrics—including precision, recall, and F1 scores—so the impact is known before deployment.
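The backtest metrics named above come from comparing rule hits against known outcomes in historical data. A worked sketch with made-up labels, to make the arithmetic concrete:

```python
# Worked sketch of the backtest metrics mentioned above (precision, recall,
# F1), computed from a labeled historical sample. All data is made up.

def backtest_metrics(predictions, labels):
    """Compare rule hits (predictions) against known fraud labels."""
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum((not p) and l for p, l in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Rule fired on 4 of 6 historical transactions; 3 of the hits were real fraud
hits  = [True, True, True, True, False, False]
fraud = [True, True, True, False, True, False]
precision, recall, f1 = backtest_metrics(hits, fraud)  # 0.75, 0.75, 0.75
```

Seeing these numbers before a threshold change is committed is what turns tuning from guesswork into a measurable trade-off between missed fraud and customer friction.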

5. Investigation Efficiency: Closing the Feedback Loop

A siloed operation is a slow operation. Vera bridges the gap between the front-line investigation team and the strategy team by creating a continuous feedback loop.

  • Manual Review Integration: Vera can ingest manual review comments from investigators to help refine and tune rules in real-time.
  • Guided Case Summaries: AI-generated case summaries and interactive checklists help reviewers orient themselves and reach decisions significantly faster.

The agent can ingest manual review comments directly from investigators to help refine and tune rules in real-time. This ensures that "on-the-ground" insights are immediately reflected in the broader detection strategy. To further accelerate case work, Vera provides interactive checklists and AI-generated case summaries, helping reviewers orient themselves and reach accurate decisions significantly faster than manual review.

6. Compliance Mastery: Interactive SAR Narrative Generation

The final, most labor-intensive step of the workflow is regulatory reporting. Vera automates this process by transforming case data into regulator-ready SAR narratives.

  • Aggregated Case Intelligence: The agent generates comprehensive SAR narratives by aggregating data across multiple linked cases and suspect profiles.
  • Human-in-the-Loop Refinement: While the agent drafts the regulator-ready narrative, the investigator remains in control, "chatting" with the draft to refine or edit it before submission.

By aggregating intelligence across multiple linked cases and suspect profiles, Vera produces a comprehensive narrative that maintains consistency across the entire investigation. Crucially, the investigator remains the "human in the loop," chatting with the draft to refine, edit, or approve the final document, ensuring the bank stands firmly behind every filing.

7. Deployment: Out-of-the-Box Implementation

The primary barrier to adopting agentic AI is often the perceived technical hurdle of a new integration. Vera removes this objection by design.

  • No Additional Integration: These agentic capabilities are "out-of-the-box" and require no additional technical integrations beyond the standard DataVisor setup.

These capabilities are out-of-the-box and require no additional technical integrations beyond the standard DataVisor setup. The agent works directly with existing real-time data orchestrations and third-party signals, allowing teams to unlock the power of agentic AI without a lengthy implementation cycle.

Learn more about Agentic AI at DataVisor

If you're evaluating where AI agents fit in your fraud or AML program, we've put together two detailed guides — one for each team — that walk through each capability in detail, with concrete examples of how agents work in practice.

And if you want to see the full platform — including Vera, the conversational AI agent built specifically for financial crime — the product page is the right place to start.

DataVisor AI Agents for Fraud & AML

About Fang Yu

Fang spent 8 years at Microsoft Research developing big-data algorithms and systems for identifying various malicious traffic such as worms, spam, bot queries, hijacked accounts, and fraudulent financial transactions across a wide range of Microsoft products.

