August 30, 2023 - Jeremy Chen

5 Things Every Fraud Leader Should Know About Machine Learning

The battle against fraud is more critical today than ever. Thanks to advancements in generative AI, fraudsters’ methods are rapidly evolving and becoming more dangerous. Machine Learning (ML) has emerged as the most powerful tool for fraud teams to fight back, offering fraud leaders unprecedented insights and capabilities.

But harnessing the full potential of ML requires more than a passing familiarity with the technology. Whether you’re a seasoned fraud prevention pro or a newly minted fraud team leader, the five aspects of ML I’ll discuss in this post are required knowledge if you want to succeed.

Here are five things that as a fraud leader you absolutely need to know about ML.

1. How to interpret what the model is saying

Before you can do anything with ML tools, you and your team need to know how to properly interpret what your model is telling you. Explainability is a key part of the machine learning process in any field, especially fraud prevention.

Interpretation depends mainly on the type of model you’re working with: supervised or unsupervised. In a fraud prevention setting, both generate fraud scores that are normalized to help fraud leaders establish downstream rules. For typical risk scoring, this means a scale from 0 to 1, where scores near 0 point to non-fraudulent events and scores near 1 point to fraudulent events. Where things begin to differ is in how each model type makes scoring decisions.

For supervised machine learning models, let’s use XGBoost as an example. A higher score suggests that a transaction closely resembles known fraudulent behaviors from the past. That’s because supervised models learn from labeled examples of known fraud attacks.

For unsupervised machine learning, we’ll use DataVisor’s model as an example. In this type of model, transactions receive higher scores when the model correlates them with a cluster of other suspicious activities or entities.

So if a transaction surpasses your fraud score threshold (say, 0.6) in one model, you need to combine that score with the scores from all the models you’re using to construct a comprehensive picture of the customer’s behavior.
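To make that concrete, here is a minimal sketch of looking at two model scores side by side against the 0.6 threshold. The `ModelScores` class, the `combined_view` helper, and the sample scores are all hypothetical; this is one simple way to summarize the scores, not how DataVisor or any particular platform combines them.

```python
from dataclasses import dataclass

FRAUD_THRESHOLD = 0.6  # example threshold from the text; tune to your own risk tolerance


@dataclass
class ModelScores:
    supervised: float    # resemblance to known, labeled fraud (0 to 1)
    unsupervised: float  # association with a suspicious cluster (0 to 1)


def combined_view(scores: ModelScores, threshold: float = FRAUD_THRESHOLD) -> dict:
    """Summarize which models flag the event (hypothetical helper)."""
    flagged_by = [
        name
        for name, value in (("supervised", scores.supervised),
                            ("unsupervised", scores.unsupervised))
        if value > threshold
    ]
    return {
        "max_score": max(scores.supervised, scores.unsupervised),
        "flagged_by": flagged_by,
        "needs_attention": bool(flagged_by),
    }


# Made-up example: no match to past fraud, but tied to a suspicious cluster.
view = combined_view(ModelScores(supervised=0.35, unsupervised=0.82))
print(view)
```

In this made-up case only the unsupervised score crosses the threshold, which would suggest a pattern that doesn't match past labeled fraud but is linked to other suspicious activity.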

Making decisions based on machine learning requires trust. Interpretability and explainability are both hugely important in building that trust. That leads us to the next crucial aspect of ML to know…

2. Black box vs. white box ML models

All ML models vary in complexity, capability, and accuracy. Whether we can see what’s going on in the algorithm or not determines the levels of each, at least in part.

White box AI models are transparent about how they come to their conclusions. The biggest benefit is that this makes their results more explainable. That’s a plus for both stakeholders and those affected by the model’s predictions or decisions. In the case of fraud prevention, knowing why a model flags certain users’ transactions for fraud or denies a specific person a credit card is vitally important. That’s not only because consumers want to know, but also because these determinations could rest on logic that violates anti-discrimination laws.

Black box models, in contrast, produce conclusions based on a data set fed into them by the user, but the user can’t see how the model reached those conclusions. While these models tend to be significantly more accurate, using them becomes more complex because stakeholders can’t explain their decisions as clearly.

The biggest issue with black box models, as you might guess, is lack of trust. When an opaque model is in charge of crucial decisions that affect people’s money, healthcare, and overall well-being, explainability is critical.

As DataVisor CEO Yinglian Xie points out, an unsupervised machine learning (UML) model can actually have full explainability. Thanks to tools like linkage analysis, clustering, and transparent decisioning, white box models can reach accuracy and detection levels we previously assumed only black boxes could reach.

3. How to assess and improve your ML model and fix breaks

Fraudsters constantly evolve and improve on their already sophisticated tactics. To keep pace, you need to constantly monitor the performance of your ML model and fix any breaks you find.

There are a few different ways to do this. To start, you should always be evaluating both how your model performs by itself and how your combined fraud strategy works as a whole. All financial institutions have a different risk allowance, so compare your model against your own unique threat vectors.

To assess your model’s performance, you’ll want to key in on several important metrics. These include:

  • Fraud detection rate to ensure high coverage
  • False positive rate to reduce customer friction
  • Detection speed to spot threats early and protect real-time services
  • Response time, meaning your model’s ability to generate responses within an acceptable SLA (e.g., within 100 ms for instant transactions)
  • KS score to check that your model can reliably differentiate good from bad
  • Population shifts: your model may not change, but user behavior does, and you must adapt to the new fraud threats this brings (new locations, new schemes, etc.)
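Three of these metrics are easy to compute once you have scores and confirmed outcomes. The sketch below is a plain-Python illustration, assuming labels of 1 for confirmed fraud and 0 for legitimate events; the function names, the 0.6 threshold, and the sample data are made up for the example.

```python
def detection_and_fp_rates(scores, labels, threshold):
    """Return (fraud detection rate, false positive rate) at a score threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


def ks_score(scores, labels):
    """Kolmogorov-Smirnov statistic: the largest gap between the cumulative
    score distributions of fraud (1) and non-fraud (0) events."""
    fraud = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    if not fraud or not good:
        return 0.0
    best = 0.0
    for cut in sorted(set(scores)):
        fraud_cdf = sum(1 for s in fraud if s <= cut) / len(fraud)
        good_cdf = sum(1 for s in good if s <= cut) / len(good)
        best = max(best, abs(good_cdf - fraud_cdf))
    return best


# Made-up scores and labels (1 = confirmed fraud, 0 = legitimate).
scores = [0.05, 0.10, 0.20, 0.30, 0.65, 0.40, 0.70, 0.90]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

dr, fpr = detection_and_fp_rates(scores, labels, threshold=0.6)
ks = ks_score(scores, labels)
print(f"detection rate={dr:.2f}, false positive rate={fpr:.2f}, KS={ks:.2f}")
```

A KS score near 1 means the model separates good from bad almost perfectly, while a score near 0 means the two score distributions overlap heavily.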

In terms of fixing breaks, you’ll need to set up alerts so you can address them immediately. Look at your thresholds and ensure these breaks aren’t pushing you over them. When you get alerts, you ideally should have an automated way to retrain your models.
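One common way to turn a population shift into an alert is the Population Stability Index (PSI), which compares the current score distribution with a baseline. The sketch below assumes scores between 0 and 1 and non-empty samples. The 0.2 alert level is an assumed rule of thumb, and the sample data is invented; pick thresholds that fit your own risk tolerance.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two samples of scores in [0, 1]."""
    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            idx = min(int(v * bins), bins - 1)  # clamp 1.0 into the last bin
            counts[idx] += 1
        total = len(values)
        return [max(c / total, eps) for c in counts]  # eps avoids log(0)

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


PSI_ALERT_THRESHOLD = 0.2  # assumed rule of thumb, not a universal standard

# Invented samples: scores at training time vs. scores seen this week.
baseline = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
this_week = [0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90]

drift = psi(baseline, this_week)
needs_retraining = drift > PSI_ALERT_THRESHOLD
print(f"PSI={drift:.2f}, trigger retraining={needs_retraining}")
```

In practice the `needs_retraining` flag would feed whatever alerting or retraining pipeline you already run, rather than a print statement.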

In DataVisor’s case, our UML doesn’t require frequent tuning or feedback labels to develop the model. This gives it the longevity to adapt to new attributes and fraud types, and helps improve detection performance.

4. How to make decisions based on ML data

Next to interpretability and explainability, decision-making is among the most important things to focus on with any ML model. For fraud prevention especially, decision-making affects your customers’ experience and your organization’s bottom line.

Decisions made using an ML model are either automated or manual. Ideally, you want to automate as many decisions as possible. That means having an accurate ML model that uses your business rules and other decisioning technologies to create a streamlined decision workflow.

To create the most efficient decisioning process, here are a few things you’ll need:

  1. Different decision strategies for different event types. FedNow and Zelle, ACH transfers, card payments, and more all see different fraud attacks.
  2. Additional enrichment data from third parties to make sound, well-rounded decisions. For example, you can stop ATO attacks with third-party information on passwords and emails that were part of a data breach.
  3. A well-governed decision platform to make it easy for business analysts to create, validate, tune, and deploy business rules. Using a no-code platform to support this adds speed and ease of maintenance.
  4. Different decision actions for rule writers. They should be able to approve, reject, and make suggestions upon review. Make this a centralized place to create business rules and use risk scores in decisions.
  5. Automated rule tuning that allows analysts to adopt or reject changes. DataVisor is the only platform that supports this feature.
  6. Focused information for case investigators, without distracting data. Your model needs to show real red flags and give clear reasons for them. It should also give investigators the ability to visualize linkages between entities to spot fraud rings. When reviewers make manual decisions, those decisions need to be fed back to your ML model right away so it can retrain and improve.
  7. Generative AI to help streamline and automate manual decisioning. On the manual side, generative AI assists fraud analysts. On the automated side, it speeds up the process and takes over extra work.
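To show how a few of these pieces fit together, here is a minimal sketch of a decision workflow that uses a different strategy per event type, a risk score, and one business rule built on third-party breach data. Every name and threshold here is hypothetical; a real decision platform would manage these as governed, versioned rules rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    review_at: float  # scores at or above this go to manual review
    reject_at: float  # scores at or above this are rejected automatically


# Hypothetical thresholds per event type; real values depend on your risk tolerance.
STRATEGIES = {
    "instant_payment": Strategy(review_at=0.5, reject_at=0.8),  # e.g. FedNow, Zelle
    "ach_transfer": Strategy(review_at=0.6, reject_at=0.9),
    "card_payment": Strategy(review_at=0.7, reject_at=0.95),
}


def decide(event_type: str, score: float, credentials_breached: bool = False) -> str:
    """Return 'approve', 'review', or 'reject' for a scored event."""
    strategy = STRATEGIES[event_type]
    if score >= strategy.reject_at:
        return "reject"
    # Business rule using third-party enrichment: breached credentials
    # always get at least a manual review, whatever the score.
    if credentials_breached or score >= strategy.review_at:
        return "review"
    return "approve"


print(decide("instant_payment", 0.55))                          # stricter strategy
print(decide("card_payment", 0.55))                             # looser strategy
print(decide("card_payment", 0.55, credentials_breached=True))  # rule override
```

The same score leads to different actions depending on the event type and the enrichment data, which is the point of keeping strategies separate.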

5. The impact of real-time ML on fraud detection

Customers today expect real-time payments, transactions, and applications. These are no longer perks to win customers; they’re requirements to retain them. With services like FedNow arriving to provide these nationally, if you don’t offer real-time services, you won’t be around long.

Of course, if you’re providing real-time payment capabilities, then you also need real-time monitoring. The challenge financial institutions face most often here is balancing instant services and a good customer experience with strong fraud prevention. That means you need a system that lets you detect fraud and make decisions in real time.

To truly enable this for your fraud team, you’ll need three things. First is data integration—meaning you need to combine events from different sources to make decisions in real time.

Next, your system must be capable of scoring transactions through your ML model instantly. Legacy systems struggle with this speed of scoring and a large number of real-time transactions. Fixing them and adding real-time monitoring is also often costly and laborious, so you may need to explore a real-time-ready solution.
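A simple way to check whether scoring is fast enough is to time each call against a latency budget. The sketch below uses the 100 ms figure mentioned earlier as an example budget; `score_with_timing` and `toy_model` are hypothetical stand-ins, and a production system would measure latency across the full request path, not just the model call.

```python
import time

SLA_MS = 100.0  # example budget; adjust to your own service-level agreement


def score_with_timing(score_fn, event, sla_ms=SLA_MS):
    """Run a scoring function and report whether it met the latency budget."""
    start = time.perf_counter()
    score = score_fn(event)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {"score": score, "elapsed_ms": elapsed_ms, "within_sla": elapsed_ms <= sla_ms}


def toy_model(event):
    # Stand-in for a real model call: larger amounts score higher, capped at 1.0.
    return min(event["amount"] / 10_000, 1.0)


result = score_with_timing(toy_model, {"amount": 2_500})
print(result)
```

Tracking how often `within_sla` comes back false, per event type, gives you an early signal that a system is struggling with real-time volume.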

Lastly, you need to be able to feed information back into your system in real time so you always know the performance of your fraud strategies. This instant feedback allows you to create new strategies, improve old ones, and adapt fast enough to stay ahead of fraudsters. Legacy systems also struggle with adapting to this speed, which makes it all the more important to audit your model’s capabilities.

Do you trust your model to detect fraud accurately in real time for all your use cases? Our experts are happy to chat and assess where your ML could improve, plus tell you how it stacks up against the best ML models on the market.

about Jeremy Chen
Jeremy is Senior Director of Product Management at DataVisor.