Keith Furst is the founder of Data Derivatives and has years of experience across a variety of financial institutions, including Tier One wholesale banks, investment banks, foreign bank branches, commercial banks, retail banks, broker-dealers, prepaid card providers and merchant acquirers, with a focus on implementing, fine-tuning and validating financial crime systems. His forte is transaction monitoring, customer due diligence, fraud and market abuse systems, and his work has included custom data analytics that identified suspicious activity outside of the traditional surveillance models.
According to a report issued by PricewaterhouseCoopers (PwC), 90 to 95 percent of all alerts generated by transaction monitoring systems (TMS) are false positives. Not only does this translate into operational overhead, it may also mean that real alerts are missed because they are hidden under a mountain of false positives. This is not news to those in the anti-money laundering (AML) space. I often hear the same complaint from my colleagues who implement TMS at other financial institutions. It’s not uncommon to hear:
“Our TMS was generating a few hundred alerts every month, but after we went through the upgrade, it’s generating thousands!”
The problem with false positive alerts is that they create huge operational overhead that translates into zero substantive suspicious activity report (SAR) filings. At a certain point, each additional alert yields diminishing returns: a bank can only investigate so many alerts and still conduct effective investigations.
The weaknesses in existing TMS that generate massive amounts of false positive alerts also make it easy for criminals to get around those systems. (A topic we will explore in a future post.)
Why are there so many false positive alerts?
There is a fundamental technical barrier in traditional TMS that leads to a flood of false positive alerts. These systems rely on rules or simple models which have a myopic view of global trade, human behavior, the complexity of transactional networks and the hidden links between nefarious actors. They take a very simplistic view of the activity being monitored, distilling it down into only a few dimensions for the rule to interrogate.
Here are the two fundamental problems of existing TMS:
- Coarse-grained rules that result in detecting many scenarios, most of which are not actually suspicious
- Using only a subset of event types and data available, which limits the number of signals they can use for detection
For example, by looking at all the information available in the following diagram, it’s clear that in these ten transactions, only one is suspicious:
However, existing TMS only look at a subset of the data available to them. One rule in the TMS may flag every transaction within a specific timeframe whose amount falls between $9,500 and $9,999. To that rule, all ten transactions look the same, so all ten are flagged as suspicious. Nine of the ten alerts are false positives, which is a 90% false positive rate.
Is it possible for existing TMS to make their rules less coarse-grained? No, because TMS only look at a subset of event types, so if the rules are designed to be more specific, they will miss real suspicious activity. Casting a wide net means that they will be able to detect some suspicious accounts, but it will also result in alerts on a lot more good accounts. It’s not enough to simply tweak the existing rules or simple models. Rather, it’s necessary to look to a new technical solution to address the plague of false positive alerts.
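To make the arithmetic concrete, here is a minimal Python sketch of such an amount-band rule. The `Transaction` class, the `flag_by_amount` function and the ten sample transactions are hypothetical and exist only for illustration. The `is_suspicious` field is ground truth that a real TMS would not have.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    tx_id: int
    amount: float
    is_suspicious: bool  # ground truth, known only in this toy example


def flag_by_amount(transactions, low=9_500, high=9_999):
    """Coarse rule: flag every transaction whose amount is in [low, high]."""
    return [t for t in transactions if low <= t.amount <= high]


# Ten made-up transactions, all inside the band; only one is truly suspicious.
transactions = [
    Transaction(tx_id=i, amount=9_500 + 50 * i, is_suspicious=(i == 3))
    for i in range(10)
]

alerts = flag_by_amount(transactions)
false_positives = [t for t in alerts if not t.is_suspicious]
false_positive_rate = len(false_positives) / len(alerts)

print(len(alerts), len(false_positives), false_positive_rate)  # 10 9 0.9
```

Because the rule sees only the amount, it has no way to separate the one suspicious transaction from the other nine.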
The promise of unsupervised machine learning
Unsupervised machine learning (UML), if implemented properly, can solve these problems for AML teams. UML can be leveraged to reduce false positives by looking at all activity within your financial institution from a global view and linking common bad actors together. This drastically reduces false positive alerts without compromising on compliance with regulatory guidelines.
To see how this is possible, it’s important to understand the technology. UML is a category of machine learning that can detect hidden patterns in large data sets, such as fraudulent user accounts, without prior knowledge of what a fraudulent account looks like. This is different from supervised machine learning, which requires knowledge of previous patterns to catch similar ones in the future. In the context of AML, UML automatically finds these hidden patterns to link seemingly unrelated accounts and customers. These links can come from any of the thousands of data fields that the UML model ingests. The image below depicts customers detected by UML because they are linked by shared attributes such as an email address, physical address, phone number, internet protocol (IP) address and a common beneficiary.
So, in contrast to using coarse-grained rules, UML considers thousands of data fields to detect complex networks. This allows UML to look at a vast array of attributes and sift the real signal (suspicious activity) from the noise. Furthermore, UML can ingest all event data, which enables it to determine whether accounts show similar suspicious activity. For example, UML can link together accounts that have similarly high transaction volume with low dollar amounts in the same time window, without being programmed to look for this specific case.
UML also decreases the prevalence of false positive alerts because it catches groups of related accounts, which gives it more confidence that these accounts are bad. Think about it: if you saw one account do something weird, you might be unsure whether it’s bad, but if you saw fifty accounts linked together doing similar suspicious activity, you would become extremely confident that they’re all bad. UML is better at differentiating between good and bad activity, and when an alert is generated, you can be much more confident that it’s a real alert.
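The linking idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any particular UML product works. It links customers by exact matches on a handful of hypothetical fields using a union-find structure, which amounts to plain connected components rather than a machine learning model. The customer records, field names and `link_customers` function are invented for the example.

```python
from collections import defaultdict

# Hypothetical customer records; fields and values are illustrative only.
customers = {
    "C1": {"email": "a@example.com", "phone": "555-0101", "ip": "203.0.113.5"},
    "C2": {"email": "b@example.com", "phone": "555-0101", "ip": "203.0.113.9"},
    "C3": {"email": "c@example.com", "phone": "555-0199", "ip": "203.0.113.9"},
    "C4": {"email": "d@example.com", "phone": "555-0142", "ip": "198.51.100.7"},
}


def find(parent, x):
    """Return the representative of x's group (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def link_customers(customers):
    """Group customers that share any attribute value, directly or transitively."""
    parent = {cust_id: cust_id for cust_id in customers}
    first_seen = {}  # (field, value) -> first customer seen with that value
    for cust_id, attrs in customers.items():
        for field, value in attrs.items():
            key = (field, value)
            if key in first_seen:
                parent[find(parent, cust_id)] = find(parent, first_seen[key])
            else:
                first_seen[key] = cust_id
    groups = defaultdict(set)
    for cust_id in customers:
        groups[find(parent, cust_id)].add(cust_id)
    # Keep only groups that actually link two or more customers.
    return [sorted(g) for g in groups.values() if len(g) > 1]


linked = link_customers(customers)
print(linked)  # [['C1', 'C2', 'C3']]
```

C1 and C3 share no attribute directly, yet they end up in the same group because each is linked to C2. A rule that inspects one account at a time cannot see that kind of indirect connection.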
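As a rough illustration of grouping accounts by similar behavior, the sketch below builds two features per account and time window (transaction count and average amount, both on a log scale) and groups accounts whose feature vectors lie close together. It is a toy, density-style grouping with made-up data and hypothetical names (`build_features`, `similar_groups`, `radius`, `min_size`). A real system would use far more features, and deciding whether a group is actually suspicious remains a separate step.

```python
from collections import defaultdict
from math import dist, log10

# Hypothetical events as (account, hour_window, amount); all values are made up.
events = (
    [("A1", 14, 5.0)] * 40 + [("A2", 14, 6.0)] * 38 + [("A3", 14, 5.5)] * 42
    + [("B1", 14, 120.0)] * 2 + [("B2", 9, 80.0)] * 3 + [("B3", 14, 950.0)]
)


def build_features(events):
    """Per (account, window): (log10 of count, log10 of mean amount)."""
    totals = defaultdict(lambda: [0, 0.0])
    for account, window, amount in events:
        totals[(account, window)][0] += 1
        totals[(account, window)][1] += amount
    return {key: (log10(n), log10(total / n)) for key, (n, total) in totals.items()}


def similar_groups(feats, radius=0.2, min_size=3):
    """Chain together same-window accounts whose features are within `radius`."""
    unassigned = set(feats)
    groups = []
    for key in sorted(feats):
        if key not in unassigned:
            continue
        unassigned.remove(key)
        group, frontier = [key], [key]
        while frontier:
            cur = frontier.pop()
            near = [
                other for other in unassigned
                if other[1] == cur[1] and dist(feats[other], feats[cur]) <= radius
            ]
            unassigned.difference_update(near)
            group.extend(near)
            frontier.extend(near)
        if len(group) >= min_size:
            groups.append(sorted(account for account, _ in group))
    return groups


groups = similar_groups(build_features(events))
print(groups)  # [['A1', 'A2', 'A3']]
```

Nothing in the code names "high volume" or "low amount" as the target. The three accounts are grouped only because they behave alike in the same window.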
What this means for compliance departments
With rapidly growing compliance department costs and no decrease in regulatory fines in sight, it’s becoming increasingly clear that we need a new approach to TMS. UML can reduce compliance costs by lowering false positive alerts and reprioritizing time spent on investigations. At the same time, it can increase the quality of SAR filings. As the number of alerts to investigate decreases, existing compliance resources can be reallocated to other important activities such as quality control, analyst training and risk assessments.
From a practical perspective, the transition from a traditional TMS to one that uses UML does not have to happen overnight. Instead, running UML alongside an existing TMS can be a great place to start. This is an easier and more gradual path, which I’d recommend when implementing any new TMS, unsupervised or not.
Ultimately, financial institutions have embraced UML in other areas of banking such as fraud, credit risk and trading, so it is only a matter of time before compliance departments do the same. It’s now a question of when, and of which institutions will lead the pack out of traditional TMS to raise the stakes in the fight against money laundering.