While organizations invest heavily in designing better digital experiences to attract and retain customers, the simplicity and ease of use of online channels have also opened up new opportunities for fraud. Fraudsters are becoming more creative and now have access to advanced technologies for executing sophisticated attacks. As attacks grow in scale and velocity, businesses are forced to evolve their fraud detection methods from manual approaches based on blacklists and rules engines to machine learning algorithms that can detect both known and emerging types of fraud. Adoption of unsupervised machine learning, which can provide early detection of previously unknown fraud, has been growing steadily. According to Gartner, 50% of companies will use unsupervised machine learning by 2021. So why is unsupervised machine learning becoming a sought-after technology for fraud detection? This article explains the limitations of existing fraud detection methods and, more importantly, a few reasons why unsupervised machine learning is gaining traction.
1. Blacklists
The simplest fraud detection method is blacklisting, which essentially acts as a filter. Although blacklists are easy to implement, they are also the most error-prone and are slow to react to new fraud attacks. An example of a blacklist for banks is using FICO scores to determine the risk levels of credit card applicants.
The advantages and disadvantages of a blacklist are clear. The advantage is that it is simple, convenient, and applicable to many scenarios. The downside is that it cannot cope with emerging fraud patterns.
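To make the filter idea concrete, here is a minimal sketch of a blacklist check. The identifiers, the `is_blocked` function, and the choice of email and device ID as lookup keys are all hypothetical and chosen only for illustration; a real system would load its entries from a data store.

```python
# Hypothetical blacklist entries, hard-coded here purely for illustration.
BLACKLISTED_EMAILS = {"fraudster@example.com"}
BLACKLISTED_DEVICES = {"device-123"}


def is_blocked(email: str, device_id: str) -> bool:
    """Return True if either identifier exactly matches a known-bad entry."""
    return email.lower() in BLACKLISTED_EMAILS or device_id in BLACKLISTED_DEVICES


# A known identifier is blocked, but the same fraudster using a fresh
# email address and device passes straight through the filter.
print(is_blocked("fraudster@example.com", "device-999"))  # True
print(is_blocked("new-alias@example.com", "device-456"))  # False
```

The second call shows the weakness described above: an exact-match filter only knows about identifiers it has already seen.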
2. Rules Engines
The upgraded version of a blacklist is a rules engine. Rules engines are often used together with blacklists: fraudsters caught by a rules engine are added to the blacklist. It’s easier to understand how a rules engine works by walking through an example:
- Users who have returned goods 5 times in a row cannot buy purchase protection.
- If a user’s return ratio exceeds 80%, they cannot buy purchase protection.
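The two example rules can be sketched in a few lines of code. This is an illustration under stated assumptions, not a production rules engine: the `UserHistory` type and function names are hypothetical, and "5 times in a row" is interpreted here as the streak ending at the most recent order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserHistory:
    # One entry per order, oldest first; True means the order was returned.
    returns: List[bool]


def consecutive_returns(history: UserHistory) -> int:
    """Length of the return streak ending at the most recent order (assumed reading)."""
    streak = 0
    for returned in reversed(history.returns):
        if not returned:
            break
        streak += 1
    return streak


def return_ratio(history: UserHistory) -> float:
    """Share of all orders that were returned; 0.0 for a user with no orders."""
    if not history.returns:
        return 0.0
    return sum(history.returns) / len(history.returns)


def can_buy_purchase_protection(history: UserHistory) -> bool:
    """Apply the two hand-written rules from the example above."""
    if consecutive_returns(history) >= 5:
        return False
    if return_ratio(history) > 0.80:
        return False
    return True
```

Each rule is a fixed threshold written by a person, which is why a rules engine shares the blacklist's weakness: a fraudster who returns goods four times in a row, or keeps the ratio at 79%, is not caught until someone writes a new rule.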
Consider a set of registration events from a group of users. Clustering reveals several small groups whose activities are correlated, sharing attributes such as registration time, operating system, and browser version. Analyzed one at a time, the users appear normal, but taken together their registrations are suspiciously consistent. For example, a group of people signs up for the same product using Google Chrome between 2 am and 3 am, with GPS locations within a mile of each other, and each then modifies both nickname and gender after registration. If only one user’s registration had these details, there would be no problem. However, it is abnormal for a group of people to share such a similar registration pattern.
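The registration example can be illustrated with a small sketch. Note that this uses simple attribute bucketing as a stand-in for a real clustering algorithm, which would typically consider many more features and fuzzier similarity. The `Registration` fields, the grid size, and the `min_size` threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Registration:
    user_id: str
    browser: str
    hour: int    # hour of day the account was registered, 0-23
    lat: float
    lon: float


def bucket(reg: Registration, cell_deg: float = 0.01) -> Tuple[str, int, int, int]:
    """Coarse key: same browser, same hour, same grid cell (about 1 km of latitude)."""
    return (reg.browser, reg.hour, round(reg.lat / cell_deg), round(reg.lon / cell_deg))


def suspicious_groups(regs: List[Registration], min_size: int = 3) -> List[List[str]]:
    """Return user-id groups whose registrations share the same coarse key."""
    groups: Dict[tuple, List[str]] = defaultdict(list)
    for reg in regs:
        groups[bucket(reg)].append(reg.user_id)
    return [sorted(ids) for ids in groups.values() if len(ids) >= min_size]


# Made-up sample data: three near-identical registrations and two unrelated ones.
sample = [
    Registration("u1", "Chrome", 2, 37.7701, -122.4101),
    Registration("u2", "Chrome", 2, 37.7702, -122.4102),
    Registration("u3", "Chrome", 2, 37.7703, -122.4103),
    Registration("u4", "Firefox", 14, 40.7101, -74.0101),
    Registration("u5", "Chrome", 2, 34.0501, -118.2401),
]
print(suspicious_groups(sample))  # [['u1', 'u2', 'u3']]
```

No labels are used anywhere: each registration looks ordinary on its own, and the group is flagged purely because its members are unusually similar to one another.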
Unsupervised machine learning can also be used to identify spam emails. One method is to analyze the types of emails that users delete. Another common method is to analyze the reply rate.
As shown in the figure, the lower left corner shows a naive attacker who simply sends a lot of spam emails, and whose reply rate is close to 0. A fraud detection system can easily categorize this as spam. The group in the lower right corner is much smarter. They increase their reply rate by exchanging emails with each other. These accounts usually add friends and exchange emails to disguise themselves as normal users. Unsupervised learning can discover this type of coordinated activity and reveal fraudulent behavior.
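One way to see why the smarter group is still detectable is to compare an account's overall reply rate with the reply rate it gets from accounts that are not themselves bulk senders. The sketch below is a simple heuristic under that assumption, not the algorithm behind the figure; the function names, the `bulk_threshold`, and the `gap` value are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def reply_stats(messages: Iterable[Tuple[str, str]],
                bulk_threshold: int = 5) -> Dict[str, Tuple[float, float]]:
    """messages: (sender, recipient) pairs.

    For every bulk sender, return (overall_rate, organic_rate), where a
    recipient counts as having replied if they ever emailed the sender, and
    organic_rate ignores recipients that are bulk senders themselves.
    """
    recipients: Dict[str, Set[str]] = defaultdict(set)
    for sender, recipient in messages:
        recipients[sender].add(recipient)
    bulk = {a for a, targets in recipients.items() if len(targets) >= bulk_threshold}

    stats = {}
    for account in bulk:
        targets = recipients[account]
        repliers = {t for t in targets if account in recipients.get(t, set())}
        organic_targets = targets - bulk
        organic_repliers = repliers - bulk
        overall = len(repliers) / len(targets)
        organic = len(organic_repliers) / len(organic_targets) if organic_targets else 0.0
        stats[account] = (overall, organic)
    return stats


def looks_coordinated(overall: float, organic: float, gap: float = 0.3) -> bool:
    """Flag accounts whose replies come mostly from other bulk senders."""
    return overall - organic >= gap


# Made-up traffic: one naive spammer, a three-account ring, one legitimate sender.
msgs = [("n1", f"v{i}") for i in range(6)]
ring = ["r1", "r2", "r3"]
for member in ring:
    msgs += [(member, f"{member}-victim{i}") for i in range(3)]
    msgs += [(member, other) for other in ring if other != member]
msgs += [("shop", f"c{i}") for i in range(5)]
msgs += [(f"c{i}", "shop") for i in range(3)]

stats = reply_stats(msgs)
print(stats["n1"])    # (0.0, 0.0)
print(stats["r1"])    # (0.4, 0.0)
print(stats["shop"])  # (0.6, 0.6)
```

In this made-up data the naive spammer is caught by its near-zero reply rate alone, while the ring members have a respectable overall rate that collapses to zero once replies from other bulk senders are excluded. The legitimate sender's two rates match, so `looks_coordinated` does not flag it.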
Unsupervised algorithms applied to fraud detection usually have the advantage of early warning. Today’s fraudulent users often go through an incubation period before committing any fraud in order to avoid detection. But because their behavior during the incubation period follows a consistent, standardized pattern, they can still be caught by unsupervised algorithms. Detecting suspicious behavior before an attack occurs is hardly possible with other methods. This is one of the most important reasons why unsupervised machine learning is starting to play a bigger role in fraud detection.
To learn more about how unsupervised machine learning works, visit http://datavisor.com/platform/