December 4, 2018 - Venkata Karthikeya Jangal

Combating the Unknown in Application Fraud with AI

DataVisor Threat Blog

Application Fraud: A Growing Online Problem

According to Aite Group, for 74% of financial institutions (FIs) recently surveyed, digital channel fraud losses have increased over the past two years. Account application fraud and account takeover fraud — both identity crimes — are the top two leading causes of those fraud losses, prompting many FIs to re-evaluate their authentication strategies. Aite also found that retail banks are investing heavily in account opening technology. As fraudsters eye account opening as a way to monetize, risk teams within banks are paying more attention to combating fraud using advanced technologies like AI and Machine Learning as a way to balance growth with operational overhead.

Application Fraud Starts Early

Application fraud typically involves fraudsters applying with synthetic or stolen identities. With vast amounts of personal data available from third-party websites, social media, and hundreds of data breaches, fraudsters create fake applications that combine real and fabricated identity information. Identity theft involving children's Social Security Numbers is common because parents rarely monitor them. Because the fraud often goes undiscovered until the child becomes an adult, a fraudster can continue the attack for years. According to the “Child Identity Theft” research done by Carnegie Mellon CyLab, 4,311, or 10.2%, of the children's accounts were impacted by fraud, with the youngest victim being five months old.

Application Fraud is Sophisticated

Manual review and approval of new applications within financial institutions is often costly and unreliable. Although a rule-based approach covers the majority of simple fraud, it cannot keep up with changing fraud patterns. Most AI solutions for this kind of fraud offer supervised machine learning models, but this approach has its own disadvantages: the model must be trained on a data set containing previously known and identified fraud patterns, and it needs frequent re-tuning because its coverage drops over time.

Fraudsters today are savvy and have access to advanced technologies. The techniques they adopt include, but are not limited to, IP obfuscation using datacenter IPs, device obfuscation, and disposable email addresses. Some fraudsters apply with income figures in commonly accepted ranges, which raise no alarms during approval. Unsupervised Machine Learning (UML), which can detect fraudulent behavior based on correlations, can help uncover and adapt to these emerging fraud patterns.

In a well-orchestrated fraud attempt, a single fraudster creates multiple fake applications and, once some are approved, submits a series of further applications hoping one (or many) of them will be approved. The number of applications from the same attacker grows rapidly over time, and when this type of attack is analyzed as a whole, it exposes some interesting similarities.
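To make the idea of analyzing an attack "as a whole" concrete, here is a minimal Python sketch that groups applications sharing an attribute. It is an illustration only, not DataVisor's model: the record fields (`id`, `ip`, `device_id`), the choice of a /24 network as the linking unit, and the function names are all assumptions made for this example.

```python
import ipaddress
from collections import defaultdict


def link_keys(app):
    """Return the attribute values used to link one application to others."""
    # Assumed linking attributes: the /24 network of the IP and the device ID.
    network = ipaddress.ip_network(f"{app['ip']}/24", strict=False)
    return [("net", str(network)), ("device", app["device_id"])]


def group_applications(apps):
    """Group applications that share any linking attribute (union-find)."""
    parent = list(range(len(apps)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    first_seen = {}  # linking key -> index of the first application with it
    for idx, app in enumerate(apps):
        for key in link_keys(app):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx, app in enumerate(apps):
        groups[find(idx)].append(app["id"])
    # Largest groups first, since they are the most likely coordinated attacks.
    return sorted(groups.values(), key=len, reverse=True)


# Made-up sample records using documentation-reserved IP ranges.
apps = [
    {"id": "A1", "ip": "203.0.113.10", "device_id": "d-1"},
    {"id": "A2", "ip": "203.0.113.77", "device_id": "d-2"},
    {"id": "A3", "ip": "198.51.100.5", "device_id": "d-2"},
    {"id": "A4", "ip": "192.0.2.44", "device_id": "d-9"},
]
groups = group_applications(apps)
```

In this sample, A1 and A2 share a network while A2 and A3 share a device, so all three land in one group even though A1 and A3 have nothing directly in common. That transitive linking is what individual, per-application review tends to miss.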

Attacks can start with tens of applications and grow to thousands, depending on the size of the FI. Identifying smaller groups of fraudulent accounts is sometimes just as important because of the high credit limits and losses associated with them. Most often, manual review teams spend significant time reviewing these applications, which are eventually approved anyway. Even if a fraudulent attack is identified after the first few days, the damage is already done. This is where DataVisor’s Unsupervised Machine Learning model plays a crucial role: as the coordinated fraudulent applications keep coming, they are placed in the same suspicious groups in real time, and the FI can take action in real time.

A successful application fraud attack involving thousands of synthetic and fake identities.

Deconstructing an Attack

Let us look at the typical behavior of this type of attack. The fraudster uses various techniques to bypass the existing fraud detection within the financial institution. They generally obtain personal information such as names, addresses, and phone numbers from third-party background reporting websites, combine it with fabricated emails, and open fraudulent accounts. Based on insights from our Global Intelligence Network, we notice that a lot of emails originate from Chinese domains. The emails used in the applications follow a recurring pattern: a name followed by four digits, where the name bears no similarity to the first and last names of the applying individual. However, other details like addresses and phone numbers match the real identity of that individual. Sometimes, all of these applications come from a single datacenter IP range.
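The email pattern described above (a name followed by four digits, unrelated to the applicant's own name) can be expressed as a simple check. The Python sketch below is illustrative only: the function name, the regular expression, and the substring test for "similarity" are assumptions, and a signal like this would be one input among many rather than a verdict on its own.

```python
import re

# Local part made of letters followed by exactly four digits, e.g. "walter4821".
FOUR_DIGIT_SUFFIX = re.compile(r"^([a-z]+)(\d{4})$")


def looks_like_fabricated_email(email, first_name, last_name):
    """Flag emails of the form <word><4 digits> whose word is unrelated
    to the applicant's first and last names."""
    local = email.split("@", 1)[0].lower()
    match = FOUR_DIGIT_SUFFIX.match(local)
    if not match:
        return False
    word = match.group(1)
    names = (first_name.lower(), last_name.lower())
    # Crude similarity test (an assumption): a name contained in the word,
    # or the word contained in a name, counts as related.
    related = any(n and (n in word or word in n) for n in names)
    return not related


flag_unrelated = looks_like_fabricated_email("walter4821@example.com", "Maria", "Lopez")
flag_related = looks_like_fabricated_email("maria1985@example.com", "Maria", "Lopez")
flag_no_pattern = looks_like_fabricated_email("maria.lopez@example.com", "Maria", "Lopez")
```

Only the first address is flagged: it fits the four-digit pattern and shares nothing with the applicant's name. Plenty of legitimate users have addresses like this, so the check is more useful for spotting many similar applications in a group than for rejecting a single one.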

Figure 1: Sample application records for the above fraud group show diverse emails and names, yet share fraudulent patterns.

Another pattern commonly observed is that attacks are executed at multiple intervals. The attackers keep changing details in the applications, such as the physical address, IP address, and device, to bypass internal systems over a period of time, and they often succeed.

Some attackers use loopholes in the system to bypass the internal rules engine. Each application is associated with a unique email address, though all of them get directed to the same email account.
For example:

  • tom.hanks@gmail.com
  • to.mhanks@gmail.com
  • tomh.anks@gmail.com

All three of the above addresses deliver to tomhanks@gmail.com because Gmail ignores the ‘.’ character in addresses. Sometimes applications come in bulk from disposable email domains like mailin8r.com. Using these carefully crafted techniques, attackers submit a large number of applications, hoping a few will slip through scrutiny. They also tend to move away from disposable emails and use Gmail and Yahoo addresses to submit fake applications. Detecting several attack patterns that differ widely from one another is very hard. Overall approval rates go down and the time taken to approve good applications increases, leading to brand reputation damage.
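One way to counter the Gmail dot trick shown in the example addresses is to canonicalize each email before comparing applications. The Python sketch below is a minimal illustration: the function name is hypothetical, and it handles only the dot behavior described here, not other aliasing schemes or other providers.

```python
from collections import Counter


def canonical_email(email):
    """Return a comparison key for an email address.

    Only gmail.com addresses are rewritten, by dropping dots from the
    local part; every other address is just lowercased and trimmed.
    """
    local, _, domain = email.strip().lower().rpartition("@")
    if domain == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


submitted = [
    "tom.hanks@gmail.com",
    "to.mhanks@gmail.com",
    "tomh.anks@gmail.com",
    "tom.hanks@example.com",
]
# Count how many submitted addresses collapse to each canonical form.
counts = Counter(canonical_email(e) for e in submitted)
```

Here the three Gmail variants collapse to a single key with a count of three, while the non-Gmail address is left alone. A rules engine that compares canonical keys instead of raw strings would see these as one mailbox rather than three unique applicants.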

Making the Process More Efficient

By using advanced technologies like machine learning, risk teams can reap huge operational benefits. When some applications within the same fraudulent group are manually rejected while others are approved, significant time is still invested in the manual rejections. Machine learning tools can help block two-thirds of these approved applications in real time, saving resource time and overhead.

An efficient account opening process can help deliver a better customer experience while reducing the overhead associated with keeping bad actors out of the system. The DataVisor Fraud Platform, powered by Unsupervised Machine Learning, can provide a significant lift in detecting coordinated activity associated with account opening fraud.

about Venkata Karthikeya Jangal
Venkata is a Technical Account Manager at DataVisor. He is responsible for customer success and is a technical advisor for DataVisor customers. Previously, he worked in UEBA space and used SIEM tools extensively. Venkata is passionate about using anti-fraud solutions and cyber security analytics to combat threats.