May 21, 2019 - Sean McDermott

Defeating Mass Registration with Unsupervised Machine Learning

DataVisor Threat Blog

Advanced algorithms, big data analytics, and real-time detection capabilities are empowering marketplaces and social platforms to detect fraud holistically, analyze data contextually, and act decisively.


Mass registration is a type of fraud attack that is so pervasive it might very well be considered the foundation of modern fraud as we know it. In our digital economy, with the volume of data we produce, everything is about scale. Fraud is no exception, and mass registration has helped make it possible for fraud to scale. DataVisor Co-Founder and CTO Fang Yu spoke about this on the O’Reilly Security Podcast back in 2017:

“Today’s attackers are not using single accounts to conduct fraud; if they have a single account, the fraud they can conduct is very limited. What they usually do is construct an army of fraud accounts and then orchestrate either mass registration or account takeovers. Each of the individual accounts will then conduct small-scale fraud. They can do spamming, phishing, and all different types of malicious activity. But because they use many coordinated individual accounts, the attacks are massive in scale.”

Fang Yu’s insights remain relevant today, even as techniques have advanced and new technologies have emerged.

One of the most significant recent trends we’ve observed has been the rapid rise in the use of bots to power fraud attacks. Bot armies have been deployed across industries, and among those hardest hit are marketplaces and social platforms.

Mass Registration Fraud: Marketplaces

Successful marketplaces are all about trust. Legitimate sellers with actual products need to trust that they’re dealing with genuine buyers prepared to spend real money. Legitimate buyers ready to spend their money need to know they’re dealing with authentic sellers who are honestly and accurately representing real products. Mass registration fraud sabotages trust in devastating ways, and the increasing prevalence of bot-powered mass registration attacks has raised the issue to crisis levels. Marketplaces cannot survive when it’s no longer possible to accurately distinguish the good accounts from the bad ones.

Client Success in Defeating Marketplace Fraud

DataVisor works with enterprise clients across industries to detect and defeat mass registration fraud. One such client, a global online marketplace operating in 40+ countries with over 350 million monthly active users, was enduring an onslaught of mass-registered accounts that threatened the long-term success of the platform by significantly degrading customer experience, tarnishing brand reputation, and draining finances. What made these attacks particularly challenging to defend against was their combination of speed and patience: 60% of fraudulent accounts were used in attacks within two hours of being created, while other accounts remained dormant for weeks after registration before being deployed. Choosing to partner with DataVisor on a new approach had an immediate and highly positive impact. Overall fraud detection rates rose by more than 20%, and nearly 90% of fraudulent accounts were detected before they committed a first attack.

While marketplaces are bombarded continuously by mass registration attacks, they are not alone in this regard—social platforms are also appealing targets for fraudsters.

Mass Registration: Social Platforms

Social platforms have long been highly visible victims of mass registration. One of the most notable cases came to light in July of 2018, when the Washington Post broke the news that Twitter was purging one million accounts a day in an effort to root out fraudulent and malicious registrations. A year prior, the world learned of Russia’s election meddling activities and its use of fake accounts to influence voter opinions. This, of course, remains an ongoing issue, and it is proof that mass registration fraud on social platforms has severe and long-lasting consequences. Fortunately, if falling victim to mass registration fraud has sweeping negative consequences, the opposite is also true. By April of 2019, Twitter’s efforts to rid itself of bots and fake accounts were paying off handsomely. An only slightly hyperbolic headline from the NY Post declared: “Twitter shares surge as company purges fake and abusive accounts.”

Client Success in Defeating Social Platform Fraud

Another DataVisor client, a leading global social commerce platform with over 250 million monthly active users, was suffering reputation damage and increasing rates of customer churn due to aggressive spam attacks that relied on scripted bots to automate high-frequency events. By adopting DataVisor’s proactive, AI-powered approach, our client was able to capture 99% of all spammers, with more than 80% of those caught at the point of sign-up.
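To make "scripted bots automating high-frequency events" concrete, here is a minimal sketch of a sliding-window rate check. It is an illustration only, not DataVisor's method: the function name `exceeds_rate`, the thresholds, and the sample timestamps are all assumptions made up for this example.

```python
from collections import deque


def exceeds_rate(timestamps, max_events=5, window_seconds=10.0):
    """Return True if any sliding window of `window_seconds` contains
    more than `max_events` events.

    `timestamps` are event times in seconds for a single account.
    The thresholds are hypothetical and would need tuning per platform.
    """
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that fell out of the window ending at `ts`.
        while ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_events:
            return True
    return False


# Made-up sample data: one burst of scripted events, one slow human session.
bot_like = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
human_like = [0.0, 30.0, 95.0, 240.0]

bot_flagged = exceeds_rate(bot_like)
human_flagged = exceeds_rate(human_like)
```

A per-account check like this is easy to evade by spreading events across many accounts, which is why the article argues for analyzing accounts together rather than in isolation.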

Mass Registration Fraud: Tools and Techniques

Fraudsters use a variety of techniques to create and manipulate accounts. These techniques include:

Fake User Activity
Attackers simulate user activity by uploading stolen photos and content from other sites, making them appear real even to human reviewers.

Device Obfuscation
Fraudsters utilize mobile device flashing, virtual machines, and bot scripts to make it appear as though login events are coming from different devices.

Stolen Identities
Malicious actors use readily-available stolen credentials, or information from data breaches, to create authentic-looking new accounts.

IP Obfuscation
Attackers use proxies, VPNs, and cloud-hosting services to evade IP or location blacklists and digital-fingerprint solutions.

The Failure of Legacy Fraud Solutions

Although the rise of bots is a comparatively recent trend, tools and techniques like those described above have been in use by fraudsters for some time, and companies have historically relied on an array of fraud prevention solutions to deter their activities with varying degrees of success. Today, most of these legacy solutions are impotent, outmoded, and irrelevant. Consider the example of IP Obfuscation.

IP reputation services have long been used by organizations to help identify and block malicious actors, but these services are inherently reactive: they make their determinations largely by whether a given IP address has exhibited malicious activity in the past. IP addresses generated by fraudsters, however, generally have no history. Fraudsters can rely on residential virtual private networks (VPNs) to mass-create anonymized IP addresses in short periods of time, which they then use to mimic authentic residential IP traffic (such as hopping from IP to IP, behavior that normally signals a trustworthy user). Fraudsters cycle through these residential IP addresses at rates too rapid for blacklists to keep up. Overall, DataVisor’s research finds that 65% of the IP addresses generated by fraudsters are used for seven days or less.
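A statistic like "used for seven days or less" can be computed from first-seen and last-seen times per IP address. The sketch below shows one way to do that. The event shape (pairs of IP and timestamp), the function name `short_lived_ip_share`, and the sample data are assumptions for illustration; the source does not describe how DataVisor measured its figure.

```python
from datetime import datetime, timedelta


def short_lived_ip_share(events, max_days=7):
    """Return the fraction of IPs whose first-to-last-seen span is
    at most `max_days`.

    `events` is an iterable of (ip, timestamp) pairs; this shape is
    hypothetical, chosen only for the example.
    """
    first_seen, last_seen = {}, {}
    for ip, ts in events:
        if ip not in first_seen or ts < first_seen[ip]:
            first_seen[ip] = ts
        if ip not in last_seen or ts > last_seen[ip]:
            last_seen[ip] = ts
    if not first_seen:
        return 0.0
    limit = timedelta(days=max_days)
    short = sum(1 for ip in first_seen
                if last_seen[ip] - first_seen[ip] <= limit)
    return short / len(first_seen)


# Made-up sample data using documentation-reserved IP ranges.
t0 = datetime(2019, 5, 1)
sample_events = [
    ("203.0.113.5", t0),
    ("203.0.113.5", t0 + timedelta(days=2)),    # lifespan: 2 days
    ("198.51.100.7", t0),
    ("198.51.100.7", t0 + timedelta(days=30)),  # lifespan: 30 days
    ("192.0.2.9", t0 + timedelta(days=1)),      # seen once: 0 days
]
share = short_lived_ip_share(sample_events)
```

In this sample, two of the three addresses are short-lived. An address that disappears within days gives a history-based blacklist very little time to learn anything about it.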

The above is but one example of legacy fraud solution shortcomings. The reality is that the vast majority of legacy solutions have been outmaneuvered by modern fraudsters. More importantly, the bot age has effectively out-scaled the existing generation of fraud solutions. As reported by ZDNet in April of 2019, the latest report from Distil Networks states that bad bots now make up 20 percent of all web traffic.

Meeting Scale with Scale using AI and Unsupervised Machine Learning

Artificial intelligence and machine learning offer a new way forward by giving organizations the ability to meet scale with scale, to be proactive, and to prevent damage before it happens. Advanced machine learning models can be deployed to analyze user histories, behavior changes, and suspicious patterns across millions of accounts, enabling organizations to detect, and take action against, sophisticated, large-scale, coordinated attacks. Even cleverly and patiently incubated accounts can be surfaced when they are flagged and analyzed contextually and holistically. The approach is not unlike weather forecasting: intelligent systems can detect seemingly disparate and disconnected events and ascertain that they comprise the ingredients of a brewing storm. These signals go unnoticed when viewed in isolation, but when considered in context, it becomes clear they are part of a larger pattern.
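As a rough illustration of looking at accounts in context rather than in isolation, the sketch below links accounts that share an attribute value and reports the connected groups. No labels are required, so it is unsupervised in the loosest sense, but it is a deliberately simple stand-in and not DataVisor's algorithm. The attribute names (`device`, `ip_prefix`), the `min_size` threshold, and the sample accounts are all hypothetical.

```python
from collections import defaultdict


def find_coordinated_groups(accounts, min_size=3):
    """Group accounts that share any attribute value, using union-find.

    `accounts` maps an account id to a dict of attributes. Returns the
    groups (as sets of ids) with at least `min_size` members.
    """
    parent = {acc_id: acc_id for acc_id in accounts}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Index accounts by each (attribute, value) pair they exhibit.
    by_value = defaultdict(list)
    for acc_id, attrs in accounts.items():
        for key, value in attrs.items():
            by_value[(key, value)].append(acc_id)

    # Accounts sharing a value are linked into the same group.
    for members in by_value.values():
        for other in members[1:]:
            union(members[0], other)

    groups = defaultdict(set)
    for acc_id in accounts:
        groups[find(acc_id)].add(acc_id)
    return [g for g in groups.values() if len(g) >= min_size]


# Made-up sample data: a1 and a2 share a device, a2 and a3 share an IP prefix.
sample_accounts = {
    "a1": {"device": "d-1", "ip_prefix": "203.0.113"},
    "a2": {"device": "d-1", "ip_prefix": "198.51.100"},
    "a3": {"device": "d-2", "ip_prefix": "198.51.100"},
    "a4": {"device": "d-9", "ip_prefix": "192.0.2"},
}
groups = find_coordinated_groups(sample_accounts)
```

None of a1, a2, or a3 looks unusual alone, yet together they form one linked group, while a4 stays unflagged. A single shared attribute is a weak signal (many legitimate users share an IP prefix), so a realistic system would weigh many signals before acting on a group.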

Promotional abuse, phishing, spamming, fake reviews, illegal commerce—these are just a few of the ways fraudulently mass registered accounts destroy trust, destabilize platforms, and subvert customer experience. DataVisor’s approach—hallmarks of which include the sophisticated use of AI and machine learning, a holistic approach to data analysis, and proactive solutions to known and unknown challenges—represents a future-facing answer to the contemporary challenges of mass registration fraud across industries.

If fraud continues to scale, our ability to proactively defeat it at every turn will scale right along with it.

About Sean McDermott
Sean leads the social commerce sales efforts at DataVisor, a cutting-edge fraud detection platform based on AI and machine learning. As a sales manager with over 12 years of experience in sales, Sean is adept at selling complex and sophisticated technologies. In the last 5 years, he has focused on selling enterprise AI fraud products to data science buyers, quickly gaining expertise in different machine learning techniques, including supervised and unsupervised learning.