Big Data Analytics: The Future of Threat Detection

Today’s increasingly sophisticated threat landscape poses a major challenge for modern online services. Traditional fraud solutions find it ever harder to stop new, unknown types of attacks carried out by large, distributed crime rings.

DataVisor’s next-generation unsupervised machine learning approach, combined with the scalability of our Big Data architecture, is transforming the way threat detection is done. Gone are the days when multiple point solutions had to be combined to tackle the growing fraud challenge on your platform. Big data parallel processing not only enables unprecedented scalability but also allows more advanced algorithms to process trillions of user actions together. Empowered by DataVisor’s technology, you can catch new, changing attacks automatically, before any damage is done.

A Full Stack Analytics Platform for Protecting Your Online Service

The DataVisor User Analytics Platform™ is an end-to-end solution with the most comprehensive technologies for detecting individual and widely distributed attacks across a variety of use cases, including fraud, abuse, money laundering and more. By analyzing billions of pieces of user attribute data and transaction data, our platform lets you know everything about your users, good and bad, from cradle to grave.

Anchored by the patent-pending Unsupervised Machine Learning Engine, DataVisor’s platform includes a suite of additional detection analytics: Supervised Machine Learning, the Automated Rules Engine and the Global Intelligence Network. The detection results from the Unsupervised engine make each of the other analytics more effective by updating them with real-time signals, helping you achieve the highest detection results, the most current rules and the most comprehensive forensic information about adversaries on your platform.

DataVisor Unsupervised Machine Learning (UML) Engine

We correlate user and event attributes across all users to identify unknown threats by linking bad actors together into malicious campaigns. By looking at all event types at all times, our unsupervised algorithm also detects networks of malicious accounts early, before they do any damage.

DataVisor Supervised Machine Learning (SML) Engine

Using detection results discovered by the UML engine as well as client-provided labels, DataVisor’s SML Engine provides superior detection of known attack types from non-coordinated attackers.

DataVisor Automated Rules Engine

We generate rules automatically every day from attributes provided by the unsupervised machine learning engine. This reduces manual rule-tuning time and provides human-understandable definitions that help explain our machine-learning detection results.
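
To illustrate the general idea of turning detection results into readable rules, here is a minimal Python sketch. It is not DataVisor's implementation: the function `derive_rules`, its thresholds `min_support` and `max_background`, and the sample data are all hypothetical. It assumes that a rule is any attribute value shared by most flagged accounts and by almost no other accounts.

```python
from collections import Counter


def derive_rules(flagged, others, min_support=0.8, max_background=0.1):
    """Return readable rules for values common in flagged accounts only.

    flagged, others: lists of dicts mapping attribute name -> value.
    The thresholds are illustrative assumptions, not product settings.
    """
    flagged_counts = Counter(pair for attrs in flagged for pair in attrs.items())
    other_counts = Counter(pair for attrs in others for pair in attrs.items())

    rules = []
    for (name, value), count in flagged_counts.items():
        # Share of flagged accounts that carry this attribute value.
        support = count / len(flagged)
        # Share of all other accounts that carry the same value.
        background = other_counts[(name, value)] / len(others) if others else 0.0
        if support >= min_support and background <= max_background:
            rules.append(f"{name} == {value!r}")
    return sorted(rules)


# Hypothetical sample data: three flagged accounts and three others.
flagged = [
    {"email_domain": "mail.example", "os": "android"},
    {"email_domain": "mail.example", "os": "ios"},
    {"email_domain": "mail.example", "os": "android"},
]
others = [
    {"email_domain": "corp.example", "os": "android"},
    {"email_domain": "home.example", "os": "ios"},
    {"email_domain": "corp.example", "os": "ios"},
]
rules = derive_rules(flagged, others)
print(rules)
```

In this sample, only the shared email domain becomes a rule: the operating system values also appear among the other accounts, so they do not separate the two groups.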

DataVisor Global Intelligence Network

We aggregate and anonymize the broadest array of telemetry signals, such as IP addresses, user agent strings, email domains and device types, from over 1.3 billion users across a variety of verticals.
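
As a rough illustration of aggregating signals without storing raw identifiers, the Python sketch below replaces each value with a keyed hash and counts distinct users per signal. It is an assumption-laden example, not a description of the Global Intelligence Network: the functions `anonymize` and `aggregate_signals`, the event layout, and the key are hypothetical, and keyed hashing alone is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac
from collections import defaultdict


def anonymize(value, secret_key):
    """Replace a raw value with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_signals(events, secret_key):
    """Count distinct users per anonymized signal.

    events: iterable of (user_id, signal_type, signal_value) tuples.
    Returns a dict mapping (signal_type, token) -> number of distinct users.
    """
    users_by_signal = defaultdict(set)
    for user_id, signal_type, value in events:
        token = anonymize(value, secret_key)
        # User ids are hashed as well, so no raw identifier is retained.
        users_by_signal[(signal_type, token)].add(anonymize(user_id, secret_key))
    return {key: len(ids) for key, ids in users_by_signal.items()}


# Hypothetical key and events, for illustration only.
key = b"example-secret-key"
events = [
    ("alice", "ip", "203.0.113.5"),
    ("bob", "ip", "203.0.113.5"),
    ("bob", "ip", "203.0.113.5"),
    ("carol", "email_domain", "mail.example"),
]
counts = aggregate_signals(events, key)
shared_ip_users = counts[("ip", anonymize("203.0.113.5", key))]
print(shared_ip_users)
```

Because the same key always yields the same token, signals can still be matched across services while the raw values stay out of the aggregate.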

Unsupervised Machine Learning Approach

At the heart of DataVisor’s technology is our patent-pending Unsupervised Machine Learning Engine (UML Engine). The UML Engine works much like taking a panoramic view of a Pointillist painting. When you view any individual dot up close, it appears indistinguishable from the others. However, if you step back and take in the entire painting, patterns begin to emerge. By taking a global view of all users within an online service, our unsupervised detection algorithm finds clusters of bad actors acting in a correlated fashion, without requiring training data or labels.

This unsupervised machine learning approach has a variety of unique benefits:

Proactive Threat Detection

Catch unknown threats without any training data or labels, before any damage is done

Campaign Detection

Catch entire crime networks by correlating groups of bad actors with similar attributes

Data Agnostic

Use a flexible data model to support a wide variety of industry use cases

Massively Scalable

Handle billions of events, not just transactions, from the largest internet properties in the world
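
The idea of finding correlated groups of accounts without labels can be sketched in a few lines of Python. This is a simplified illustration, not DataVisor's algorithm: the function `find_correlated_clusters`, the `min_size` threshold, and the sample accounts are hypothetical. It assumes two accounts are linked when they share any attribute value, and it uses a union-find structure to merge linked accounts into clusters.

```python
from collections import defaultdict


def find_correlated_clusters(accounts, min_size=3):
    """Group accounts that share any attribute value; keep the large groups.

    accounts: dict mapping account id -> dict of attribute name -> value.
    Returns a list of sorted account-id lists, largest cluster first.
    """
    parent = {acc: acc for acc in accounts}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index accounts by (attribute, value) so shared values are easy to find.
    by_value = defaultdict(list)
    for acc, attrs in accounts.items():
        for name, value in attrs.items():
            by_value[(name, value)].append(acc)

    # Link every account in a bucket to the first account in that bucket.
    for members in by_value.values():
        for other in members[1:]:
            union(members[0], other)

    clusters = defaultdict(list)
    for acc in accounts:
        clusters[find(acc)].append(acc)

    big = [sorted(c) for c in clusters.values() if len(c) >= min_size]
    return sorted(big, key=lambda c: (-len(c), c))


# Hypothetical accounts: u1 and u2 share an IP, u2 and u3 share a device.
accounts = {
    "u1": {"ip": "10.0.0.1", "device": "d-A"},
    "u2": {"ip": "10.0.0.1", "device": "d-B"},
    "u3": {"ip": "10.0.0.9", "device": "d-B"},
    "u4": {"ip": "10.0.0.7", "device": "d-C"},
    "u5": {"ip": "10.0.0.8", "device": "d-D"},
}
suspicious = find_correlated_clusters(accounts, min_size=3)
print(suspicious)
```

No single account looks unusual here; the cluster only appears when the links between u1, u2 and u3 are viewed together. A realistic system would also weigh how informative each shared attribute is, since one shared IP address on its own is weak evidence.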

How It Works

Our technology first ingests user profile and event data from our customers. We then apply our in-house domain expertise to parse the data and complete feature engineering in preparation for our User Analytics Platform. The platform consists of four components: our Unsupervised Machine Learning Engine, Supervised Machine Learning Engine, Automated Rules Engine, and DataVisor Global Intelligence Network. Our detection results can then be accessed via our user interface, the DataVisor User Analytics Console, or delivered via the DataVisor Results API in batch or real time. Depending on your business needs, our platform offers flexible deployment options, including on-premises, SaaS, and private cloud deployment.

Learn About Our Technology From Our Founders


Leveraging Spark to Analyze Billions of User Actions to Reveal Hidden Fraudsters

Yinglian Xie, co-founder and CEO of DataVisor, spoke at the San Jose Strata+Hadoop World Conference about how DataVisor leverages Big Data infrastructure to analyze billions of user accounts and trillions of events to reveal hidden fraudsters.


Fang Yu On Machine Learning And The Evolving Nature Of Fraud

In this episode, O’Reilly’s Jenn Webb talks with Fang Yu, co-founder and CTO of DataVisor. They discuss sniffing out fraudulent sleeper cells, incubation in money transfer fraud, and adopting a more proactive stance against fraud.

Getting Started

Want to be a part of the exciting journey we are undertaking to transform how cybersecurity is done? Request a trial today.
