DataVisor Blog: Quick Takes

The Commoditization of Supervised Machine Learning


Until a few years ago, machine learning (ML) was a topic of discussion for only a few members of academia. Today, this emerging technology is no longer restricted to the university labs and corporate research centers that have access to massive training data and computing infrastructure. Developers are getting to a point where they won’t need to be proficient in ML to take advantage of its power. Taking this a step further, supervised machine learning tools are becoming commoditized, with many providers offering access to them. Among this plethora of tools, some are easier to use than others, while some offer newer functionality. There are multiple ways to access supervised machine learning tools: through a traditional route such as MATLAB, through open-source frameworks such as Spark’s machine learning library, or through hosted cloud services such as Azure Machine Learning or Amazon SageMaker.

However, many of these offerings are tools or computing elements that simply enable access to machine learning. They are not equivalent to an end solution. This distinction matters because the customer is looking for a solution that addresses a business challenge. Having access to various data frameworks, algorithm frameworks, or computing frameworks is not the same as having a solution that a customer can turn on from day one to solve their business problem. Consider lighting a room: you can have a bulb, a switch, a wire, and a socket in place, but these components alone cannot make electricity flow. You need a complete infrastructure, including a power plant, transmission lines, and a wire hookup, plus an electrician with the expertise to make it all work, so that the light comes on when you flip the switch. In the same way, tools and ML frameworks are building blocks for systems. While supervised machine learning (SML) algorithms are being commoditized, the real value lies in how a team with domain knowledge builds a vertical solution that is ready to deliver high performance quickly.

Delivering Business Value with Unsupervised Machine Learning

DataVisor provides a unique value proposition here for fraud prevention. It is distinctive in two respects. First, it is a vertical solution that sits on top of the computing elements and solves the customer’s business problem; it is not just a technology stack that works together. Second, it provides unsupervised machine learning, which has not yet been widely available or demonstrated in the market but is extremely powerful in the domain of cyber fraud.

Yinglian Xie, CEO of DataVisor, says, “Ultimately, when people get more familiar with supervised machine learning, they will see that it is not a panacea that solves everything. There are limitations to what it can and cannot do, hence the need to get machines to come up with intelligence beyond what people could guide.”

At DataVisor, we understand that using ML on its own is great for anomaly detection. We also have domain expertise in fraud, and know that anomalies do not always equate to fraud. This is why our solution, built on the DataVisor Unsupervised Machine Learning Engine, is organized around specific use cases, which include Financial Fraud, User Acquisition Fraud, and Social/eCommerce Fraud. These are further broken down into narrower scenarios. For example, key areas within Financial Fraud include, but are not limited to, Transaction Fraud, Application Fraud, and Account Takeovers. Having this domain knowledge woven into the solution allows our customers to turn on the system and reap benefits from day one.
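To make the point that anomalies do not always equate to fraud more concrete, here is a minimal Python sketch. It is purely illustrative and is not how the DataVisor engine works: the transactions, the `is_fraud` labels, the `robust_score` helper, and the threshold of 5 are all hypothetical. It scores each transaction amount by its distance from the median, scaled by the median absolute deviation, and then compares what gets flagged with what is actually fraudulent.

```python
from statistics import median

# Hypothetical toy data: (transaction_id, amount, is_fraud).
# The labels are used only to inspect results; the detector never sees them.
transactions = [
    ("t1", 20.0, False),
    ("t2", 25.0, False),
    ("t3", 22.0, False),
    ("t4", 19.0, False),
    ("t5", 24.0, False),
    ("t6", 21.0, False),
    ("t7", 23.0, False),
    ("t8", 950.0, False),  # unusual but legitimate, e.g. a one-off large purchase
    ("t9", 900.0, True),   # unusual and fraudulent
    ("t10", 22.0, True),   # fraudulent, yet the amount looks ordinary
]

amounts = [amount for _, amount, _ in transactions]
center = median(amounts)
# Median absolute deviation: a spread estimate that outliers cannot inflate.
mad = median(abs(a - center) for a in amounts)


def robust_score(amount):
    """Distance from the median, measured in units of the MAD."""
    return abs(amount - center) / mad


THRESHOLD = 5.0  # assumed cutoff for this toy example
flagged = [tid for tid, amount, _ in transactions if robust_score(amount) > THRESHOLD]

# Compare the anomaly flags with the (hypothetical) ground truth.
false_alarms = [tid for tid, _, fraud in transactions if tid in flagged and not fraud]
missed_fraud = [tid for tid, _, fraud in transactions if fraud and tid not in flagged]

print("flagged as anomalous:", flagged)
print("anomalous but legitimate:", false_alarms)
print("fraudulent but not anomalous:", missed_fraud)
```

In this toy data the detector flags one legitimate purchase and misses one fraudulent transaction, which is the kind of gap that domain knowledge about specific fraud scenarios is meant to close.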

When a machine learning model is trained on historical cases, it is bound to the patterns present in those cases. Fraud, however, is a moving target: because attackers shift their patterns continuously, risk teams must stay vigilant and keep retuning models to tackle new and unknown fraud patterns. From a long-term perspective, exploring technologies such as unsupervised machine learning is becoming critical to protecting assets.
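The limitation of training on historical cases can be sketched in a few lines of Python. Everything here is a hypothetical illustration rather than a description of any real product: the data, the field names, the `fit_amount_threshold` rule standing in for a supervised model, and the `flag_shared_devices` grouping standing in for an unsupervised technique. The supervised rule learns from labeled history that fraud means large amounts, so it misses a new pattern of many small transactions from accounts that share one device. The label-free grouping catches that pattern because it looks at how accounts relate to each other.

```python
from collections import defaultdict

# Hypothetical labeled history: (amount, is_fraud). All past fraud was high-value.
history = [
    (15.0, False), (30.0, False), (42.0, False),
    (800.0, True), (950.0, True), (1200.0, True),
]


def fit_amount_threshold(labeled):
    """Toy supervised rule: the midpoint between the largest legitimate
    amount and the smallest fraudulent amount seen in the labeled history."""
    max_legit = max(amount for amount, fraud in labeled if not fraud)
    min_fraud = min(amount for amount, fraud in labeled if fraud)
    return (max_legit + min_fraud) / 2


threshold = fit_amount_threshold(history)

# Hypothetical new, unlabeled events: (account_id, amount, device_id).
# Accounts a3 to a6 form a new pattern: small amounts, one shared device.
new_events = [
    ("a1", 25.0, "dev-A"),
    ("a2", 18.0, "dev-B"),
    ("a3", 9.0, "dev-X"),
    ("a4", 11.0, "dev-X"),
    ("a5", 10.0, "dev-X"),
    ("a6", 12.0, "dev-X"),
    ("a7", 35.0, "dev-C"),
]

# The supervised rule only knows the historical pattern (large amounts).
supervised_flags = [acct for acct, amount, _ in new_events if amount >= threshold]


def flag_shared_devices(events, min_group_size=3):
    """Toy unsupervised step: group accounts by device without using labels
    and flag every account in a group of at least min_group_size members."""
    groups = defaultdict(list)
    for acct, _, device in events:
        groups[device].append(acct)
    return sorted(
        acct
        for members in groups.values()
        if len(members) >= min_group_size
        for acct in members
    )


unsupervised_flags = flag_shared_devices(new_events)

print("learned threshold:", threshold)
print("flagged by supervised rule:", supervised_flags)
print("flagged by grouping:", unsupervised_flags)
```

Retuning the supervised rule would require someone to first notice and label the new pattern, whereas the grouping step needs no labels at all. A real system would of course use far richer signals than a single shared device.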