Machine Learning Methods

Discussing three broad categories of ML algorithms: supervised learning, unsupervised learning, and hybrid models

  • Nexus Cognitive Chief Executive Officer, Anu Jain

I've talked about machine learning (ML) and how it can help you get more out of your analytics and data integration efforts. Machine learning, and the people who analyze the data it produces, can significantly improve your analytics outcomes. However, it's critical to understand which types of algorithms will help you get the insights you seek from your data---and which will give you insights you didn't even know you wanted.

I'm going to discuss three very broad categories of ML algorithms: supervised learning, unsupervised learning, and hybrid models that combine elements of the other two. Depending on your goals, each algorithm type has its place.

Supervised Learning

Supervised learning is the workhorse of ML. It involves training your machine with paired data---a series of inputs where the output is known. You feed the machine enough of these data pairs and it learns which inputs go with which outputs. For example, if you feed the machine historical stock market data, along with dates and economic indicators, you can construct a relatively accurate predictive model. Of course, it won't be 100% accurate (humans run the stock market, so there's irrationality, and therefore unpredictability) but with enough time and data, it'll get really good at predicting the Dow Jones Industrial Average over time.
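To make the idea of paired data concrete, here's a minimal sketch of supervised learning in plain Python. It fits a straight line to a handful of input/output pairs using ordinary least squares, then predicts the output for an input it hasn't seen. The numbers, the variable names, and the choice of a simple linear model are all assumptions made for illustration; a real market model would need far more data and far more inputs.

```python
# Supervised learning sketch: learn y = slope * x + intercept from known pairs.
# All data here is invented for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) that minimizes squared error on the pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training pairs: an economic indicator (input)
# and an index level (known output).
indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
index_level = [102.0, 104.1, 105.9, 108.0, 110.0]

model = fit_line(indicator, index_level)
forecast = predict(model, 6.0)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, forecast={forecast:.2f}")
```

The training step only ever sees inputs with known outputs. That is what makes it supervised.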

You can also build supervised learning models that classify things. For example, researchers can feed the machine population and epidemiological data and build a model that identifies people who are likely to get cancer, heart disease, or diabetes. You can also build predictive models for customer churn, demand forecasts, project outcomes, financial performance---the list goes on and on. The upshot is that if you can more accurately predict events or behaviors, you can devise and implement strategies to plan for, and capitalize on, them.
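Classification can be sketched just as briefly. The example below uses k-nearest neighbors, one simple classification technique among many: it labels a new customer by majority vote among the most similar labeled customers. The features, labels, and the function name knn_predict are hypothetical and chosen only to illustrate the idea.

```python
# Classification sketch: k-nearest neighbors on labeled examples.
# Features and labels are invented for illustration only.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs.
    Returns the majority label among the k examples closest to query."""
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical customers: (months as a customer, support tickets last quarter)
labeled_customers = [
    ((2, 5), "churn"),
    ((3, 4), "churn"),
    ((4, 6), "churn"),
    ((24, 0), "stay"),
    ((30, 1), "stay"),
    ((36, 0), "stay"),
]

print(knn_predict(labeled_customers, (5, 5)))
print(knn_predict(labeled_customers, (28, 1)))
```

In practice you would scale the features first so that one measurement, such as months as a customer, doesn't dominate the distance calculation.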

Unsupervised Learning

Unsupervised learning is the powerful wildcard of ML. Its power is unfortunately sometimes hindered by its unpredictability and by how difficult it is to use effectively. With unsupervised learning, the inputs are known, but there are no known outputs to pair them with. Instead of learning from labeled examples, the algorithm searches the data for structure on its own. Given enough data and time, it will show you patterns in the data that you might never discover using supervised methods.

However, because the outputs aren't known in advance, it's often difficult to know whether the results of the model are valid. Clustering, the most common technique used for unsupervised learning, involves grouping set members with common traits together. For example, you can segment customers with similar buying habits or demographics. The difficulty lies in knowing if these groups provide useful insights, how many of them should exist, or whether they're even grouped correctly. You can refine the model over time, but there's always a level of uncertainty. If you can live with that, though, unsupervised models can provide unique and very valuable insights.
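Here's a rough sketch of clustering using k-means, one widely used clustering algorithm, on a single made-up feature. Note that no labels are supplied: the algorithm only sees the values and a guess at how many groups to look for. The data, the starting centroids, and the function name kmeans_1d are assumptions for illustration.

```python
# Clustering sketch: k-means in one dimension (e.g. annual spend per customer).
# Data, starting centroids, and the number of clusters are invented.

def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning each value to its nearest centroid
    and moving each centroid to the mean of its assigned values."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

annual_spend = [120, 150, 130, 900, 950, 1000, 140, 980]
centroids, clusters = kmeans_1d(annual_spend, centroids=[100, 500])
print(centroids)
print(clusters)
```

Passing three starting centroids instead of two would produce a different segmentation of the same customers. That is exactly the "how many groups should exist" question: the algorithm won't answer it for you.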

Hybrid Algorithms---The Best of Both Worlds

Hybrid algorithms offer the best of both worlds: they combine elements of supervised and unsupervised learning to couple the relative certainty of the former with the power and novel insight generation of the latter. One of these so-called hybrid models is reinforcement learning. You might have heard of this type of algorithm if you've read about computers that have been trained to beat opponents at games like Go, chess, and Atari video games. Reinforcement learning algorithms basically map observations and measurements to a prescribed set of actions in the process of trying to achieve and optimize a reward. The computer interacts with its environment in an attempt to learn how to master it.

The outcomes aren't known in advance, but desired ones are rewarded. Reinforcement learning can be applied to all sorts of business activities, such as risk management, inventory management, logistics, and product design. The list is huge. The bottom line is that reinforcement learning can help you discover optimal outcomes that you seek, and it can reveal outcomes that you didn't seek, but that you can leverage to optimize your operations.
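The reward-driven loop can be sketched with tabular Q-learning, one common reinforcement learning method (not necessarily the one behind any particular game-playing system). In this toy setup, an agent in a five-cell corridor learns by trial and error that stepping right leads to a reward. The environment, the constants, and every name in the code are hypothetical.

```python
# Reinforcement learning sketch: tabular Q-learning in a tiny corridor.
# The environment and all constants are invented for illustration.
import random

N_STATES = 5          # cells 0..4; reaching cell 4 earns the reward
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration

def step(state, action):
    """Move within the corridor and return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def choose_action(q, state, rng):
    """Explore at random some of the time, otherwise take the best-known
    action, breaking ties at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = choose_action(q, state, rng)
            next_state, reward = step(state, action)
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Best-known action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent which move is correct in any cell. It only ever sees the reward at the end, and the value of earlier moves is worked out backward from there.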

Two Caveats

I've only scratched the surface of ML here. There are other techniques---such as anomaly detection to help detect fraud and bolster risk management efforts---that can help you improve your analytics and increase your bottom line. There are two caveats, however. One is that it's easy to introduce bias into ML algorithms, so you must constantly measure your results against your goals and ethics. The other is that it takes a huge volume of clean data to achieve valid, predictable results with ML algorithms. If you keep those two caveats in mind, including ML capabilities in your analytics ecosystem is really a no-brainer. What are you waiting for?

May 6, 2021