By Patrick Hall
Quality inspectors on assembly lines for high-tech electronic components have long looked for defects by taking sample parts from the assembly line and examining them under a microscope. The problem is that when human inspectors get bored, their attention can slip, enabling defective parts to make their way to customers.
Imagine what would happen if a manufacturer could simply use instrumentation to scan the part and then automatically evaluate whether the scan shows problems. Not only would quality assurance become much faster and more accurate and catch problems earlier, but human inspectors could also be freed to do less repetitive work.
This type of automation is coming soon, and the technology that makes it possible is a type of machine learning called deep learning.
In my job as a data scientist, I spend my time driving machine learning research and making it work in the real world. Deep learning is my favorite area of research.
Deep learning is a form of machine learning that uses neural networks with many layers. On the right problems, it can make extremely accurate predictions. Deep learning has contributed to a number of recent breakthroughs in pattern recognition for unstructured data, such as text, photos, video and sound.
Developing applications based on deep learning is a two-phase process. First I build and train the deep learning model. When people talk about analytical modeling, they tend to focus on this first, model-building task. But even more important is the second step – deploying the model into a company’s operations to create value.
Creating your model
As with any data science effort, most of the work (80 percent, by a common rule of thumb) is devoted to taking messy data and preparing it for further analysis. The challenge with cleaning data for use in all types of machine learning is that each data set is different, and understanding a certain data set typically requires considerable “tribal” knowledge. For example, in one data set “99” might be a number, while in another “99” might represent a code. To master this tribal knowledge, I might need to interview the person most knowledgeable about the data set.
Once I’ve uncovered these nuances, it’s important to encode them in some manner for future reference, such as publishing them on a wiki, so others and I don’t have to spend time reinventing data cleaning.
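Beyond a wiki page, nuances like the “99” example can also be captured in a small machine-readable data dictionary. The following is a minimal sketch of that idea, not a description of any particular tool. The column names `age` and `visits_code`, the `SENTINEL_CODES` table and the `clean_row` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical data dictionary: for each column, the codes that mean "missing".
# Column names and codes are illustrative assumptions, not from a real data set.
SENTINEL_CODES = {
    "age": set(),          # in this column, 99 is a genuine number
    "visits_code": {99},   # in this column, 99 is a code for "not recorded"
}


def clean_row(row):
    """Replace documented sentinel codes with None; leave other values alone."""
    cleaned = {}
    for column, value in row.items():
        codes = SENTINEL_CODES.get(column, set())
        cleaned[column] = None if value in codes else value
    return cleaned


raw = {"age": 99, "visits_code": 99}
print(clean_row(raw))  # {'age': 99, 'visits_code': None}
```

Keeping the rules in one table means the next person to clean the same data set can reuse them instead of rediscovering them.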
The next step is to define the aim of the analysis. Do my results need to be interpreted by humans, or will they be used by a machine? Do I need to do root cause analysis of past events or make predictions for the future?
If I want humans to interpret the results, I’ll probably use statistics for my analysis. Statistical models are often easier for humans to interpret because they were designed for making inferences about a population from a sample of data. Statistics are very useful for root cause analysis of something that occurred in the past.
Machine learning, including deep learning, is best used when accuracy is more important than interpretability. Machine learning usually creates complex models with hundreds of thousands of parameters or rules that will be implemented later by other machines. It can be used on complex, unstructured data sets, or as a more predictive alternative to statistical modeling on structured data sets. In addition to predictive analysis, machine learning is useful for recognizing patterns in structured or unstructured data, such as identifying groups of similar patients or finding cats in YouTube videos.
To create a deep learning model, I start with a random set of parameters arranged into a network architecture that is known to predict well over a given data universe. Deep learning models are often more complex than other types of statistical or machine learning models; it’s not uncommon for deep learning models to have billions of parameters and very complicated architectures.
By applying a training process systematically and iteratively to the network, I can steadily and incrementally change the parameters to better represent the data and improve on the model over time. Deep learning models often require sophisticated training techniques.
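To make the idea of iterative training concrete, here is a deliberately tiny sketch. It is not a deep network: it fits a two-parameter straight line by gradient descent, which is only the simplest relative of the techniques used for deep learning. The function name `train`, the learning rate, the step count and the toy data are all assumptions chosen for illustration.

```python
import random


def train(xs, ys, steps=2000, learning_rate=0.05, seed=0):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    rng = random.Random(seed)
    # Start from random parameters, as described above.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Nudge each parameter a small step in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an illustrative assumption).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
print(round(w, 3), round(b, 3))
```

On this toy data the loop should recover a slope close to 2 and an intercept close to 1. A real deep learning model repeats the same kind of update across vastly more parameters, usually with more sophisticated optimizers.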
At this point, I might visualize the machine learning model and its results to help other data scientists, business partners and decision makers understand the value of the work. This is an important step in convincing the organization to trust machine learning models and to use them to make decisions.
Operationalizing the model to drive value
In my work, I strive to take machine learning a step further by using it in an operational manner to create value. In other words, once I’ve developed and trained a model, I look to use it to enable a machine to make decisions automatically. For example, an operationalized model can be used to drive machines to automatically find defects in parts on the production line.
To create these automated processes, a model developed in a high-level language, such as Python, is often turned over to a software developer, who translates it into a lower-level language, such as C or Java, so it can run in a low-latency web service or on an assembly line.
Previously, this process required considerable manual translation. Today, tools are available that aid in the process by generating a portable artifact, such as score code or a scoring executable, that can be executed by an external environment such as a database or web service.
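As a rough illustration of what “score code” means, the sketch below takes the parameters of an already trained model and writes them into a standalone function with no dependencies. It does not reproduce the output of any real tool: the names `generate_score_code` and `score`, the linear form of the model and the threshold are hypothetical. Real tools typically emit code in a language such as C, Java or SQL; Python is used here only to keep the example short.

```python
def generate_score_code(weight, bias, threshold):
    """Return source for a dependency-free scoring function with parameters baked in."""
    return (
        "def score(measurement):\n"
        f"    value = {weight!r} * measurement + {bias!r}\n"
        f"    return 'defect' if value > {threshold!r} else 'ok'\n"
    )


# Hypothetical trained parameters, chosen only for illustration.
source = generate_score_code(weight=2.0, bias=1.0, threshold=4.0)
print(source)

# Stand-in for an external environment loading and running the artifact.
namespace = {}
exec(source, namespace)
print(namespace["score"](2.5))  # 2.0 * 2.5 + 1.0 = 6.0 > 4.0, so 'defect'
print(namespace["score"](1.0))  # 2.0 * 1.0 + 1.0 = 3.0, so 'ok'
```

The generated text carries everything needed to score new data, so the environment that runs it does not need the training code or the training data.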
With these tools, it’s becoming increasingly easy to go from complex modeling to real-world results.
Patrick Hall is a senior director for data science products at H2O.ai, where he focuses mainly on model interpretability and model management. Patrick is also currently an adjunct professor in the Department of Decision Sciences at George Washington University, where he teaches graduate classes in data mining and machine learning. Prior to joining H2O.ai, Patrick held global customer-facing roles and R&D roles at SAS Institute. He holds multiple patents in automated market segmentation using clustering and deep neural networks.