Loading data into PathWave Manufacturing Analytics

With the emergence of Industry 4.0, the number of devices, test software packages, and test equipment supporting it on the manufacturing floor has grown rapidly. Each often has its own data format, or its own wrinkle on an existing industry standard, and yet the data it produces must flow in large volumes into Enterprise Resource Planning (ERP) and analytics systems. Getting the right data at the right time becomes ever more crucial. Many articles discuss data storage systems (data lakes, HDFS, HBase) and data retrieval and query solutions (SQL, Impala, Hive), but you will rarely find one about how to collect device and test equipment data in the first place. A complete solution covers two closely related areas: data parsing and the data pipeline. In this post we cover data parsing at a high level.

In many large manufacturing environments, the test process alone can involve multiple types of test equipment and software, and each unit built may pass through different function-specific equipment as well. It is not difficult to imagine, then, that the data generated by each equipment type, or even each software version, can have a different format, or subtle variations in how it represents data types and values. Data may arrive as binary or as structured text. Files may cover a single unit of product tested, a grouping of all units tested within a day, and so on.
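To make that concrete, here is a minimal Python sketch of normalizing a binary record and a text record from two hypothetical sources into one common shape. The record layout, field names, and formats are all invented for illustration:

```python
import struct
from datetime import datetime, timezone

# Hypothetical fixed-width binary layout: little-endian unsigned int
# serial number, float measurement, unsigned int epoch-seconds timestamp.
BINARY_RECORD = struct.Struct("<IfI")

def parse_binary_record(raw: bytes) -> dict:
    """Unpack one fixed-width binary test record into the common schema."""
    serial, measurement, epoch = BINARY_RECORD.unpack(raw)
    return {
        "serial": serial,
        "measurement": measurement,
        "tested_at": datetime.fromtimestamp(epoch, tz=timezone.utc),
    }

def parse_text_record(line: str) -> dict:
    """Parse one comma-separated text record into the same common schema."""
    serial, measurement, timestamp = line.strip().split(",")
    return {
        "serial": int(serial),
        "measurement": float(measurement),
        "tested_at": datetime.fromisoformat(timestamp),
    }
```

Everything downstream can then work against the one common record shape, regardless of which source produced it.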

When the data being collected is not transformed into a consistent format, it is difficult to analyze or otherwise make sense of for analytics or a dashboard display. Standardizing the collected data cannot easily be done by hand, because of the variety, volume, and velocity of the data produced by Industry 4.0 devices, test equipment, and software. The magnitude of the work often outpaces what teams in a manufacturing environment can handle. And while vendors try to align new equipment and software with standards such as CFX to lighten the integration load, there is still the question of bringing in data from valuable legacy equipment and software. Tailored data parsing therefore becomes increasingly important for handling the kinds of "Big Data" that need analysis.

Building a parser means addressing challenges such as:
• Finding patterns in the source data that can be automated
• Handling dirty data
• The level of customization required for each scenario

Finding patterns in the source data is not, by itself, a terribly difficult problem. The real challenge is building a data parser that reliably finds the recurring patterns within the data and transforms them into usable outputs for the next step in the process. Recognizing a pattern is just the first step; the next challenge is handling the parts that do not match any pattern and disposing of them gracefully rather than silently dropping them. This takes experience and expertise from the team creating the tools, if the desired outputs are to be achieved as cost-effectively and efficiently as possible.
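As an illustration, here is a minimal Python sketch of this idea. The log line format and field names are assumptions, not any particular vendor's format; the point is that matched lines become structured records while unmatched lines are kept for review instead of being silently discarded:

```python
import re

# Hypothetical measurement line, e.g. "VOLT_3V3 : 3.312 V : PASS"
MEASUREMENT = re.compile(
    r"(?P<name>\w+)\s*:\s*(?P<value>[-+]?\d+(?:\.\d+)?)\s*"
    r"(?P<unit>\w+)\s*:\s*(?P<status>PASS|FAIL)"
)

def parse_log(lines):
    """Split a log into structured records and unmatched leftovers."""
    records, unmatched = [], []
    for line in lines:
        match = MEASUREMENT.search(line)
        if match:
            record = match.groupdict()
            record["value"] = float(record["value"])
            records.append(record)
        elif line.strip():  # blank lines are noise; anything else is suspect
            unmatched.append(line)
    return records, unmatched
```

What to do with the leftovers (quarantine them, alert an engineer, or extend the pattern set) is exactly the judgment call described above.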

Dirty data can break source data processing in surprising ways. For example, a record may be correct in format and fit within the specified criteria, yet be a duplicate: instead of sending it once, the source sends it multiple times, or even thousands of times a day. This is not something data parsing alone can always catch without some human intervention.
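A simple guard against exact resends is to fingerprint each record and reject repeats. A minimal sketch, where the choice of identity fields is an assumption that must be made per source:

```python
import hashlib

seen = set()  # in production, a bounded or time-windowed store

def fingerprint(record: dict) -> str:
    # Assumed identity fields; which fields define "the same record"
    # is a per-source decision.
    key = f"{record['serial']}|{record['name']}|{record['tested_at']}"
    return hashlib.sha256(key.encode()).hexdigest()

def is_duplicate(record: dict) -> bool:
    fp = fingerprint(record)
    if fp in seen:
        return True
    seen.add(fp)
    return False
```

Note that this only catches literal resends; deciding whether a repeated measurement is a transmission fault or a legitimate retest still takes human judgment.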

Some level of customization is required for each data parsing scenario. It is reasonable to expect that each implementation will have some reusable components, while other parts of the data parsing process will need to be recreated from the ground up. The effort and turnaround time involved can greatly affect the cost and schedule of your implementation.
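One common way to maximize reuse is to keep the plumbing (file handling, validation, error reporting) in a shared base class and isolate the per-format logic behind a narrow interface. A sketch, with an invented vendor format:

```python
from abc import ABC, abstractmethod

class BaseParser(ABC):
    """Reusable plumbing shared by every format-specific parser."""

    def parse_file(self, path):
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                record = self.parse_line(line)
                if record is not None:
                    yield record

    @abstractmethod
    def parse_line(self, line):
        """The format-specific part that often must be written anew."""

class VendorALogParser(BaseParser):
    """Hypothetical tab-separated format: name<TAB>value<TAB>status."""

    def parse_line(self, line):
        parts = line.strip().split("\t")
        if len(parts) != 3:
            return None  # let the caller decide how to report this
        name, value, status = parts
        try:
            return {"name": name, "value": float(value), "status": status}
        except ValueError:
            return None  # dirty value field; skip rather than crash
```

The base class is the reusable part; each new equipment type or software version then usually costs only another small subclass.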

PathWave Manufacturing Analytics (PMA) is backed by an experienced team at Keysight who can help you bring in data from your manufacturing test equipment and software for analysis and presentation, delivering actionable insights to everyone from an operator on the line to the most senior management of the business. PMA includes a scalable architecture that can grow with your business, and regular software updates let it keep pace with changing business or production requirements in your environment. PMA frees up your engineering and development teams to work on the software and tools that add value to your manufacturing lines and business, while its actionable insights save time on the production floor.

Figure 1: PathWave Manufacturing Analytics
