We live in a world where we are told that “data is the new oil” or “data is capital,” along with many other variations on the theme that data is a valuable commodity. I agree with these analogies, with one point of emphasis: like any commodity, data needs to be refined or put to use to maximize its value.
You have probably heard about the growth in the volume of data that the world is producing. One of the most compelling statistics about data growth is that 90% of the world’s data has been created in the last two years (Attunity, 2018)! Couple that with the projected growth in IoT devices (potentially 20 billion in 2020; Statista, 2019), and we will be doubling the volume of data even more quickly. We have been, and will continue to be, awash with data. While we can, and should, talk about how we might leverage this new river of data, I’d like to focus today on two things that are a bit more practical and useful.
First, open data from governments and NGOs. One consequence of our increasingly data-driven society is that governments are opening the data they collect and generate to their citizens. People pay governments through taxes and fees, and the concept that the data generated by these governments is, or should be, owned by the taxpayers is being embraced globally. Government open data portals are becoming more common and more valuable each year. These portals contain valuable information about demography, geography, economics and public safety. Data on product safety incidents, recalls, potentially hazardous products and much more can be found in these portals or on the websites of safety regulators.
As testing, inspection and certification (TIC) professionals, we have an obligation to leverage this government data to inform our safety and compliance programs. The insights from incident trends, specific incident conditions and people’s behavior can ensure that the standards we write and the services we provide for product safety and conformity address the current state of safety, not a historical view of the way things were. Using real-time data feeds and modern database technologies, these incident data can be captured and analyzed as quickly as they are released by the government agencies. One of the key technologies that may be leveraged to gain these insights is natural language processing (NLP). Using NLP, the unstructured data contained in incident narratives and consumer reports can be analyzed, categorized and visualized in ways that provide specific information on modes of failure, user behavior and environmental conditions associated with the incident. This provides rich information, not just data, that can be leveraged to improve standards, test protocols and inspection programs.
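To make the idea of categorizing incident narratives concrete, here is a deliberately simple sketch in Python. It is not a full NLP pipeline: it only matches keywords, whereas a real program would more likely rely on a trained text classifier. The failure-mode keywords and the sample narratives are invented for illustration and do not come from any regulator's data.

```python
import re
from collections import Counter

# Hypothetical failure-mode keywords (an assumption for this sketch).
# In practice these would come from domain experts or a trained model.
FAILURE_MODES = {
    "overheating": {"overheat", "overheated", "hot", "melted", "smoke"},
    "electrical": {"shock", "spark", "sparks", "short"},
    "mechanical": {"broke", "cracked", "snapped", "collapsed"},
}


def categorize(narrative: str) -> list[str]:
    """Return the failure modes whose keywords appear in the narrative."""
    tokens = set(re.findall(r"[a-z]+", narrative.lower()))
    return [mode for mode, words in FAILURE_MODES.items() if tokens & words]


def summarize(narratives: list[str]) -> Counter:
    """Count how many narratives mention each failure mode."""
    counts = Counter()
    for text in narratives:
        # Narratives that match nothing are kept visible, not dropped.
        counts.update(categorize(text) or ["uncategorized"])
    return counts


# Invented narratives, standing in for free-text incident reports.
SAMPLE_NARRATIVES = [
    "The charger overheated and the casing melted while left plugged in overnight.",
    "Chair leg snapped when the user sat down.",
    "Sparks came from the outlet adapter and there was smoke.",
    "Product arrived late.",
]

if __name__ == "__main__":
    for mode, count in summarize(SAMPLE_NARRATIVES).most_common():
        print(f"{mode}: {count}")
```

Even a crude tally like this turns free text into counts that can be tracked over time; the "uncategorized" bucket shows how much of the narrative data the keyword list fails to explain.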
The second topic of practical concern is the unused and unanalyzed data that is generated in the course of conducting TIC activities. Many TIC organizations are so focused on the outcomes and deliverables of their services – the inspection or test report and certification decision – that the supporting and ancillary data are never accessed or, worse yet, are discarded. If the organization is focused only on Pass/Fail or Acceptable results, valuable trending data, benchmarking statistics and failure mode information will be lost. These data, once again, can be useful in improving the overall performance of the components, products and systems that are evaluated through TIC services.
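A small sketch shows what a Pass/Fail record hides. Assume a laboratory kept the measured values from six successive tests of the same product line. The readings, the unit and the limit below are all invented for illustration. Every test passes, yet a simple least-squares slope shows the readings drifting toward the limit.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of the values against their index (units per test)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to estimate a trend")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    numerator = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    denominator = sum((i - x_bar) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical pass limit and invented readings (for example, in mA).
LIMIT = 0.50
readings = [0.21, 0.24, 0.26, 0.30, 0.33, 0.37]

# What a Pass/Fail-only record would retain: every test passed.
all_pass = all(r <= LIMIT for r in readings)

# What the retained measurements add: the direction and rate of drift.
drift = slope(readings)

# Rough linear extrapolation of how many more tests until the limit is reached.
tests_until_limit = (LIMIT - readings[-1]) / drift if drift > 0 else None

print(f"all tests pass: {all_pass}")
print(f"drift per test: {drift:.4f}")
if tests_until_limit is not None:
    print(f"tests until limit at this rate: {tests_until_limit:.1f}")
```

The linear extrapolation is a rough assumption, not a prediction; the point is only that the trend is invisible once the measured values are discarded.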
While most will agree that discarding or ignoring data is a bad idea, there are powerful disincentives to capturing and storing additional data. The added time and expense of asking a laboratory technician or inspector to capture the additional data may seem to be at odds with operational efficiency. This is where the engineering staff and data science team have to be able to quantify and show the value of the additional data. Technology is also at work here, driving down the incremental cost of data capture and storage every day. The IoT sensors mentioned above will make it trivial to capture volumes of data that will prove to be valuable in the long run.
The bottom line
My advice: leverage the data you already have access to, both through open data and within your own organization, to improve our industry and the safety and compliance of the products and systems we use every day.