Anatomy of a modern data stack and 4 key benefits it creates

Construction firms that know how to harness their data are increasingly at a competitive advantage in today’s complex world — and the latest research emphasizes just how much of an advantage.

In fact, faulty construction data may have caused $1.8 trillion in losses worldwide and been responsible for 14% of avoidable rework, or $88 billion, according to Autodesk and FMI.

That same report found that 75% of contractors said there’s an increased need for rapid decision-making in the field — exactly where good data is crucial. But only 55% of contractors had implemented a formal data strategy for project data, and only 12% always incorporated project data into their decision-making.

The solution to these problems is the modern data stack. But just what is it, and why should contractors care?

“The modern data stack is a scalable, low barrier to entry, group of technologies that firms can adopt to drive value from their data,” said Matt Monihan, CEO of ResponseVault, a data-engineering firm specializing in the construction industry. “That’s important because, with the modern data stack, you can surface data without every single app having a direct integration with another one.”

Monihan said the goal is true data integration, which many construction firms mistakenly believe they have achieved because of questionable claims from software-makers about integration. But, while integrations may technically be available, they don’t always provide the true data insights firms need to make smarter decisions and predict outcomes. “The granularity of the integration is key and varies between vendors,” Monihan said.

In this article, we’ll explore the anatomy of the modern data stack — and answer questions about four key benefits it creates.

Anatomy of the modern data stack

 

Point solutions

Point solutions are where your data originates. Whether it’s coming from the field, the office or the owner, your data is being collected either in a structured form, like Procore change events, or from free-form sources like spreadsheets. The data generated by these point solutions runs your business, and the solutions are built to collect that data properly.

Data ingestion and sync

Once you’ve collected your jobsite data in point solutions, the next step is to securely and reliably export that data into a storage container, often referred to as a data warehouse, data lake or even a data lakehouse. As technology evolves, the differences between those industry terms have blurred, but what’s important is that a piece of middleware is required to move the data between the point solution and its staging area in the warehouse.
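As a rough illustration of what that middleware does, here is a minimal Python sketch. It assumes a hypothetical export function, fetch_change_events, standing in for a real point-solution API, and uses an in-memory SQLite database as a stand-in for the warehouse staging area. The table name and record fields are invented for the example.

```python
import json
import sqlite3
from datetime import datetime, timezone


def fetch_change_events():
    """Hypothetical stand-in for a point-solution export (for example, an API call)."""
    return [
        {"id": 101, "project": "Tower A", "amount": 12500.0},
        {"id": 102, "project": "Tower A", "amount": 4300.0},
    ]


def sync_to_staging(conn, records):
    """Load raw records into a staging table, replacing rows that share a source id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stg_change_events ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, synced_at TEXT NOT NULL)"
    )
    synced_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT OR REPLACE INTO stg_change_events (id, payload, synced_at) "
        "VALUES (?, ?, ?)",
        [(r["id"], json.dumps(r), synced_at) for r in records],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM stg_change_events").fetchone()[0]


# SQLite stands in for the warehouse here; a real stack would use a cloud warehouse.
conn = sqlite3.connect(":memory:")
sync_to_staging(conn, fetch_change_events())
row_count = sync_to_staging(conn, fetch_change_events())  # a repeat sync adds no duplicates
```

Keying the staging table on the source record's id means a sync can be re-run after a failure without creating duplicate rows, which is one reason dedicated sync tooling is usually preferred over manual exports.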

Storage

The data we’re extracting from our point solutions needs to live somewhere, and that is where the storage layer comes in. The cost of entry to this component has fallen, both with the introduction of Amazon Redshift as a lower-cost analytical database and with the rise of accessible, open-source databases whose features enable many use cases that weren’t previously possible. So, once you’ve selected and set up your storage and data is flowing, the next step is doing something with the data: analysis.

Analysis

Now is the time to model the data across your data sources, identify fields that link disparate data sets, and clean the data into unified models. This step requires a dedicated data analyst who can communicate with the people in the field who generate the source data and reconcile any discrepancies with the stakeholders looking for reports and dashboards. Your analyst will need to choose among the many Business Intelligence (BI) tools available in the market today. No one BI tool covers everything you may need. What’s important is that you pick one and stick with it.
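To make the modeling step concrete, here is a small Python sketch that combines two sources on a shared field. The data is invented: change events from a project-management tool and budget rows from a spreadsheet, linked by a hypothetical project code that is entered inconsistently in the field.

```python
from collections import defaultdict

# Invented sample data: the same project code typed inconsistently in two sources.
change_events = [
    {"project_code": "TWR-A ", "amount": 12500.0},
    {"project_code": "twr-a", "amount": 4300.0},
    {"project_code": "GAR-1", "amount": 900.0},
]
budgets = [
    {"Project": "TWR-A", "Budget": 250000.0},
    {"Project": "GAR-1", "Budget": 40000.0},
]


def clean_code(raw):
    """Normalize a project code so both sources join on the same key."""
    return raw.strip().upper()


def build_project_model(events, budget_rows):
    """Return one unified row per project: budget, change total, change percent."""
    totals = defaultdict(float)
    for event in events:
        totals[clean_code(event["project_code"])] += event["amount"]

    model = {}
    for row in budget_rows:
        code = clean_code(row["Project"])
        change_total = totals.get(code, 0.0)
        model[code] = {
            "budget": row["Budget"],
            "change_total": change_total,
            "change_pct": round(100 * change_total / row["Budget"], 2),
        }
    return model


project_model = build_project_model(change_events, budgets)
```

In practice this logic would more likely live in SQL or a modeling tool inside the warehouse. The point of the sketch is the sequence: clean the join key, aggregate, then combine into a single model that a dashboard can read.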

Management/Monitoring/Reliability

This step is where you’ll need a layer that continually tests and lets you know that the entire stack is functioning correctly. It’s where you’ll receive alerts about outages, expired authentication tokens, broken models and more.

The goal is to have a robust and comprehensive set of tools to not only proactively monitor the stack’s health but also provide a straightforward process for investigating problems that may arise.
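A monitoring layer of this kind can be sketched in a few lines of Python. The checks below mirror the alert types mentioned above (stale syncs, expired authentication tokens, broken models). The function name, the 24-hour threshold and the field names are assumptions made for illustration, not part of any particular product.

```python
from datetime import datetime, timedelta, timezone


def check_stack_health(last_sync, token_expires, model_rows, now,
                       max_sync_age=timedelta(hours=24)):
    """Return a list of human-readable alerts; an empty list means healthy."""
    alerts = []

    # Outage or stalled pipeline: no successful sync within the allowed window.
    if now - last_sync > max_sync_age:
        alerts.append(f"stale sync: last success at {last_sync.isoformat()}")

    # Expired credentials stop ingestion silently unless checked explicitly.
    if token_expires <= now:
        alerts.append("expired authentication token")

    # Broken model: rows that lost the key needed to join across sources.
    missing = sum(1 for row in model_rows if not row.get("project_code"))
    if missing:
        alerts.append(f"broken model: {missing} row(s) missing project_code")

    return alerts


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
alerts = check_stack_health(
    last_sync=now - timedelta(hours=30),      # older than the 24-hour window
    token_expires=now + timedelta(days=5),    # still valid
    model_rows=[{"project_code": "TWR-A"}, {"project_code": ""}],
    now=now,
)
```

Passing the current time in as a parameter keeps the checks easy to test. A real deployment would run them on a schedule and route the alerts to email or chat.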

Key benefits

Now that you understand the anatomy of a modern data stack, it’s essential to understand its benefits.

1. How does the stack reduce manual data entry and inconsistent data? Manual data entry and conflicting data are collectively costing contractors billions. These twin problems are often caused by the multiple apps and tech tools companies use to gather data. Although processes are digitized, the apps and tools don’t integrate; thus, data becomes siloed in the various solutions. To get a report, someone must manually go into the different tools or apps and export the data. According to JBKnowledge, 42% of companies use four to six apps for their construction jobs, and 27% report that none of those apps integrate. As a result, data is transferred manually nearly 50% of the time. Because it integrates data across apps and systems, the modern data stack provides digital access across the enterprise and a single source of truth.

2. How do disparate data sources get connected? Although many construction IT professionals know about major database suppliers, such as Oracle, they may not be aware of the many better off-the-shelf data connectors available to integrate data. These connectors save considerable IT time and cost and ensure the data is secure and reliable, with a trusted single source of truth. With that single source of truth in place, BIM changes can be made in a 3D render using real-time jobsite data. Then, if an architect or owner wants to make a change, a cost analysis can be done in real time based on how the project is going.

3. Where does the data go? The modern data stack relies on a data warehouse that’s typically cloud-based. This cloud data warehouse can store and quickly access large quantities of data without breaking the bank. The stack ensures that all company data flows appropriately to this data warehouse so that firms can gain actionable insight into real-time data. That’s key because poor project data and miscommunication cause 52% of rework. This means $31.3 billion in rework was caused by bad project data and miscommunication in the U.S. alone in 2018, according to the FMI/Autodesk report.

4. How do we ensure that the various integrations are working properly and that data is correct? Unlike black-boxed integrations, the modern data stack provides a clear picture of data, so it’s easy to see, manage and manipulate. For example, the stack includes orchestration tooling, data management frameworks and data-quality monitoring tooling for high observability, while other tools, such as data governance tooling, surface organizational problems.

The result is the ability to create meaningful data dashboards that surface the insights that matter, such as subs’ quality of work, safety scores and schedules. “With the modern data stack, you have a window into exactly what’s happening,” Monihan said. “It’s not a black box. It’s a transparent system for diagnosing, understanding and customizing how integration works.”
