AI Audio Summary and Discussion
Want to learn about this topic in the background? Give this AI-generated audio summary a listen (created using NotebookLM). Start listening for easy summarization… stay for funny pronunciations of “Reevit”.
Data management tactics relating to quality and remediation have become essential for unlocking insights into AEC project information for the purposes of analysis and decision-making. The concept of ‘Data Classification’ has emerged as a pivotal yet often overlooked prerequisite for data analysis in the building industry. In past blog articles, we have explored how classification of building spaces and elements has led to more robust data analysis for our customers. The challenges of realizing these benefits are clear and very common. That said, there are already ways to prepare data for analysis, and the outcomes can certainly be dazzling. When it comes to scaling analysis, it is key to understand why the effort spent on adding classification data is worthwhile.
This article provides an overview of the what, why, and how of data classification for the building design and construction industry. I will explore what data classification is and some industry standards that can be used to classify industry data points – such as BIM elements. Further, I will explore why data classification is important to modern design and construction practices to support business operations and create value. Finally, I will discuss examples for how data classification methods can be implemented to improve data quality and lead to better outcomes with emerging trends – such as AI.
Data Classification Defined
Data classification is the process of tagging data, or adding information to it, in order to make it more searchable, compliant, and efficient to manage. Typically, a classification system is used which provides clear definitions for each unique instance to be classified. Classifying data creates a common language for a dataset, adding clarity to the data itself without altering the original data.
Within the building design and construction industry, professionals often have to use many sources of data that are decentralized and lack common classification schemes. This can make the task of analyzing project and industry data time-consuming and prone to error. A prime example of this is the use of building information models, which are a type of database. These models include physical and tangible components like equipment, furniture, and doors, but also concepts like the function of a space (often defined as a room name).
In this context, data classification could be defined as providing a common set of categories, labels, identifiers, and definitions for common data types within the construction disciplines. The work to establish these kinds of classifications has been undertaken and the results are well established – however, they are not widely adopted.
One common type of classification is the OmniClass Construction Classification System. This article will use OmniClass as an example of classification – although it is certainly not the only example of an industry standard (Uniformat is another example). OmniClass offers a variety of tables defining elements and concepts that exist in a construction project. A few major examples include, from their website:
- Table 11: Construction Entities by Function – define uses of buildings/facilities.
- Table 13: Spaces by Function – area of a built environment delineated by physical or abstract boundaries and characterized by function.
- Table 21: Elements (includes Designed Elements) – major components, assemblies, or equipment in a facility.
- Table 31: Phases – stages and phases of the lifecycle of building components.
- Table 41: Materials – substances used in the construction and manufacture of building components.
Businesses don’t need to use a system as extensive and rigid as OmniClass – but it is an example of how a system of standard labels can create data alignment between projects, markets, and participants. Taking OmniClass as a framework, custom classifications can be either incorporated into or inspired by these tables.
Utilize Data Classification for Efficient Analysis
Data classification’s primary purpose is to make various uses of data more efficient. Many firms design and build multiple projects per year, spanning different clients, markets, and regions. Each of those projects has unique space requirements, along with client and jurisdiction influences on naming conventions for those spaces. Because of these differences between projects, it becomes impossible to perform analysis across a firm’s portfolio of projects without a classification backbone.
Take rooms as an example. A human may understand that similarly named spaces serve the same purpose in a building, but adding classifications based on the use and function of each space allows for comparisons with other projects that employ the same classification system. Additionally, we have found it increasingly common for human error to disrupt consistent naming across and within projects.
With a data classification strategy, designers and owners can begin to make sense of their portfolio in totality. Using a common language about building components makes searching and analyzing those components possible, relieving the challenges brought by variations in naming requirements. Consistent labeling also makes it possible to build relationships and calculations, and so to analyze information about a building, or a collection of them, in a reliable and predictable way.
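As a minimal sketch of the idea above, the snippet below totals room area by classification across two projects whose room names differ. The records, field names, and codes such as `SPACE-CONFERENCE` are hypothetical placeholders for illustration; they are not OmniClass codes and not an actual export format.

```python
from collections import defaultdict

# Hypothetical room records from two projects. The room names follow
# different client conventions, but the classification codes are shared.
ROOMS = [
    {"project": "A", "name": "Conf Rm 101", "classification": "SPACE-CONFERENCE", "area": 250.0},
    {"project": "A", "name": "Open Office", "classification": "SPACE-OFFICE", "area": 1200.0},
    {"project": "B", "name": "Meeting Room - Large", "classification": "SPACE-CONFERENCE", "area": 400.0},
    {"project": "B", "name": "Workstations", "classification": "SPACE-OFFICE", "area": 900.0},
]


def area_by_classification(rooms):
    """Sum room area per classification code, ignoring the room name."""
    totals = defaultdict(float)
    for room in rooms:
        totals[room["classification"]] += room["area"]
    return dict(totals)


portfolio_totals = area_by_classification(ROOMS)
```

Grouping by room name here would produce four unrelated rows; grouping by the shared code produces two comparable totals across both projects.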
Apply Data Classification to Your Building Information Models
Determining how to apply classification requires a bit of planning: data managers must identify which facets and functions of their buildings matter most, so that the classifications they set up support their analysis needs.
1. Identify the core parts of your portfolio that will benefit from classification.
For designers, consider the unique requirements of each project. Corporate interiors and healthcare, as examples, have specific design requirements for types of rooms, amount of area per room, and common fixtures and equipment in each. Classifications that accommodate building components specific to a project typology (like lab spaces and the unique equipment within them) should take priority.
For contractors, quantities of elements that have specific procurement timelines or are high-ticket cost items may direct special needs for classification.
For owners, focus on area relationships and the makeup of spaces that have a direct business impact. Additionally, developing classifications for types of fixtures that are unique to an owner’s business operations (retail fixtures or specific desk components, for example) is also paramount.
2. Select, modify or create classifications.
As mentioned, well-established classification schemes already exist within the AEC industry. OmniClass has a variety of tables specifically aligned with many of the needs discussed here. These offer a great starting point for classification efforts, eliminating the need to reinvent the wheel.
However, there are instances where these out-of-the-box solutions may not perfectly align with every use case. For example, while OmniClass Table 13 is useful for classifying rooms in a building, it may not have the same granular specificity as an owner’s space requirements: while a classification for a conference room is provided, alteration to accommodate multiple types of conference rooms may be necessary.
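One way to picture this kind of extension is a set of owner-specific sub-types that hang off a standard base code, so detailed labels can still be rolled up to the shared category. The sketch below assumes a dotted `BASE.SUBTYPE` convention and invented codes (`CONF`, `CONF.HUDDLE`); neither comes from OmniClass or any other published table.

```python
# Hypothetical base codes standing in for entries from a standard table.
BASE_CODES = {
    "CONF": "Conference Room",
    "OFFICE": "Office",
}

# Hypothetical owner-defined sub-types, written as "<base>.<subtype>".
OWNER_SUBTYPES = {
    "CONF.HUDDLE": "Huddle Room",
    "CONF.BOARD": "Board Room",
}


def parent_code(code):
    """Return the base code that a (possibly extended) code rolls up to."""
    return code.split(".", 1)[0]


def is_valid(code):
    """A code is valid if it is a base code or a known sub-type of one."""
    if code in BASE_CODES:
        return True
    return code in OWNER_SUBTYPES and parent_code(code) in BASE_CODES
```

With this layout, an owner can report on huddle rooms specifically while cross-project comparisons still group everything under the base conference room code.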
Additionally, some owners may have their own procurement methods for fixtures. Instead of selecting OmniClass Table 21 to classify elements, an existing list of model names or internally-referenced codes for furnishings or equipment may be perfect for this use case.
Clear requirements for the types of elements to be classified within models, along with a standardized list of classifications to be used, must be provided both to the users who manipulate model data and to the data analysts who use it.
3. Label elements as needed.
Autodesk Revit, a common BIM-authoring platform within the building industry, has built-in parameters for classification for common elements, setting the stage for this type of effort. In addition to manual classification of instance parameters, there are tools provided to assist in accessing OmniClass tables within Revit.
BIM data managers can consider methods for making the classification process easier:
- Add classifications to room objects that exist within project template files (bonus if this can accommodate specific owner or market needs!). When a user places a room on a plan, classifications will already be applied.
- Add classifications to family files within a standardized content library. Just like rooms, when a user places a family in a model, classifications will already be applied.
- Use schedules to identify missing classifications within a model. Train users on the expectations and provide insight for corrective action when needed.
- Create helper scripts or database workflows to correct classifications. Dynamo or Python workflows may help users identify and correct mistakes within classifications against predefined expectations.
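The last two ideas in the list above can be sketched as a small audit over exported element data. This is plain Python operating on dictionaries, not the Revit API or a Dynamo node; the field names (`id`, `classification`) and the allowed codes are assumptions made for illustration.

```python
# Hypothetical list of approved classification codes.
ALLOWED_CODES = {"CONF", "OFFICE", "LAB"}


def audit_classifications(elements, allowed):
    """Return (element id, problem) pairs for missing or unrecognized codes."""
    issues = []
    for element in elements:
        # Treat None, empty strings, and whitespace-only values as missing.
        code = (element.get("classification") or "").strip()
        if not code:
            issues.append((element["id"], "missing"))
        elif code not in allowed:
            issues.append((element["id"], "unrecognized: " + code))
    return issues


# Hypothetical exported elements, including two that need correction.
SAMPLE_ELEMENTS = [
    {"id": 1001, "classification": "CONF"},
    {"id": 1002, "classification": ""},
    {"id": 1003, "classification": "Confrence"},
    {"id": 1004, "classification": "LAB"},
]

issues = audit_classifications(SAMPLE_ELEMENTS, ALLOWED_CODES)
```

A report like this gives users a concrete list of elements to correct, rather than a general instruction to check their classifications.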
Proving Ground has published research and analysis on this exact topic of classification and machine learning. In the April 2020 publication of Design Transactions, Nathan Miller and Dave Stasiuk describe a novel workflow that employs a Naive Bayes Classifier algorithm to create a trained, classified dataset used to predict the classification of incomplete data received as new input (see pages 68-73).
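To give a feel for the general technique, here is a minimal word-based Naive Bayes classifier with add-one smoothing that predicts a classification from a room name. It is an illustrative sketch only and is not the workflow described in the publication; the training names and the `CONF` and `OFFICE` labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(name):
    """Split a room name into lowercase words."""
    return name.lower().split()


def train(samples):
    """Count labels and per-label words from (room_name, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for name, label in samples:
        label_counts[label] += 1
        for token in tokenize(name):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def predict(model, name):
    """Return the label with the highest log-probability for a room name."""
    label_counts, word_counts, vocabulary = model
    total_samples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_samples)  # log prior
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(name):
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set of already-classified room names.
TRAINING = [
    ("Conference Room", "CONF"),
    ("Large Conference", "CONF"),
    ("Meeting Room", "CONF"),
    ("Private Office", "OFFICE"),
    ("Open Office", "OFFICE"),
    ("Office Manager", "OFFICE"),
]

model = train(TRAINING)
```

Calling `predict(model, "Conference Rm 204")` scores each label by how often its training names shared words with the new name. A real dataset would need far more examples and some review of low-confidence predictions.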
4. Achieve the data analysis outcomes desired.

Once BIM data has been classified, a common practice is to export and utilize this data in data visualization platforms (such as Power BI, Excel, and Tableau). Analysis can accommodate multiple goals, for example:
- Understand the relationships between elements within a project by comparing and visualizing room areas and elements within them by department.
- Understand relationships across multiple projects by comparing model data across aggregated exports.
- Add external data using a data relationship to inform design decisions by connecting space utilization or sales data.
- Analyze outcomes using different dimensions by filtering data by design option or by changes over multiple phases.
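The third goal in the list above, relating external data to classified model data, can be sketched without any BI tool. The snippet joins room records to utilization figures by room identifier and averages utilization per classification. All identifiers, codes, and values are hypothetical, and in practice this relationship would more likely be defined inside a tool such as Power BI.

```python
from collections import defaultdict

# Hypothetical classified rooms exported from a model.
ROOMS = [
    {"room_id": "A-101", "classification": "CONF", "area": 250.0},
    {"room_id": "A-102", "classification": "CONF", "area": 400.0},
    {"room_id": "A-201", "classification": "OFFICE", "area": 120.0},
    {"room_id": "A-202", "classification": "OFFICE", "area": 120.0},
]

# Hypothetical external data: fraction of working hours each room is occupied.
UTILIZATION = {"A-101": 0.8, "A-102": 0.4, "A-201": 0.5}


def mean_utilization_by_classification(rooms, utilization):
    """Average utilization per classification, skipping rooms without data."""
    values = defaultdict(list)
    for room in rooms:
        room_id = room["room_id"]
        if room_id in utilization:
            values[room["classification"]].append(utilization[room_id])
    return {code: sum(v) / len(v) for code, v in values.items()}


utilization_summary = mean_utilization_by_classification(ROOMS, UTILIZATION)
```

Because the join key is the room and the grouping key is the classification, the same summary can be compared across projects that use the same codes.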
Data Classification and the effectiveness of AI
One of the hottest technology topics right now is artificial intelligence and machine learning. The effectiveness of ML and AI greatly depends on the availability of well-curated data – today, much of the AEC industry’s data reflects decades’ worth of low data standardization and classification. The reasons for this are multifaceted and largely due to how the business of design and construction has been conducted. To realize uses of AI and ML that can produce useful and exacting outputs – whether automated code compliance checking or space plan generation – data classification will be an essential foundation.
How Proving Ground can help
Many of Proving Ground’s apps empower users to gain insight and analyze information from building information models through data extraction for use in common business intelligence tools (we believe in a data-driven building industry, after all). Many of our customers use these tools to analyze program data, fixture and quantity take-offs, model health performance, and even site/campus analysis from Rhino, Revit, and IFC models.
In the end, we’ve been able to assist multiple clients to achieve their data goals by exploring the merits of classification. Each client had unique goals in mind and the application of classification was vetted as providing benefit for these situations. While classification can lead to extra work for teams, understanding the purpose behind the effort, in support of new outcomes and added value, can lead to commitment.
Next steps:
Training your own algorithms article
Weekly workflow: Train an algorithm to label spaces for you
Define your data innovation strategy
