
How Do We Collect and Process Data?

Learn how we collect, structure, and maintain product information.

Updated yesterday

Acelab’s mission is to make the process of discovering, comparing, and selecting building products more transparent and efficient. At the core of this mission is our comprehensive and ever-expanding product database — a structured, searchable system designed to help architects, designers, and building professionals make better materials decisions.

Here’s everything you need to know about how our data is collected, structured, and updated.


Why do our data collection and processing methods matter to you?

Our structured and comprehensive approach to data collection helps you:

  • Compare apples to apples by normalizing technical specs.

  • Trust the data by referencing direct source documents and certifications.

  • Search smarter with metadata tags and filters.

  • Save time by auto-populating schedules with high-quality data.

  • Stay unbiased by seeing the full market landscape beyond what’s popular or sponsored.

Keep reading to learn more!


What is the Acelab product database?

The Acelab product database is a centralized collection of product information from across the building materials industry. It includes data for thousands of building products, organized in a way that makes it easy to search, compare, and evaluate products based on a wide variety of criteria, from technical specifications to aesthetic options to sustainability certifications.

Currently, the database includes products across 40 major categories and will soon grow to 100 categories. It will ultimately expand to over 400 categories and thousands of sub-categories.


What kind of data do we collect?

We collect publicly available and manufacturer-supplied product information across several key areas:

  • Identity Data: Product names, product lines, manufacturer names, and contact details.

  • Aesthetic Data: Images, descriptions, color options, and finishes.

  • Performance Data: Technical specs like design pressure, fire resistance ratings, acoustic performance, and structural capacities, plus associated certifications (e.g., ASTM, ISO).

  • Sustainability Data: Recycled content, VOC data, Environmental Product Declarations (EPDs), and certifications like Greenguard, FSC, and Declare.

  • Documentation & Details: Brochures, CAD/BIM files, Material Safety Data Sheets (MSDS), test reports, and warranties.

Coming soon: We are working on expanding our dataset to include cost, global warming potential (GWP), chemistry, and factory locations.


How geographically extensive is the database?

Our database covers primarily North America. Most of the manufacturers included sell in the United States, many in Canada, and some in Mexico. The database also includes manufacturers outside of North America with established support and distribution in North America.


How do we collect data?

We gather data through a combination of:

  1. Industry Knowledge & Research: We draw on our industry experience, user feedback, and public directories to build unbiased lists of manufacturers, seeking out a wide variety of brands, from commercial to residential and from mass-produced to bespoke niche players.

  2. Web Scraping & AI Tools: We combine human data entry, AI scraping, and algorithmic tools built in-house to collect baseline product data from manufacturer websites and other publicly available sources.

  3. Manufacturer Input and Certifications: We supplement this data with input from manufacturers and data from over 1,500 certification and testing sources.

  4. User Feedback: Input from Acelab users helps us prioritize which data to collect and flag missing products.


How is data processed and structured?

We do not just scrape. We also normalize and tag data to ensure it's consistent and comparable across products.

  • The names of materials and similar characteristics are standardized to allow for easier comparisons. The original wording is retained in the manufacturers' product descriptions.

  • Each data point (such as a dimension or certification) is normalized. For example, all dimensions are converted to inches, and all R-values to Imperial R-values (ft²·°F·h/BTU).

  • Tags are added to products to improve searches. For example, the Mindful Materials Health tag is added to products with a certification that meets the Mindful Materials Framework.

  • MasterFormat sections are assigned to products when one isn't listed.

Note: This structure allows for smarter searches, easier comparisons, and cleaner integration with your schedules and workflows.
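
To make the unit normalization above more concrete, here is a minimal Python sketch of converting dimensions to inches and R-values to Imperial units. It is an illustration only, not Acelab's actual code: the function names, the supported unit labels, and the "si"/"imperial" system labels are assumptions made for this example. The conversion factors themselves are standard.

```python
# Illustrative sketch only: names and unit labels are hypothetical,
# not taken from Acelab's internal tooling.

# Standard length conversions: how many inches one unit represents.
INCHES_PER_UNIT = {
    "in": 1.0,
    "ft": 12.0,
    "mm": 1 / 25.4,
    "cm": 1 / 2.54,
    "m": 1 / 0.0254,
}

# 1 m²·K/W (RSI) is approximately 5.678263 ft²·°F·h/BTU (Imperial R-value).
RSI_TO_IMPERIAL_R = 5.678263


def to_inches(value: float, unit: str) -> float:
    """Convert a length expressed in `unit` to inches."""
    try:
        return value * INCHES_PER_UNIT[unit.lower()]
    except KeyError:
        raise ValueError(f"unsupported length unit: {unit!r}") from None


def to_imperial_r(value: float, system: str) -> float:
    """Convert a thermal resistance to an Imperial R-value."""
    system = system.lower()
    if system == "imperial":
        return value
    if system == "si":
        return value * RSI_TO_IMPERIAL_R
    raise ValueError(f"unsupported R-value system: {system!r}")


if __name__ == "__main__":
    print(round(to_inches(1200, "mm"), 2))      # a 1200 mm panel width in inches
    print(round(to_imperial_r(3.5, "si"), 2))   # an RSI 3.5 assembly as Imperial R
```

Rejecting unknown units with an error, rather than guessing, mirrors the idea that unverifiable values should not silently enter a comparison.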


How do we ensure quality (QA/QC)?

Data accuracy is critical. We use a multi-layered quality control system:

  • Manual Review: All human-collected data is double-checked by our QA team.

  • AI Verification: AI-collected data is reviewed by humans and spot-checked algorithmically.

  • Automated Controls: The system flags incomplete or inconsistent data and out-of-range values. We correct obvious errors, but skip data that appears to be incorrect and can’t be verified. Products with missing data on the manufacturer's website may also be missing data in our database.

  • Periodic Updates: We spot-check websites for changes and flag those sites for rescraping. We update product data every 4–6 months, with the oldest data being no more than 2 years old.

  • Error Reporting: Users and manufacturers can report errors, which we investigate and use to improve future data accuracy.
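
As a rough illustration of the automated controls and data-age limit described above, the following Python sketch flags a product record that has missing fields, out-of-range values, or data older than two years. It is a sketch under stated assumptions, not Acelab's implementation: the field names, the required-field list, and the plausible ranges are all hypothetical. Only the two-year maximum age comes from the text.

```python
from datetime import date, timedelta

# Hypothetical configuration, chosen only for this example.
REQUIRED_FIELDS = ("name", "manufacturer", "category")
PLAUSIBLE_RANGES = {
    "r_value": (0.0, 100.0),    # Imperial R-value
    "width_in": (0.0, 600.0),   # inches
}
MAX_AGE = timedelta(days=730)   # "no more than 2 years old"


def flag_record(record: dict, today: date) -> list[str]:
    """Return a list of QA flags for one product record."""
    flags = []

    # Incomplete data: a required field is absent or empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            flags.append(f"missing:{field}")

    # Out-of-range values: present but outside the plausible range.
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            flags.append(f"out_of_range:{field}")

    # Stale data: last verified longer ago than the maximum age.
    last_verified = record.get("last_verified")
    if last_verified is not None and today - last_verified > MAX_AGE:
        flags.append("stale")

    return flags


if __name__ == "__main__":
    example = {
        "name": "Window A",
        "manufacturer": "",
        "category": "Windows",
        "r_value": 250.0,
        "last_verified": date(2022, 1, 1),
    }
    print(flag_record(example, today=date(2025, 1, 1)))
```

Returning flags instead of correcting values in place keeps the decision to fix, skip, or rescrape with a human reviewer.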


How is the database updated?

  • Daily Collection: We collect and review new product data daily.

  • Weekly Uploads: Products are uploaded in weekly bundles, currently by category.

  • Community Input: Products flagged by users as “missing on Acelab” are prioritized for inclusion. User feedback from onboarding also helps guide our scraping efforts. We promptly fix any mistakes that a user or manufacturer reports to us.

  • Unavailable Data: The most common error reported by users is missing data. Unfortunately, we’re limited to publicly available data, which may be incomplete.

Coming soon: We’re migrating to a new system in 2025 that allows for more flexible updates outside of category bundles.
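
For readers curious what "weekly bundles by category" could look like in practice, here is a small Python sketch that groups reviewed products by ISO week and category. This is purely illustrative: the record fields (`name`, `category`, `reviewed_on`) and the choice of ISO weeks are assumptions for the example, not a description of Acelab's upload system.

```python
from collections import defaultdict
from datetime import date


def weekly_bundles(products: list[dict]) -> dict:
    """Group product names by (ISO year, ISO week, category)."""
    bundles = defaultdict(list)
    for product in products:
        iso = product["reviewed_on"].isocalendar()
        key = (iso[0], iso[1], product["category"])  # (year, week, category)
        bundles[key].append(product["name"])
    return dict(bundles)


if __name__ == "__main__":
    reviewed = [
        {"name": "Window A", "category": "Windows", "reviewed_on": date(2025, 3, 3)},
        {"name": "Window B", "category": "Windows", "reviewed_on": date(2025, 3, 5)},
        {"name": "Door C", "category": "Doors", "reviewed_on": date(2025, 3, 5)},
        {"name": "Window D", "category": "Windows", "reviewed_on": date(2025, 3, 10)},
    ]
    for key, names in sorted(weekly_bundles(reviewed).items()):
        print(key, names)
```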


What’s coming next?

We’re constantly improving how we collect and manage data. Here’s what’s on the horizon:

  • A new research database to handle higher data volumes.

  • Expanded categorization aligned with industry standards and user feedback.

  • More API integrations with partners like Mindful Materials.

  • Better user communication tools for reporting errors and receiving updates.

  • More automation for faster updates and fewer manual errors.

