Open Source Data?!

It is very difficult to share datasets for condition-based and predictive maintenance (CBM/PdM) applications because the data involved is commercially sensitive. We have published a number of datasets that we generated together with Flow Center of Excellence. In addition, there are some good resources available on the internet if you want to learn how to apply (advanced) analytics to data for a CBM/PdM project.

These Are Our Favourite Open Datasets:

1. UWA System Health Lab

These data sets are provided by the System Health Lab of the Faculty of Engineering and Mathematical Sciences at the University of Western Australia. The faculty offers several datasets that can be used to learn how to apply classification and anomaly detection methods.

The datasets are well described, and you get access once you have signed on.
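
To get a feel for what an anomaly detection method does before downloading anything, here is a minimal sketch in Python. It is not taken from the UWA material: the readings are made up, and the technique (the modified z-score, built on the median and the median absolute deviation) is just one simple choice among many. The threshold of 3.5 is the commonly quoted rule of thumb, and the function name `find_anomalies` is our own.

```python
import statistics


def find_anomalies(readings, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds threshold."""
    median = statistics.median(readings)
    # Median absolute deviation (MAD): a measure of spread that a single
    # spike cannot inflate the way it inflates the standard deviation.
    mad = statistics.median([abs(x - median) for x in readings])
    if mad == 0:
        return []  # no spread to compare against
    return [
        i
        for i, x in enumerate(readings)
        if 0.6745 * abs(x - median) / mad > threshold
    ]


# Made-up vibration readings (arbitrary units) with one obvious spike.
readings = [10.1, 10.0, 9.9, 10.2, 10.0, 15.0, 10.1, 9.8]
print(find_anomalies(readings))  # prints [5]
```

On real sensor data you would normally apply this per sensor and over a sliding window, since a machine's normal level drifts over time.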

 

2. The Prognostics Data Repository of NASA

The Prognostics Data Repository is a collection of data sets that have been donated by various universities, agencies, or companies.

Each of the datasets is individually described.

 

3. UCI Machine Learning Repository

The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that the machine learning community uses for the empirical analysis of machine learning algorithms. The archive has been around for decades, and you can find interesting data for CBM/PdM among the engineering data sets, such as Condition monitoring of hydraulic systems.
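
Condition monitoring data of this kind is often distributed as plain text files with one row per machine cycle and one tab-separated column per sample. Assuming that layout (check the data set's own description before relying on it), the Python sketch below turns each cycle into a few summary features that a classifier could use. The sample text and the function name `cycle_features` are made up for illustration.

```python
import csv
import io
import statistics


def cycle_features(text):
    """Return one dict of summary features per non-empty row of tab-delimited text."""
    features = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        features.append(
            {
                "mean": statistics.fmean(values),
                "std": statistics.pstdev(values),
                "min": min(values),
                "max": max(values),
            }
        )
    return features


# Two made-up cycles with three samples each (assumed layout, not real data).
SAMPLE = "1.0\t2.0\t3.0\n4.0\t4.0\t4.0\n"

for i, feats in enumerate(cycle_features(SAMPLE)):
    print(i, feats)
```

For a real file you would read its contents with `open(path).read()` and pass the text to `cycle_features`.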

 

4. Google Dataset Search

Google’s Dataset Search is a powerful tool for finding publicly available datasets across various domains, including condition monitoring and predictive maintenance. Here’s how you can use it effectively:

  1. Go to the Website
    Open Dataset Search in your browser.
  2. Enter Relevant Keywords
    Use specific search terms to find datasets related to condition monitoring and predictive maintenance.
    Examples include:
    “condition monitoring dataset”
    “predictive maintenance dataset”
    “sensor data for machine health monitoring”
    “industrial equipment failure dataset”
    “vibration analysis dataset”
  3. Use Filters to Narrow Results
    After getting the search results, use filters (on the left side of the page) to refine your search by:
    Format: Choose CSV, JSON, or other formats depending on your preferred data type.
    Usage Rights: Select datasets that are free to use or open-access.
    Last Updated: Choose recently updated datasets for more relevant information.
  4. Examine Dataset Details
    Click on the dataset titles to see descriptions, metadata, sources, and access links. Check details such as:
    Data provider (e.g., universities, research institutions, companies)
    Data type (sensor data, failure records, maintenance logs)
    Licensing information (whether it’s free, open-access, or requires permission)
  5. Access and Download Data
    Follow the provided links to the dataset’s hosting site (e.g., a government repository, GitHub, Kaggle, or institutional website). Some datasets may require sign-up or permission to access.
  6. Explore Related Datasets
    Google Dataset Search often suggests related datasets at the bottom of each dataset’s page, helping you discover additional useful data.
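
If you collect several candidates, the checks from steps 3 and 4 can be written down as a small screening script. This is only a sketch under our own assumptions: the `Candidate` record, the `shortlist` function, and the three example entries are hypothetical, and you would fill in the fields by hand from each dataset's detail page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    name: str
    file_format: str
    license: str
    last_updated: date


def shortlist(candidates, formats, open_licenses, updated_since):
    """Keep candidates with a wanted format, an open license, and a recent update."""
    return [
        c
        for c in candidates
        if c.file_format.lower() in formats
        and c.license in open_licenses
        and c.last_updated >= updated_since
    ]


# Hypothetical entries, for illustration only.
candidates = [
    Candidate("pump-vibration (hypothetical)", "CSV", "CC BY 4.0", date(2023, 5, 1)),
    Candidate("turbine-logs (hypothetical)", "PDF", "CC BY 4.0", date(2024, 1, 15)),
    Candidate("bearing-faults (hypothetical)", "JSON", "unknown", date(2022, 3, 9)),
]

keep = shortlist(
    candidates,
    formats={"csv", "json"},
    open_licenses={"CC BY 4.0", "CC0 1.0"},
    updated_since=date(2020, 1, 1),
)
print([c.name for c in keep])  # prints ['pump-vibration (hypothetical)']
```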

Additional Tips:

  • Use Boolean Operators: Try “predictive maintenance” AND “sensor data” to refine results.
  • Search for Specific Industries: Use “wind turbine predictive maintenance dataset” or “automotive condition monitoring dataset” for more targeted results.
  • Check Multiple Sources: Some datasets are hosted on repositories like Kaggle, UCI Machine Learning Repository, or government portals.
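
If you run many searches, the tips above can be sketched as a small Python helper that builds the query strings for you. The base URL and its `query` parameter are an assumption based on how the Dataset Search site appears in the browser's address bar, and the function names are our own, so verify the link in your browser before relying on it.

```python
from urllib.parse import urlencode

# Assumed address of Dataset Search; check it in your browser.
BASE_URL = "https://datasetsearch.research.google.com/search"


def build_query(phrases, industry=None):
    """Quote each phrase, join the phrases with AND, and optionally prefix an industry."""
    query = " AND ".join(f'"{p}"' for p in phrases)
    if industry:
        query = f"{industry} {query}"
    return query


def search_url(query):
    """Return a URL that opens the search results for the given query."""
    return f"{BASE_URL}?{urlencode({'query': query})}"


query = build_query(["predictive maintenance", "sensor data"], industry="wind turbine")
print(query)
print(search_url(query))
```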

 

5. kokikwbt/predictive-maintenance

The GitHub repository kokikwbt/predictive-maintenance is designed to provide quick access to various datasets pertinent to predictive maintenance (PM) tasks. These datasets encompass a range of industries and equipment, offering diverse features suitable for condition monitoring and predictive analysis.

Available Datasets:

The repository includes several datasets, each with unique characteristics. The summary table for the repository has columns for Dataset Name, Timestamps, Sensors, Alarms, Remaining Useful Lifetime (RUL), and License. It shows whether each dataset contains time-series data, the number of sensors used, whether alarm signals are included, and whether RUL data is available.

Datasets marked with an asterisk (*) are particularly rich in attributes and may be prioritized for detailed analysis. The term “RUL” refers to the Remaining Useful Life and indicates whether the dataset includes information on the expected operational lifespan before failure.

The repository also offers Jupyter notebooks for each dataset, facilitating interactive data processing and visualization. These notebooks can serve as practical guides for analyzing and understanding the datasets.

By leveraging this repository, you can efficiently access and analyze a variety of datasets tailored for predictive maintenance and condition monitoring applications.
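
When a dataset records machines up to the point of failure but has no ready-made RUL column, a common way to derive one is to count the cycles left until each machine's last observation. The Python sketch below shows that idea. It is not code from the repository: the function name `add_rul` and the example records are made up, and it assumes every unit in the data actually ran to failure.

```python
from collections import defaultdict


def add_rul(records):
    """Append the Remaining Useful Life to each (unit_id, cycle) record.

    Assumes each unit ran to failure, so its last observed cycle is the
    failure point and RUL = last_cycle - cycle.
    """
    last_cycle = defaultdict(int)
    for unit, cycle in records:
        last_cycle[unit] = max(last_cycle[unit], cycle)
    return [(unit, cycle, last_cycle[unit] - cycle) for unit, cycle in records]


# Made-up run-to-failure records: unit A fails at cycle 3, unit B at cycle 2.
records = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2)]
for row in add_rul(records):
    print(row)
```

If some units were still running when the recording stopped, this labeling would understate their RUL, so such units need separate handling.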

 

Ready to Explore Our Demo?

If you’re looking for a proactive approach to asset performance and reliability, our interactive demo is a great place to start. Experience firsthand how APM Studio’s dashboards, predictive analytics, and advanced alarm management come together in a single, user-friendly platform. Click through each step to see how maintenance teams can spot issues quickly, dive deeper with root-cause analysis, and make informed decisions to prevent downtime.

Bring Your PdM Solutions Live with APM Studio

If you enjoyed reading this article and you want to bring your PdM applications live on streaming data, make sure to check out our APM Studio page!
