Turbofan engine run-to-failure simulation dataset from NASA PCoE with four sub-datasets covering different fault modes and operating conditions; the standard benchmark for Remaining Useful Life (RUL) prediction research in prognostics and health management.

| Field | Value |
| --- | --- |
| Associated Tasks | Remaining Useful Life Prediction, Prognostics, Regression |
| Data Source | Synthetic |
| Dataset Characteristics | Multivariate, Time-Series |
| Date Donated | 2008 |
| Feature Type | Real |
| Labeled | Yes |
| Missing Values | No |
| Name | CMAPSS Jet Engine Simulated Data |
| Number of Features | 26 per record: engine unit, cycle, 3 operational settings, 21 sensor measurements |
| Number of Instances | 709 training + 707 test engine trajectories across 4 sub-datasets, listed as train/test (FD001: 100/100, FD002: 260/259, FD003: 100/100, FD004: 249/248) |
| Source | NASA Prognostics Center of Excellence |
| Time Series | Yes |
The CMAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset was generated by NASA's Prognostics Center of Excellence using the C-MAPSS simulation tool. It was introduced for the PHM 2008 Data Challenge. Each engine starts with a different degree of initial wear and manufacturing variation, and degradation then progresses until failure. The four sub-datasets (FD001–FD004) differ along two axes. Operating conditions: FD001 and FD003 have one; FD002 and FD004 have six. Fault modes: FD001 and FD002 have one (High Pressure Compressor degradation); FD003 and FD004 have two (High Pressure Compressor degradation and fan degradation).
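The sub-dataset layout described above can be captured in a small lookup structure, which is handy when an experiment should run only on, say, the six-condition subsets. This is a minimal sketch: the `SubDataset` class, the `SUBSETS` mapping, and the `select` helper are hypothetical names introduced for illustration and are not part of the dataset distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubDataset:
    """Illustrative description of one sub-dataset (hypothetical helper)."""
    name: str
    operating_conditions: int  # number of distinct operating conditions
    fault_modes: int           # number of fault modes present


# Values taken from the description above.
SUBSETS = {
    s.name: s
    for s in (
        SubDataset("FD001", operating_conditions=1, fault_modes=1),
        SubDataset("FD002", operating_conditions=6, fault_modes=1),
        SubDataset("FD003", operating_conditions=1, fault_modes=2),
        SubDataset("FD004", operating_conditions=6, fault_modes=2),
    )
}


def select(operating_conditions=None, fault_modes=None):
    """Return names of sub-datasets matching the criteria (None means any)."""
    return [
        s.name
        for s in SUBSETS.values()
        if (operating_conditions is None
            or s.operating_conditions == operating_conditions)
        and (fault_modes is None or s.fault_modes == fault_modes)
    ]


print(select(operating_conditions=6))  # ['FD002', 'FD004']
```

Keeping the two axes as separate fields makes it easy to compare results by operating-condition complexity and by fault-mode complexity independently.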
Each record contains 26 columns: an engine identifier, the operational cycle counter, 3 operational settings, and 21 sensor measurements (temperatures, pressures, and fan/compressor speeds) contaminated with sensor noise. Training trajectories run from the initial condition to failure; test trajectories end at an arbitrary point before failure. The task is to predict the remaining useful life (RUL) of each test engine, that is, the number of operational cycles remaining after its last observed cycle. Ground-truth RUL values for the test engines are provided separately.
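Because training trajectories run to failure, a per-row RUL label for training data can be derived as the last observed cycle of an engine minus the current cycle. The sketch below illustrates this with the standard library only. It assumes that rows are whitespace-delimited with no header line; the column names, the `parse_records` and `add_training_rul` helpers, and the toy rows (all-zero settings and sensors) are made up for illustration and are not real data.

```python
from collections import defaultdict

# Column layout described above: unit id, cycle, 3 settings, 21 sensors.
# The names are illustrative assumptions, not part of the dataset.
COLUMNS = (
    ["unit", "cycle"]
    + [f"setting_{i}" for i in range(1, 4)]
    + [f"sensor_{i}" for i in range(1, 22)]
)


def parse_records(lines):
    """Parse whitespace-delimited rows into dicts keyed by COLUMNS."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        row = {name: float(value) for name, value in zip(COLUMNS, fields)}
        row["unit"] = int(row["unit"])
        row["cycle"] = int(row["cycle"])
        records.append(row)
    return records


def add_training_rul(records):
    """Add 'rul' = last observed cycle of the unit minus the current cycle.

    Valid for training data only, where the last cycle is the failure point.
    """
    last_cycle = defaultdict(int)
    for row in records:
        last_cycle[row["unit"]] = max(last_cycle[row["unit"]], row["cycle"])
    for row in records:
        row["rul"] = last_cycle[row["unit"]] - row["cycle"]
    return records


def toy_row(unit, cycle):
    """Build one fake 26-field row with all settings and sensors set to 0.0."""
    return " ".join([str(unit), str(cycle)] + ["0.0"] * 24)


# Two toy engines: unit 1 fails after 3 cycles, unit 2 after 2 cycles.
toy_lines = [toy_row(1, c) for c in (1, 2, 3)] + [toy_row(2, c) for c in (1, 2)]
labeled = add_training_rul(parse_records(toy_lines))
print([(r["unit"], r["cycle"], r["rul"]) for r in labeled])
# [(1, 1, 2), (1, 2, 1), (1, 3, 0), (2, 1, 1), (2, 2, 0)]
```

This labeling does not apply to test trajectories, which stop before failure; for those, the separately provided ground-truth values give the RUL at the last observed cycle.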
This dataset is the de facto standard benchmark in data-driven RUL estimation research and has been used in hundreds of publications involving LSTMs, CNNs, transformers, and classical regression methods.