FHC Twin-Plant Photovoltaic Anomaly Detection Dataset

This dataset was collected as part of a multi-month measurement campaign at FH Campus 02 in Graz, Austria, and supports benchmarking of time-series anomaly detection (TSAD) algorithms for photovoltaic (PV) monitoring applications. The dataset originates from two physically identical PV plants operating under identical environmental conditions. Each plant is equipped with one inverter (Hoymiles HM-1500) and four strings, where each string consists of a single PV module (Risen Energy Titan S RSM40-8-400MB).

Electrical measurements are recorded at a temporal resolution of 30 seconds and include string-level DC voltage and current for all four strings. Environmental variables (solar irradiance, ambient temperature, wind speed, and wind direction) are also recorded.

One plant is operated under normal conditions and provides a fault-free reference for training semi-supervised TSAD algorithms. The second plant is deliberately modified to simulate realistic PV faults. The present release covers 25 days of measurements between 17 June 2025 and 16 July 2025, focusing on two physical fault scenarios: partial shading (simulated using sheets of paper of sizes DIN A5 and A4) and induced mismatch (achieved by altering the tilt angle of selected modules).

Note: measurements are not available for June 24–26 and July 8, 2025 due to problems with our recording equipment.

In addition to the physical faults, synthetic anomalies are injected into the electrical measurements of the modified plant to simulate common sensor and data-quality issues. Injected anomaly types include abrupt spikes, signal dropouts (i.e., zero values), scaling effects, and additive noise, applied to string-level voltage and current signals.
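To make the four injected anomaly types named above concrete, here is a minimal sketch of how each could be applied to a signal. It is an illustration only, not the procedure used to build this dataset: the function names, magnitudes, segment positions, and the Gaussian noise model are all assumptions.

```python
import random

def inject_spike(signal, index, magnitude):
    """Return a copy with a single-point spike added at `index`."""
    out = list(signal)
    out[index] += magnitude
    return out

def inject_dropout(signal, start, length):
    """Return a copy with `length` consecutive samples set to zero."""
    out = list(signal)
    for i in range(start, min(start + length, len(out))):
        out[i] = 0.0
    return out

def inject_scaling(signal, start, length, factor):
    """Return a copy with a segment multiplied by `factor`."""
    out = list(signal)
    for i in range(start, min(start + length, len(out))):
        out[i] *= factor
    return out

def inject_noise(signal, start, length, sigma, seed=0):
    """Return a copy with zero-mean Gaussian noise added to a segment."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    out = list(signal)
    for i in range(start, min(start + length, len(out))):
        out[i] += rng.gauss(0.0, sigma)
    return out

# Hypothetical string-current trace in amperes, one value per 30 s sample.
current = [8.0] * 10

spiked = inject_spike(current, index=3, magnitude=4.0)
dropped = inject_dropout(current, start=5, length=3)
scaled = inject_scaling(current, start=0, length=2, factor=0.5)
noisy = inject_noise(current, start=6, length=4, sigma=0.1)
```

Each helper returns a modified copy and leaves the input untouched, so a clean reference trace stays available for comparison.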
Features:

| Column | Description | Unit |
|---|---|---|
| timestamp | Measurement timestamp (30 s intervals) | YYYY-MM-DD HH:MM:SS |
| S1_A | DC current of string 1 | A |
| S1_V | DC voltage of string 1 | V |
| S2_A | DC current of string 2 | A |
| S2_V | DC voltage of string 2 | V |
| S3_A | DC current of string 3 | A |
| S3_V | DC voltage of string 3 | V |
| S4_A | DC current of string 4 | A |
| S4_V | DC voltage of string 4 | V |
| SolRad | Solar irradiance | W/m² |
| T_o | Ambient temperature | °C |
| W_Dir | Wind direction | ° |
| W_Speed | Wind speed | km/h |
| anomaly_class | Anomaly type (see anomaly types table) | - |

Anomaly types:

The dataset contains nine anomaly classes (labels 1–9) in addition to the normal class (label 0), covering both physically induced faults and synthetic data-quality anomalies:

| Label | Type | Description |
|---|---|---|
| 0 | Normal | No anomaly |
| 1 | Partial shading (A4) | Module surface partially covered with a DIN A4 sheet |
| 2 | Partial shading (A5) | Module surface partially covered with a DIN A5 sheet |
| 3 | Induced mismatch | Tilt angle of two modules increased, reducing incident irradiance |
| 4 | Current dips | Transient reductions in string-level DC current |
| 5 | Spikes | Single-point anomalies in voltage or current signals |
| 6 | Dropouts | Zero values persisting over a period, simulating signal loss |
| 7 | Scaling | Multiplicative scaling effect applied to a signal |
| 8 | Noise | Additive noise injected into voltage or current signals |
| 9 | Stuck sensor | Sensor value remains constant over a period |

Labels 1–3 represent physically induced faults; labels 4–9 represent synthetic anomalies simulating sensor failures and data-quality issues.

Note: labels 4 (current dips) and 5 (spikes) are both transient deviations and are closely related in nature; they are not discussed as separate fault types in the accompanying paper.

File description:

test.csv contains contaminated test data obtained from the modified PV plant. This file should be used for evaluation and for fitting unsupervised algorithms.

train.csv contains fault-free data from the "normal" PV plant. It can be used to train semi-supervised algorithms.
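As a rough illustration of working with the columns and labels described above, the following sketch parses rows and maps anomaly_class to its type and group. It assumes a comma-separated file with a header row and column names as listed in the features table; the helper names `read_rows` and `label_group` are hypothetical, and the two sample rows are made up and use a reduced column set.

```python
import csv
import io
from datetime import datetime

# Label table as given in the dataset description.
ANOMALY_NAMES = {
    0: "Normal",
    1: "Partial shading (A4)",
    2: "Partial shading (A5)",
    3: "Induced mismatch",
    4: "Current dips",
    5: "Spikes",
    6: "Dropouts",
    7: "Scaling",
    8: "Noise",
    9: "Stuck sensor",
}

def label_group(label):
    """Classify a label as normal (0), physical (1-3) or synthetic (4-9)."""
    if label == 0:
        return "normal"
    return "physical" if 1 <= label <= 3 else "synthetic"

def read_rows(handle):
    """Parse rows: timestamp -> datetime, anomaly_class -> int, rest -> float."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, value in raw.items():
            if name == "timestamp":
                row[name] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
            elif name == "anomaly_class":
                row[name] = int(float(value))  # tolerate "6" as well as "6.0"
            else:
                row[name] = float(value)
        rows.append(row)
    return rows

# Made-up two-row sample standing in for test.csv (assumed layout).
SAMPLE = """timestamp,S1_A,S1_V,SolRad,anomaly_class
2025-06-17 12:00:00,9.1,34.2,812.0,0
2025-06-17 12:00:30,0.0,34.1,815.0,6
"""

rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    label = row["anomaly_class"]
    print(row["timestamp"], ANOMALY_NAMES[label], label_group(label))
```

For the real file, the same function could be called as `read_rows(open("test.csv", newline=""))`, provided the layout assumption holds.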
Dataset statistics:

- Total duration: 600 hours
- Number of labeled anomaly segments: 30
- Anomaly contamination: 5.33%
- Minimum anomaly length: 30 seconds
- Median anomaly length: 36.5 minutes
- Maximum anomaly length: 406.5 minutes

If you use this dataset, please cite our paper:

Bradl, H., Hofer-Schmitz, K., Grippa, P., & Hofer, G. (2026). Benchmarking Time-Series Anomaly Detection Algorithms for Photovoltaic Plants. Proceedings of the European Conference of the Prognostics and Health Management Society 2026.
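Statistics of the kind listed under "Dataset statistics" could be recomputed from the anomaly_class column along the following lines. This is a sketch under two assumptions that may differ from the definitions used in the paper: a segment is a maximal run of consecutive samples sharing one non-zero label, and its length is the sample count times the 30 s sampling interval. The label sequence is a toy example, not data from the release.

```python
from itertools import groupby
from statistics import median

SAMPLE_SECONDS = 30  # sampling interval stated in the dataset description

def anomaly_segments(labels):
    """Return (label, n_samples) for each maximal run of one non-zero label."""
    return [
        (label, sum(1 for _ in run))
        for label, run in groupby(labels)
        if label != 0
    ]

def summarize(labels):
    """Compute segment count, contamination and segment lengths in minutes."""
    segments = anomaly_segments(labels)
    minutes = [n * SAMPLE_SECONDS / 60 for _, n in segments]
    anomalous = sum(n for _, n in segments)
    return {
        "segments": len(segments),
        "contamination_pct": 100 * anomalous / len(labels),
        "min_minutes": min(minutes) if minutes else None,
        "median_minutes": median(minutes) if minutes else None,
        "max_minutes": max(minutes) if minutes else None,
    }

# Toy label sequence: one spike sample, a shading run, a dropout run.
labels = [0, 0, 5, 0, 0, 1, 1, 1, 1, 0, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0]
stats = summarize(labels)
print(stats)
```

Note that a single anomalous sample counts as a 30-second segment under this definition, which matches the stated minimum anomaly length.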