The quality of wind power output data directly affects both the assessment of wind farm operational status and the accuracy of power forecasting models. However, because of limited sensor precision, communication interference, and the complex harbor environment, raw data collected from port-area wind turbines often contain noise, outliers, and missing values. Without effective cleaning, the resulting power curves are distorted, which reduces the generalization capability of predictive models. To overcome the limited adaptability and robustness of traditional outlier detection methods, this study proposes a two-stage port-area wind power data cleaning approach based on a dynamic interquartile range (IQR) and improved Sigmoid function fitting. In the first stage, an adaptive binning and density-weighting mechanism dynamically expands the IQR to identify and remove local outliers within each wind speed interval. In the second stage, an improved Sigmoid model is fitted to the cleaned wind speed–power data, and residual analysis of this secondary fit is used to detect hidden anomalies and boundary-type outliers. Using measured data from the No. 1 wind turbine in the Chuanshan Port area as a case study, the experimental results show that the proposed method achieves high data retention while outperforming the conventional IQR, density-based spatial clustering of applications with noise (DBSCAN), and isolation forest algorithms in terms of the Pearson correlation coefficient (r = 0.93) and the coefficient of determination (R² = 0.89), with mean squared error and root mean squared error reduced to 446.39 kW and 545.58 kW, respectively. The findings verify the efficiency, stability, and practical feasibility of the method for port-area wind power data cleaning, providing a reliable data foundation for wind power forecasting and operational optimization in port environments.
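The two-stage idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the binning rule, the density weighting, or the form of the improved Sigmoid model. The sketch therefore assumes fixed-width wind speed bins, a hypothetical fence multiplier that grows as a bin becomes sparser, a plain logistic power curve fitted by coarse grid search, and a 3-sigma residual threshold. The data are synthetic, and the 2000 kW rated power and all function and parameter names are illustrative assumptions.

```python
import math
import random
import statistics


def iqr_clean(speeds, powers, bin_width=1.0, base_k=1.5, max_extra=1.0):
    """Stage 1 sketch: per-bin IQR filter with a density-dependent fence."""
    bins = {}
    for i, v in enumerate(speeds):
        bins.setdefault(int(v // bin_width), []).append(i)
    largest = max(len(idx) for idx in bins.values())
    keep = []
    for idx in bins.values():
        if len(idx) < 4:  # too few points for quartiles: keep them all
            keep.extend(idx)
            continue
        q1, _, q3 = statistics.quantiles([powers[i] for i in idx], n=4)
        # Assumed weighting: sparse bins get a wider fence than dense bins.
        k = base_k + max_extra * (1.0 - len(idx) / largest)
        lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        keep.extend(i for i in idx if lo <= powers[i] <= hi)
    return sorted(keep)


def sigmoid_power(v, rated, a, v0):
    """Plain logistic power curve (stand-in for the improved Sigmoid)."""
    return rated / (1.0 + math.exp(-a * (v - v0)))


def fit_sigmoid(speeds, powers, rated):
    """Coarse grid search for slope a and midpoint v0 minimising squared error."""
    best = None
    for a in [0.2 + 0.05 * i for i in range(37)]:        # 0.2 .. 2.0
        for v0 in [4.0 + 0.25 * j for j in range(33)]:   # 4.0 .. 12.0 m/s
            sse = sum((p - sigmoid_power(v, rated, a, v0)) ** 2
                      for v, p in zip(speeds, powers))
            if best is None or sse < best[0]:
                best = (sse, a, v0)
    return best[1], best[2]


def residual_clean(speeds, powers, rated, n_sigma=3.0):
    """Stage 2 sketch: drop points whose fit residual exceeds n_sigma std devs."""
    a, v0 = fit_sigmoid(speeds, powers, rated)
    res = [p - sigmoid_power(v, rated, a, v0) for v, p in zip(speeds, powers)]
    sigma = statistics.pstdev(res)
    keep = [i for i, r in enumerate(res) if abs(r) <= n_sigma * sigma]
    return keep, (a, v0)


# Synthetic demonstration data (not the Chuanshan Port measurements).
random.seed(0)
rated = 2000.0  # assumed rated power in kW
speeds = [random.uniform(2.0, 16.0) for _ in range(600)]
powers = [sigmoid_power(v, rated, 0.8, 8.0) + random.gauss(0.0, 40.0)
          for v in speeds]
outlier_idx = list(range(0, 600, 30))
for i in outlier_idx:  # inject high-wind, near-zero-output anomalies
    speeds[i] = random.uniform(11.0, 15.0)
    powers[i] = random.uniform(0.0, 200.0)

stage1 = iqr_clean(speeds, powers)
s1 = [speeds[i] for i in stage1]
p1 = [powers[i] for i in stage1]
kept2, (a_hat, v0_hat) = residual_clean(s1, p1, rated)
final = [stage1[j] for j in kept2]  # map back to original indices
print(f"retained {len(final)} of {len(speeds)} points; "
      f"fitted a={a_hat:.2f}, v0={v0_hat:.2f}")
```

Running the stages in this order means the Sigmoid fit in stage 2 is not pulled toward gross local outliers, so the residual threshold can focus on anomalies that sit close to the curve. In practice the grid search would likely be replaced by a nonlinear least-squares routine, and the threshold tuned to the turbine's data.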