Article Title [English]
In-situ observations underlie a wide range of planning, applied studies and modeling across many fields and sciences, and using these data without first verifying their accuracy and homogeneity can introduce uncertainty into the results. The major problems that researchers face are poor data quality, missing data, outliers and inhomogeneity in the time series. The accuracy and homogeneity of meteorological data are affected by inappropriate station siting, human errors in reading and recording data, errors in measuring equipment, changes in measurement tools, differing observation methods, lack of equipment maintenance and calibration, construction around the stations, changes in the type of instruments and sensors used to measure atmospheric parameters, and station relocation during the record period. Therefore, in this paper, the daily minimum and maximum temperature series and the daily rainfall series at 134 weather stations in Iran were analyzed for outliers and homogeneity over the period 1989-2018. First, Iran was divided into 5 clusters based on climatic characteristics. After clustering, the daily maximum and minimum temperature data and the daily rainfall data were statistically analyzed using SPSS software, and the percentage of missing data was determined separately for each station. Then, the Climatol package in R was used to examine outliers and inhomogeneity and to homogenize the series. Within each cluster, the series were re-clustered according to the parameter of interest, and for each station the other stations in the same cluster served as reference stations. In this algorithm, the target series is first standardized and estimated from the reference series by type II regression. The standardized anomaly series is then calculated from the differences between the observed and estimated values. Outliers were detected in two steps.
In the first step, original data whose standardized anomalies exceeded the prescribed thresholds were flagged as outliers. In the second step, to confirm that the outliers had been detected correctly, the values flagged in the temperature series were compared with the values of the preceding and following days. If they differed significantly, they were accepted as outliers and deleted. For the precipitation series, the atmospheric conditions on the dates in question were checked. To detect inhomogeneity, the standard normal homogeneity test (SNHT) was performed on the monthly series. If the SNHT statistic exceeded the prescribed threshold, the series was split at the point of maximum SNHT and all the data before the break were transferred to a new series with the same geographic coordinates. This process was repeated until all series were homogeneous. Break points confirmed by station metadata were accepted as non-climatic breaks. Finally, all missing data in the homogeneous series and sub-series were infilled with the same estimation procedure, drawing only on the other fragments of the same series as references.
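The break-detection step described above can be illustrated with a minimal Python sketch. It computes the classical single-shift SNHT statistic on one anomaly series and splits the series recursively wherever the maximum exceeds a threshold. This is not the Climatol API: the function names `snht` and `find_breaks`, the `min_len` guard and the plain recursive splitting are assumptions made for illustration, and the threshold is left as a parameter because the abstract does not state its value. Climatol's own procedure may differ in detail.

```python
import numpy as np


def snht(anomalies):
    """Single-shift SNHT on one series (illustrative helper, not Climatol).

    Returns (t_max, k), where k is the number of values before the
    most likely break and t_max is the test statistic at that point.
    """
    x = np.asarray(anomalies, dtype=float)
    n = x.size
    std = x.std(ddof=1)
    if n < 2 or std == 0.0:
        return 0.0, 0  # constant or too-short series: nothing to test
    z = (x - x.mean()) / std
    k = np.arange(1, n)              # candidate sizes of the first segment
    sum_before = np.cumsum(z)[:-1]   # sum of z[:k] for each k
    mean_before = sum_before / k
    mean_after = (z.sum() - sum_before) / (n - k)
    t = k * mean_before**2 + (n - k) * mean_after**2
    best = int(np.argmax(t))
    return float(t[best]), int(k[best])


def find_breaks(anomalies, threshold, min_len=12, offset=0):
    """Split recursively at the maximum SNHT until no segment exceeds
    the threshold (assumed splitting strategy, for illustration only).

    Returns the sorted break positions as indices into the full series.
    """
    x = np.asarray(anomalies, dtype=float)
    if x.size < 2 * min_len:
        return []  # segment too short to test reliably
    t_max, k = snht(x)
    if t_max <= threshold:
        return []
    left = find_breaks(x[:k], threshold, min_len, offset)
    right = find_breaks(x[k:], threshold, min_len, offset + k)
    return left + [offset + k] + right
```

Calling `find_breaks(series, threshold)` on a monthly anomaly series returns the positions at which the series would be split; an empty list means the series is treated as homogeneous at that threshold.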
The maximum temperature, minimum temperature and precipitation series for the 134 weather stations in Iran have, on average, 3%, 4% and 2% missing values, respectively. In these time series, 63 outliers were detected for maximum temperature, 53 of which came from the Geophysics station of the University of Tehran. For minimum temperature, 50 outliers were detected, 11 of them from the Geophysics station, and for precipitation 13 outliers were identified, 5 of them from the Geophysics station. For the daily temperature series (excluding the Geophysics station), 89 stations were homogeneous and 44 stations had one or two break points; for the precipitation series, 15 stations were identified as inhomogeneous.