Verification of MM5 forecast precipitation over Iran



Over the last several years, the application of numerical weather prediction models in Iran has become common in both research and operations, yet systematic statistical verification of model results has rarely been conducted (Sodoudi et al. 2009). This paper compares MM5 24-hour precipitation forecasts with the corresponding observations, using standard scores for categorical forecasts derived from 2×2 contingency tables at different precipitation thresholds over nine sub-regions of Iran. The comparison covers +24h, +48h and +72h forecasts for the four winter months from December 2004 to March 2005. Model performance was assessed at all available synoptic and climatological stations across the country at three different precipitation thresholds. The 0.1 mm/24h threshold defines the rain/no-rain event. The other two intervals are 0.1-10 mm/24h for light precipitation and greater than 10 mm/24h for heavy precipitation. The nine sub-regions were defined on the basis of the long-term mean precipitation in different parts of Iran, and verification was conducted separately for the whole country and for each sub-region.
In this study, precipitation amount is treated as a dichotomous variable for the verification scores by applying different precipitation thresholds. The standard approach is to record the frequencies with which precipitation was observed and forecast in a two-by-two table, and then to quantify forecast quality with summary measures of that table. The structure of a typical contingency table is presented in Table 1.

Table 1. Rain contingency table applied at each verification observation site over the verification period. A threshold value (e.g., 0.1 mm/day) is chosen to separate rain from no-rain events. Here, a is the number of correct rain forecasts (hits), b is the number of false alarms, c is the number of misses, and d is the number of correct forecasts of rain amount below the specified threshold. From McBride and Ebert (2000).

Predicted \ Observed    Rain    No rain
Rain                    a       b
No rain                 c       d
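
As an illustration of how the counts in Table 1 could be tallied, the following minimal sketch classifies paired 24-hour forecast and observed totals at one station against a threshold. The function name `contingency_counts`, the sample values, and the choice to count an amount equal to the threshold as rain (>=) are assumptions made for this example, not details taken from the study.

```python
from typing import Sequence, Tuple


def contingency_counts(
    forecast_mm: Sequence[float],
    observed_mm: Sequence[float],
    threshold_mm: float = 0.1,
) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) as defined in Table 1 for one threshold.

    a = hits, b = false alarms, c = misses, d = correct no-rain forecasts.
    Assumption: an amount equal to the threshold counts as an event (>=).
    """
    if len(forecast_mm) != len(observed_mm):
        raise ValueError("forecast and observation series must have equal length")

    a = b = c = d = 0
    for fcst, obs in zip(forecast_mm, observed_mm):
        fcst_yes = fcst >= threshold_mm
        obs_yes = obs >= threshold_mm
        if fcst_yes and obs_yes:
            a += 1  # event forecast and observed
        elif fcst_yes:
            b += 1  # event forecast but not observed
        elif obs_yes:
            c += 1  # event observed but not forecast
        else:
            d += 1  # event neither forecast nor observed
    return a, b, c, d


# Hypothetical 24-hour totals (mm) for five days at one station.
forecast = [0.0, 2.5, 12.0, 0.0, 0.3]
observed = [0.0, 0.0, 15.0, 1.2, 0.4]

rain_counts = contingency_counts(forecast, observed, threshold_mm=0.1)
heavy_counts = contingency_counts(forecast, observed, threshold_mm=10.0)
```

Running the same function with a different `threshold_mm` yields a separate table for each precipitation threshold, which is how one series of forecasts can be scored for both rain/no-rain and heavy precipitation events.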

The verification scores used in this study are as follows:
Threat score (TS), or critical success index (CSI). In terms of Table 1, the threat score is computed as

TS = a / (a + b + c)

The worst possible threat score is zero, and the best possible threat score is one.
The bias (B) is the ratio of the number of forecasts of the event to the number of observations of the event:

B = (a + b) / (a + c)

Unbiased forecasts exhibit B = 1. A bias greater than one indicates that the event was forecast more often than it was observed, which is called overforecasting. Conversely, a bias less than one indicates that the event was forecast less often than it was observed, or was underforecast.
Hit rate (H). Regarding only the occurrence of the event as “the” event of interest, the hit rate is the ratio of correct forecasts to the number of times this event occurred:

H = a / (a + c)
The false alarm rate (F), which is the ratio of false alarms to the total number of non-occurrences of the event, is

F = b / (b + d)
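
The four scores can be evaluated directly from the counts a, b, c and d of Table 1 using their standard definitions: TS = a/(a+b+c), B = (a+b)/(a+c), hit rate = a/(a+c), and false alarm rate = b/(b+d). The sketch below is only illustrative: the function name `verification_scores`, the dictionary keys, the sample counts, and the choice to return NaN when a denominator is zero are assumptions for this example rather than part of the study.

```python
import math
from typing import Dict


def verification_scores(a: int, b: int, c: int, d: int) -> Dict[str, float]:
    """Categorical scores from the 2x2 contingency table counts of Table 1.

    a = hits, b = false alarms, c = misses, d = correct no-rain forecasts.
    Assumption: a score with a zero denominator is reported as NaN.
    """

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else math.nan

    return {
        "threat_score": ratio(a, a + b + c),   # TS (CSI), from 0 (worst) to 1 (best)
        "bias": ratio(a + b, a + c),           # B > 1 means overforecasting
        "hit_rate": ratio(a, a + c),           # fraction of observed events forecast
        "false_alarm_rate": ratio(b, b + d),   # fraction of non-events forecast as events
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
scores = verification_scores(a=30, b=15, c=10, d=45)
# threat_score = 30/55, bias = 45/40, hit_rate = 30/40, false_alarm_rate = 15/60
```

With these made-up counts the bias is 1.125, which would indicate mild overforecasting: the event is forecast 45 times but observed only 40 times.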
The calculated scores show that the model's rain/no-rain forecasts over Iran were correct 80% of the time during the four-month period. In general, however, the results show a tendency toward overforecasting (B = 1.5). The fairly good performance of the model is mainly due to the fact that, during the four-month period considered here, precipitation over the country was mainly associated with large-scale mid-latitude synoptic systems, in which large-scale advective processes are primarily responsible for producing the precipitation. Professional forecasters in Iran, however, commonly observe that the models are unable to predict small-scale convective precipitation successfully. The results for different precipitation thresholds show that model performance varies with threshold: the results for the below 0.1 mm/day and above 10 mm/day thresholds are more accurate than those for the other two thresholds. The results for different regions show that the model performs better for lower precipitation thresholds over the fairly dry regions in the south, and for high precipitation thresholds over the wetter regions in the north of the country. Heavy precipitation in the south-eastern regions is mainly convective, so, as noted above, lower model performance is expected there. Some of the deficiencies in the results arise because the high-resolution model output is compared against a low-density network of synoptic stations located mainly in lowlands. A more extensive verification of the model results therefore requires a denser observational network.