After calculating model statistics by comparing Solargis with good-quality ground measurements at more than 200 sites across all types of climate, the following has been observed:
An analysis of the distribution of the bias across different geographies and situations leads us to the following conclusions (summary in the table below):
Based on the validation of Solargis data, a location-specific uncertainty estimate can be derived on a case-by-case basis by looking at the model performance after analysing the local climatic and geographic features.
A summary of the accuracy of solar radiation data from the Solargis database is given in the table below. A detailed description and public validation statistics are available in this PDF document (1.8 MB).
Other publications can be found here. In particular, more information about sources of uncertainty and validation can be found in sections 2.8 and 2.9 of this scientific book chapter.

| | GHI | DNI | Description |
|---|---|---|---|
| Number of validation sites | 208 | 143 | |
| Number of public sites | 163 | 102 | |
| Mean Bias for all sites | 0% | 1.7% | Tendency to overestimate or to underestimate the measured values, on average |
| Standard deviation | ±2.9% | ±5.8% | Indicator of the range of deviation of the model estimates for the validation sites |
| Expected range of bias outside validation sites (P90 uncertainty) | ±4% to ±8% | ±8% to ±12% | Depends on specific analysis on geography and availability of ground measurements |

The performance of satellite-based models for a given site is characterized by the following indicators, which are calculated for each site for which comparisons with good-quality ground measurements are available:
Typically, bias is considered the first indicator of model accuracy. However, model accuracy should be interpreted by analysing all measures. While knowing the bias helps to understand a possible error of the long-term estimate, MAD and RMSD are important for estimating the accuracy of energy simulation and operational calculations (monitoring, forecasting). Validation statistics are usually normalized and expressed as a percentage (e.g. rMBD is used for relative Mean Bias Deviation).
Other indicators can be calculated as well, such as the Kolmogorov-Smirnov Index (KSI), which characterizes how representative the distribution of values is. It may indicate issues in the model's ability to represent various solar radiation conditions. KSI is important for accurate CSP modelling, as the response of these systems to irradiance levels is non-linear. Even if the bias of different satellite-based models is similar, other accuracy characteristics (RMSD, MAD and KSI) may indicate substantial differences in their performance.
Validation statistics for a single site do not provide a representative picture of the model performance in the given geographical conditions, because such a site may be affected by a local microclimate or by hidden issues in the ground-measured data.
Therefore, the ability of the model to characterize long-term annual GHI and DNI values should be evaluated at a sufficient number of validation sites. Good satellite models are consistent in space and time, and thus validation at several sites within one geography provides a robust indication of the model accuracy in geographically comparable regions elsewhere.
As of today, the Solargis model has been validated at more than 200 sites worldwide. Although the number of reference stations is increasing with time, the availability of high-quality ground measurements for comparison is limited for some regions. In this case, if a number of validation sites within a specific geography show bias and RMSD consistently within a certain range of values, one can assume that the model will behave consistently also in regions with similar geography where validation sites are not available.
The accuracy of the model can be calculated provided that the absolute majority of the validation data have been collected using high-accuracy instruments, applying the best measurement practices and strict quality control procedures.
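As an illustration, the indicators named above could be computed for one site along the following lines. This is only a sketch under stated assumptions: MAD is read as Mean Absolute Deviation and RMSD as Root Mean Square Deviation, the relative values are normalized by the mean of the measured values, and the function name and sample numbers are hypothetical rather than part of Solargis.

```python
from math import sqrt


def validation_stats(measured, modelled):
    """Return absolute and relative (percent) deviation statistics.

    Assumption: relative statistics are normalized by the mean of the
    measured values; other normalizations are possible.
    """
    if len(measured) != len(modelled) or not measured:
        raise ValueError("need two non-empty series of equal length")
    n = len(measured)
    # Deviation of the model from the measurement for each paired value
    diffs = [mod - meas for meas, mod in zip(measured, modelled)]
    mean_measured = sum(measured) / n
    mbd = sum(diffs) / n                         # mean bias deviation
    mad = sum(abs(d) for d in diffs) / n         # mean absolute deviation
    rmsd = sqrt(sum(d * d for d in diffs) / n)   # root mean square deviation
    return {
        "MBD": mbd,
        "MAD": mad,
        "RMSD": rmsd,
        "rMBD": 100.0 * mbd / mean_measured,
        "rMAD": 100.0 * mad / mean_measured,
        "rRMSD": 100.0 * rmsd / mean_measured,
    }


# Hypothetical hourly GHI values in W/m2, for illustration only
stats = validation_stats([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(stats)
```

In this example the positive and negative deviations partly cancel in the bias but not in MAD or RMSD, which is why a small bias alone does not guarantee accurate short-term values.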
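A rough sketch of a KSI calculation is given below. It assumes the commonly used definition of KSI as the integral of the absolute difference between the cumulative distribution functions of the measured and modelled values. The function names, the midpoint integration and the number of steps are illustrative choices, not the Solargis implementation.

```python
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Empirical CDF: fraction of samples less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ksi(measured, modelled, steps=100):
    """Integrate |CDF_measured - CDF_modelled| over the joint value range.

    The result has the same unit as the input values (e.g. W/m2).
    """
    if not measured or not modelled:
        raise ValueError("both series must be non-empty")
    m = sorted(measured)
    s = sorted(modelled)
    lo = min(m[0], s[0])
    hi = max(m[-1], s[-1])
    if hi == lo:
        return 0.0
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width  # midpoint of the i-th integration bin
        total += abs(ecdf(m, x) - ecdf(s, x)) * width
    return total


# Hypothetical values: identical distributions give a KSI of zero
print(ksi([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

Two models with the same bias can still produce different KSI values if one of them misrepresents the frequency of low or high irradiance conditions.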
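The consistency check described above can be sketched as follows: take the relative bias found at each validation site in one geography, summarize it by its mean and standard deviation, and test whether all sites stay within a chosen range. The helper names, the sample values and the limit are hypothetical and serve only to illustrate the idea.

```python
from statistics import mean, stdev


def summarize_site_bias(bias_percent):
    """Mean and sample standard deviation of per-site relative bias (%)."""
    if len(bias_percent) < 2:
        raise ValueError("at least two validation sites are required")
    return mean(bias_percent), stdev(bias_percent)


def consistently_within(bias_percent, limit_percent):
    """True if the absolute bias at every site is within the given limit."""
    return all(abs(b) <= limit_percent for b in bias_percent)


# Hypothetical relative GHI bias (%) at five sites in one region
region_bias = [1.2, -0.8, 2.1, -1.5, 0.4]
mean_bias, spread = summarize_site_bias(region_bias)
print(f"mean bias {mean_bias:.2f}%, standard deviation {spread:.2f}%")
print("all sites within ±3%:", consistently_within(region_bias, 3.0))
```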
If we want to characterize the bias in general for sites outside the validation locations, we can make the simplifying assumption that the deviations between the model estimates and the measured values follow a normal distribution. When describing the normal distribution curve the following facts can be observed:
As with any other measuring approach, users cannot expect zero uncertainty from satellite-based solar models. However, if the physics represented by the algorithms is correctly implemented, one can expect robust and uniform behavior of the model for the geographical conditions for which it has been calibrated and validated.
Even though the distribution of validation sites is irregular, a stable and predictable performance of Solargis is observed across various climate regions. A complete list of the publicly available validation sites and statistics can be found in the annex of this PDF document (1.8 MB).
For practical use, the statistical measures of accuracy have to be converted into uncertainty, which better characterizes the probabilistic nature of a possible error in the model estimate. One way of evaluating the uncertainty is to apply confidence intervals. Assuming a normal distribution, one standard deviation on either side of the mean corresponds to a 68% probability of occurrence. From the standard deviation, other confidence intervals can be constructed:

| | Probability of occurrence | Formula |
|---|---|---|
| One standard deviation | 68.3% | ± STDEV |
| Two standard deviations | 95.5% | ± 2*STDEV |
| Three standard deviations | 99.7% | ± 3*STDEV |
| P75 uncertainty | 50% | ± 0.675*STDEV |
| P90 uncertainty | 80% | ± 1.282*STDEV |
| P95 uncertainty | 90% | ± 1.645*STDEV |
| P97.5 uncertainty | 95% | ± 1.960*STDEV |
| P99 uncertainty | 98% | ± 2.326*STDEV |

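The multipliers in the table follow directly from the standard normal distribution. The sketch below reproduces them with Python's standard library. The function names are illustrative, and the 2.9% standard deviation is taken from the GHI summary above purely as an example input.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def interval_multiplier(probability_of_occurrence):
    """Multiplier k so that ± k*STDEV covers the given two-sided probability."""
    return STANDARD_NORMAL.inv_cdf(0.5 + probability_of_occurrence / 2.0)


def coverage(k):
    """Probability of falling within ± k standard deviations of the mean."""
    return 2.0 * STANDARD_NORMAL.cdf(k) - 1.0


for label, probability in [("P75", 0.50), ("P90", 0.80), ("P95", 0.90),
                           ("P97.5", 0.95), ("P99", 0.98)]:
    print(f"{label} uncertainty: ± {interval_multiplier(probability):.3f}*STDEV")

for k in (1, 2, 3):
    print(f"± {k}*STDEV covers {100.0 * coverage(k):.1f}%")

# Example: P90 uncertainty for a standard deviation of 2.9%
print(f"± {interval_multiplier(0.80) * 2.9:.1f}%")
```

The P90 uncertainty uses the 80% two-sided interval because 10% of the cases lie below its lower bound and 10% above its upper bound.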
From the confidence intervals we can calculate different probability scenarios. The P50 value is the most likely value (the center of the probability density curve), from which various levels of confidence can be expressed. For instance, in solar resource assessment the P90 value has become a standard: it represents a value that would be exceeded in 90% of cases.

| | Probability of exceedance | Probability of non-exceedance | Formula |
|---|---|---|---|
| P50 value | 50% | 50% | Mean |
| P75 value | 75% | 25% | Mean - 0.675*STDEV |
| P90 value | 90% | 10% | Mean - 1.282*STDEV |
| P95 value | 95% | 5% | Mean - 1.645*STDEV |
| P97.5 value | 97.5% | 2.5% | Mean - 1.960*STDEV |
| P99 value | 99% | 1% | Mean - 2.326*STDEV |
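
As a worked illustration of the table, the sketch below derives exceedance values from a P50 estimate and a standard deviation, again assuming a normal distribution. The annual GHI of 2000 kWh/m2 and the 3% standard deviation are hypothetical inputs, and the function name is an illustrative choice.

```python
from statistics import NormalDist


def exceedance_value(p50, stdev, probability_of_exceedance):
    """Value that is exceeded with the given probability.

    Assumes normally distributed deviations around the P50 (mean) value,
    so the result equals the quantile at 1 - probability_of_exceedance.
    """
    distribution = NormalDist(mu=p50, sigma=stdev)
    return distribution.inv_cdf(1.0 - probability_of_exceedance)


# Hypothetical long-term annual GHI and its standard deviation (3% of P50)
p50_ghi = 2000.0              # kWh/m2
stdev_ghi = 0.03 * p50_ghi    # kWh/m2

for label, probability in [("P75", 0.75), ("P90", 0.90), ("P95", 0.95),
                           ("P97.5", 0.975), ("P99", 0.99)]:
    value = exceedance_value(p50_ghi, stdev_ghi, probability)
    print(f"{label} value: {value:.0f} kWh/m2")
```

With these inputs the P90 value is about 1923 kWh/m2, which matches Mean - 1.282*STDEV = 2000 - 1.282*60.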