To assess the photovoltaic (PV) energy yield potential of a site, we run models using the best available data and methods. The result of the modelling is the P50 estimate, or in other words, the “best estimate”. P50 is essentially a statistical level of confidence: we expect the predicted solar resource/energy yield to be exceeded with 50% probability. This also means that, with the same probability, the expected value may not be achieved.

A P50 level of confidence may represent too high a risk for some investors. Therefore, other probabilities of exceedance are also considered, such as P90 (estimate exceeded with 90% probability) or P75 (estimate exceeded with 75% probability). Lenders and investors typically use P90 estimates to be confident that sufficient energy is generated to safely repay the project debt.

In solar energy, the distribution of uncertainty does not perfectly follow a normal distribution. Yet for the sake of simplified calculations, and also because statistically representative data is not always available, the concept of a normal (Gaussian) distribution of uncertainty is used (bell-shaped curve, see Figure 1). The P50 value is the center (mean) of the distribution, and it represents the estimate that occurs with the highest probability.

The P90 value is a lower value, and it is expected to be exceeded in 90% of the cases (Figure 2). The P75 value is a value higher than P90 (and lower than P50), and it is expected to be exceeded in 75% of the cases. Similarly, any Pxx exceedance level can be defined (Figures 2 and 3).

Figure 2: P90 value represented in a normal distribution

Figure 3: P50, P75, P90 and P99 value represented in a normal distribution

**P50 is the most probable value, also called best estimate, and it can be exceeded with 50% probability. P90 is to be exceeded with 90% probability, and it is considered as a conservative estimate.**

All Pxx values are constructed from (i) the best estimate or P50 (the value calculated by the models or measured by a solar sensor) and (ii) the total uncertainty associated with this estimate. There is no such thing as a “P50 uncertainty”: P50 is the best estimate and there is a level of uncertainty associated with it, which in turn can be used to calculate exceedance values at different confidence levels, all of them based on the same probability distribution of values.
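As a minimal sketch of this construction, the snippet below derives several Pxx values from a P50 estimate and a standard deviation, assuming a normal distribution. The function name `pxx_value` and the input numbers are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def pxx_value(p50: float, stdev_pct: float, exceedance_pct: float) -> float:
    """Return the value exceeded with probability `exceedance_pct` (in %).

    Assumes a normal distribution centred on `p50`, with a standard
    deviation given as a percentage of `p50`.
    """
    sigma = p50 * stdev_pct / 100.0
    # A value exceeded with xx% probability is the (100 - xx)% quantile.
    quantile = 1.0 - exceedance_pct / 100.0
    return NormalDist(mu=p50, sigma=sigma).inv_cdf(quantile)


# Hypothetical inputs: annual yield in kWh/kWp and a 5% standard deviation.
p50 = 1700.0
stdev_pct = 5.0
for xx in (50, 75, 90, 99):
    print(f"P{xx}: {pxx_value(p50, stdev_pct, xx):.1f} kWh/kWp")
```

Every exceedance level is read from the same distribution, so changing the uncertainty shifts all Pxx values except P50.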

The calculation of Pxx scenarios from the P50 estimate takes into account the total uncertainty, which summarizes all factors involved in the PV energy yield modelling. For a valid characterization of long-term climate patterns, solar resource and meteorological data representing at least 10 years are required.

In the following text we will consider evaluation of uncertainty of annual (yearly) values. The following sources of uncertainty are to be considered in evaluating a total uncertainty:

- **Uncertainty of models.** The standard data deliveries include information about the model uncertainty referring to yearly **GHI estimates**. The general uncertainty information is provided in PDF data reports, and on request it can be specified more accurately for the region of interest. The model uncertainty already includes the uncertainties related to the measurements used for the model validation. In PV energy calculation, GTI values are used, and the model **converting GHI to GTI** also contributes to the total uncertainty.
- **Interannual variability.** Weather changes from year to year and in longer-term cycles, and it also has a stochastic nature. Therefore, solar radiation, air temperature and PV energy yield in each year can deviate from the long-term average to some extent; this is called interannual variability. It can be calculated from the historical time series as the standard deviation of the series of annual values. If the interannual variability for a period of *N* years is considered, the STDEV is divided by the square root of *N* (typically one year, 10 years, or the total expected lifetime of the solar energy asset). For a single year this uncertainty is highest, and it decreases with the number of years. In P90 energy calculation, the variability that can be expected in any single year is typically assumed. On request, calculation of variability over a longer period (10, 20 or 25 years) is also provided. Optimally, interannual variability of **PV power production** is calculated from the full historical time series. If TMY data is used, this is not possible, and a less accurate assumption of **GHI** variability is applied instead.
- **Uncertainty of energy simulation model.** This considers the imperfections of PV energy simulation models, which provide values of expected energy yield. Various uncertainty factors affecting PV energy production (e.g. soiling losses, availability, etc.) should be included as well, and these are often the major sources of uncertainty in simulation models.
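The interannual variability calculation described above can be sketched as follows. The annual GHI values are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the text does not specify which is meant.

```python
from math import sqrt
from statistics import mean, stdev


def interannual_variability_pct(annual_values, n_years=1):
    """Relative standard deviation of annual values (%), for an N-year period.

    The single-year variability is divided by sqrt(N) when a period of
    N years is considered.
    """
    rel_stdev_pct = stdev(annual_values) / mean(annual_values) * 100.0
    return rel_stdev_pct / sqrt(n_years)


# Hypothetical annual GHI sums in kWh/m2, one value per year.
annual_ghi = [1850, 1790, 1905, 1820, 1880, 1760, 1840, 1895, 1810, 1865]

for n in (1, 10, 25):
    value = interannual_variability_pct(annual_ghi, n)
    print(f"{n:>2} year(s): {value:.2f}% (at STDEV level)")
```

The result is expressed at the standard deviation level; it still has to be converted to the exceedance level used for the other uncertainty factors before they are combined.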

**The final P90 (Pxx) is obtained by combining P50 with all factors of uncertainty expressed for the same exceedance level**

It is quite common to see the uncertainty expressed in terms of standard deviation (STDEV), which corresponds to a confidence interval covering approximately 68.27% of occurrences (about 84% probability of exceedance). Under the simplified assumption of a normal distribution, the uncertainty at P90 can be calculated simply by multiplying the standard deviation by 1.282, which gives a somewhat larger number derived from the same cumulative probability curve (Figure 4).

Figure 4: Uncertainty intervals, expressed at standard deviation and 80% confidence levels (P90 exceedance)

To express the uncertainty at the desired exceedance level (P90, P75 or similar), STDEV can be converted into the corresponding Pxx value using the Gaussian distribution formulas (Table 1).

Table 1: Calculation of different Pxx from a normal distribution of probability.
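The conversion factors for any exceedance level follow from the inverse cumulative distribution of the standard normal distribution. The sketch below computes them with Python's standard library; the helper names `pxx_factor` and `central_interval_pct` are illustrative and not part of any Solargis tool.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def pxx_factor(exceedance_pct: float) -> float:
    """Multiplier that converts one STDEV into the uncertainty at Pxx."""
    return -STANDARD_NORMAL.inv_cdf(1.0 - exceedance_pct / 100.0)


def central_interval_pct(exceedance_pct: float) -> float:
    """Two-sided confidence interval (%) matching a one-sided exceedance level."""
    return 2.0 * exceedance_pct - 100.0


# Exceedance probability of the value one STDEV below the mean (about 84%).
one_sigma_exceedance = STANDARD_NORMAL.cdf(1.0) * 100.0

for xx in (75.0, one_sigma_exceedance, 90.0, 95.0, 99.0):
    print(f"P{xx:.2f}: factor {pxx_factor(xx):.3f}, "
          f"central interval {central_interval_pct(xx):.2f}%")
```

The row for one standard deviation gives a factor of 1 and a central interval of about 68.27%, while P90 gives a factor of about 1.282 and an 80% interval.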

Obtaining the Pxx value from P50 estimate is quite straightforward if the uncertainty has been correctly calculated, as shown in Table 2.

Table 2: Calculation of different Pxx exceedance values for a normal distribution of probability.
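In its usual form, and assuming the total uncertainty $U_{xx}$ is given in percent of P50 at the exceedance level xx, the conversion can be written as below. The symbol names and the example numbers are assumptions made for illustration.

$$P_{xx} = P_{50} \cdot \left(1 - \frac{U_{xx}}{100}\right)$$

For example, with a hypothetical P50 of 1700 kWh/kWp and a total uncertainty of 6.1% at P90:

$$P_{90} = 1700 \cdot \left(1 - \frac{6.1}{100}\right) = 1700 \cdot 0.939 = 1596.3 \ \text{kWh/kWp}$$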

As mentioned before, uncertainty is composed of several factors, so when combining them we must make sure that they are all expressed at the same exceedance level. In Solargis, the standard uncertainty estimates are provided at the P90 level of exceedance.

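The uncertainty sources are independent of each other, and all the contributing factors are combined into a total uncertainty $U_{total}$ as a quadratic sum (Equation 1):

$$U_{total} = \sqrt{U_1^2 + U_2^2 + \dots + U_n^2}$$

where $U_1, \dots, U_n$ are the individual uncertainty factors, all expressed at the same exceedance level.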
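The quadratic sum described above can be sketched in a few lines. The two input values are the model uncertainty (±3.5%) and the PV simulation uncertainty (±5%) quoted in the notes at the end of this article, both at P90; combining only these two is purely illustrative, since a full calculation includes the other factors as well.

```python
from math import sqrt


def total_uncertainty(*components_pct: float) -> float:
    """Combine independent uncertainty factors (%) as a quadratic sum."""
    return sqrt(sum(u ** 2 for u in components_pct))


# Values at P90 taken from the notes of this article.
model_uncertainty = 3.5
pv_simulation_uncertainty = 5.0

combined = total_uncertainty(model_uncertainty, pv_simulation_uncertainty)
print(f"Combined uncertainty at P90: {combined:.2f}%")
```

Because the terms are squared, the largest factor dominates the total, and the combined value is always smaller than the plain arithmetic sum.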

For calculating TMY P90, we take as a reference the P90 values of solar resource (GHI and DNI; the weighting depends on the type of TMY and geographical location). The yearly P90 value is calculated as shown in Table 2. The P90 uncertainty for solar parameters represents the total uncertainty; it is calculated as shown in Equation 1, where two sources of uncertainty are considered: uncertainty of the solar model and interannual variability for any single year.

Solargis offers three types of hourly datasets that can be used for simulation of expected energy output for P50, P90, and other Pxx scenarios. A description and a sample data file for each data type are given below:

**Historical time series** comprises the whole time period available (data from year 1994/1999/2007 to the present time). If expressed in hourly intervals, it has 8760 values for each year of data available (8784 values for leap years). The sample dataset below has more than 200,000 values for each parameter.

Download sample data file for hourly time series (CSV, 14.1 MB)
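The size of an hourly time series can be checked with a short calculation. The period 1994-2016 used here is the one quoted in the notes at the end of this article; that the sample file covers exactly this period is an assumption.

```python
from calendar import isleap


def hourly_value_count(first_year: int, last_year: int) -> int:
    """Number of hourly values per parameter for an inclusive range of years."""
    return sum(8784 if isleap(year) else 8760
               for year in range(first_year, last_year + 1))


# Assumed period of the time series (taken from the notes of this article).
print(hourly_value_count(1994, 2016))
```

For comparison, a TMY file holds 8760 hourly values per parameter regardless of the length of the underlying history.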

**TMY P50** dataset represents, for each month, the most typical (average) climate conditions and the most representative distribution of hourly values for the key parameters, with reference to the historical time series. It is constructed by concatenating ‘typical’ months. If expressed in hourly aggregation, the full historical time series is compressed in the TMY to 8760 ‘typical’ values. The benefit of TMY is the small size of the data file, which allows faster calculation. The disadvantage is the loss of various (less typical) weather patterns.

Download sample data file for TMY P50 (CSV, 0.5 MB)

**TMY P90** dataset represents a year that is close to the P90 annual value, and it characterizes a type of year with below-average solar resource conditions (higher occurrence of cloudy weather and higher concentration of atmospheric aerosols) and lower temperature. In a simplified way, it can be considered to represent a year that can occur once in 10 years. Thus, it is suitable for simulation of conservative PV energy yield scenarios. This dataset is generated by concatenating months with lower sums of solar radiation so that the annual value is close to P90 (taking into account the combined effect of the solar model uncertainty and the GHI interannual variability that can be observed in any single year). If expressed in hourly intervals, the information content of the historical time series is again compressed to 8760 values.

Download sample data file for TMY P90 (CSV, 0.5 MB)

From the description above it is clear that, in the best case, **full historical time series** data should be used so that all types of weather patterns are represented in the energy simulation. Yet a typical practice in the solar energy industry is to use **TMY P50** data, representing a ‘standard’ year. This is partially due to the speed and efficiency of energy simulation. The other reason is that current PV energy simulation software has very limited or no capability to use full time series. The **TMY P90** data type is also widely used, as it offers a convenient and, to a great extent, standardised way to work with a year that represents ‘conservative’ (suboptimal) weather conditions. The important benefit of using TMY P90, as an add-on to TMY P50, is that it includes some of the hourly data patterns that may indicate critical weather conditions.

Depending on the dataset chosen in PV energy simulation for the P90 (Pxx) level of confidence, the uncertainty factors should be applied in a slightly different order and hence the simulation results will differ. The differences in approach are described in Table 3.

Table 3: Uncertainties that should be considered when using different Solargis datasets when running a PV energy simulation

The steps to be taken to estimate the P90 annual PV energy yield when using the three different data sets are described below.

**Calculating PVOUT P90 annual value from full historical time series**

1. Calculate PVOUT for P50 from time series
2. Consider uncertainty of GHI model estimate
3. Consider uncertainty of the model transposing GHI to GTI
4. Consider uncertainty due to variability of yearly PVOUT values (interannual variability)
5. Consider uncertainties occurring during PV simulation steps
6. Calculate total uncertainty of Steps 2 to 5 (Equation 1)
7. Calculate annual value of PVOUT for P90 case from P50 value (Step 1) and total uncertainty (Step 6) using the equation shown in Table 2.

**Calculating PVOUT P90 annual value from TMY P50 data set**

1. Calculate PVOUT for P50 from TMY P50 data set
2. Consider uncertainty of GHI model estimate
3. Consider uncertainty due to variability of yearly GHI values (interannual variability)
4. Consider uncertainty of the model transposing GHI to GTI
5. Consider uncertainties occurring during PV simulation steps
6. Calculate total uncertainty of Steps 2 to 5 (Equation 1)
7. Calculate annual value of PVOUT for P90 case from P50 value (Step 1) and total uncertainty (Step 6) using the equation shown in Table 2.

**Calculating PVOUT P90 annual value from TMY P90 data set**

1. Calculate PVOUT from TMY P90
2. Consider uncertainty of the model transposing GHI to GTI
3. Consider uncertainties occurring during PV simulation steps
4. Calculate total uncertainty of Steps 2 and 3 (Equation 1)
5. Calculate annual value of PVOUT for P90 case from the PVOUT value (Step 1) and total uncertainty (Step 4) using the equation shown in Table 2.

**Notes:**

- Calculation based on time series makes it possible to estimate the interannual variability more accurately, by calculating it directly from PVOUT values. This is not possible when using TMY P50, where only the variability of yearly GHI values can be considered.
- Uncertainties of GHI model and GHI interannual variability are already included in the calculation of TMY P90 data set, therefore they are not considered in the calculation of PVOUT value for P90.
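The three procedures differ only in which uncertainty terms remain to be applied to the PVOUT value from Step 1. The sketch below makes that explicit. It assumes the final step takes the form PVOUT × (1 − U/100) with all terms at P90; the dictionary keys and the function name are hypothetical. Only the 3.5% and 5% values come from the notes of this article, and every other number is made up for illustration.

```python
from math import sqrt

# Uncertainty terms still to be applied in each workflow (see the steps above).
TERMS_BY_DATASET = {
    "time_series": ("ghi_model", "ghi_to_gti", "pvout_interannual", "pv_simulation"),
    "tmy_p50": ("ghi_model", "ghi_interannual", "ghi_to_gti", "pv_simulation"),
    "tmy_p90": ("ghi_to_gti", "pv_simulation"),
}


def pvout_p90(dataset: str, pvout_step1: float, uncertainties_pct: dict) -> float:
    """Annual PVOUT at P90 from the Step 1 value and the remaining terms."""
    terms = TERMS_BY_DATASET[dataset]
    total = sqrt(sum(uncertainties_pct[name] ** 2 for name in terms))
    return pvout_step1 * (1.0 - total / 100.0)


# All values in % at P90.
uncertainties_pct = {
    "ghi_model": 3.5,          # from the notes of this article
    "pv_simulation": 5.0,      # from the notes of this article
    "ghi_to_gti": 2.0,         # hypothetical
    "ghi_interannual": 4.0,    # hypothetical
    "pvout_interannual": 3.0,  # hypothetical
}

# Hypothetical Step 1 results in kWh/kWp for each dataset.
step1_pvout = {"time_series": 1700.0, "tmy_p50": 1702.0, "tmy_p90": 1600.0}

for dataset, value in step1_pvout.items():
    print(f"{dataset}: {pvout_p90(dataset, value, uncertainties_pct):.1f} kWh/kWp")
```

For TMY P90 the GHI model uncertainty and interannual variability are left out because they are already built into the dataset, so fewer terms enter the quadratic sum.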

Simulation results for the sample of Almeria (Spain) are presented in Table 4: for full historical time series, TMY P50 and TMY P90. The selection of months calculated as the outcome of the TMY algorithm is shown in the column ‘Month: Year’.

Table 4: Summary of GHI and PVOUT values obtained for a sample site in Plataforma Solar de Almeria, Spain.

For the sample considered in this article, the results of applying the uncertainties for each dataset are presented in Table 5. Compared with using time series for the simulation (the most accurate and complete approach), for this particular site using TMY P50 resulted in a 1% overestimation of the P90 energy value, while using the TMY P90 dataset resulted in a 4% underestimation. These deviations are related to the assumptions made when calculating the interannual variability on the one hand, and to the loss of information in TMY generation on the other. This exercise was done as an example, and the results may not show the same trend for other locations.

Table 5: How to calculate PV energy yield value for P90 using different data sets for the sample site considered.

**Notes**

- Solargis weather data has been used for the calculations (period 1994-2016, climate database Solargis v2.1.19).
- Simulation run using Solargis methodologies, considering a 1 kWp system with cSi technology, inverter efficiency 97.5%, DC losses 2.5%, AC losses 1.5% and relative row spacing 2.5.
- Production values for the first year of operation, no degradation factor considered in the calculations.
- Location: Plataforma Solar de Almería. Latitude: 37.094416°, Longitude: -2.35985°.
- Model uncertainty provided by Solargis: ±3.5%.
- PV simulation uncertainty considered for the calculation: ±5%.
- All values expressed at P90 confidence interval (STDEV*1.282).
- TMY calculated using Solargis method of the concatenation of selected ‘typical months’, including final adjusting of annual GHI to Time Series average.
- Interannual variability calculated for 1 year.
