In this case we will develop a synoptic classification that goes from environment to circulation. A simple example would be to study the WT associated to daily rainfall equal or greater than 100 mm.

Introduction to the T-mode classification method

The structure of the matrix in T-mode establishes that the variables are the days, and the observations are the grid points. In this case, the PCA establishes the linear relationships between each daily map of the atmospheric variable in order to extract the main WT. The PCA associates directly those days more similar to each other to group the WTs, so it is not necessary to use any clustering method. synoptReg allows to perform three variants of the T-mode based on:

1. Positive and negative (PN) correlations (loadings) of each of the days against each PC. Day 1 may be associated with PC X+ or PC X-, depending on whether the correlation is positive or negative. This last option, allows to obtain the double of classes of the selected components. If 4 PCs are selected, the final result can reach up to a maximum of 8 CTs.

2. Maximum positive correlations (MAX) of each day against each PC. Day 1 will be associated with the PC with a higher positive correlation;

3. Absolute correlations (ABS) of each day against each PC. Day 1 will be associated with the PC with a higher absolute correlation.

See the follwing example:

time PC1 PC2 PC3 PN MAX ABS
2000-01-01 0.7 0.3 -0.4 1+ 1 1
2000-01-02 0.5 0.4 -0.6 3- 1 3
2000-01-03 0.2 0.5 -0.7 3- 2 3
2000-01-04 -0.3 0.5 -0.4 2+ 2 2
2000-01-05 -0.6 0.4 -0.1 1- 2 1

Note, that a VARIMAX rotation is applied to the retained PCs. See Circulation-To-Environment approach vignette for more information about PC rotation.

It is recommended to review the references proposed in the section Method references.

Extracting the torrential rainfall events (>= 100 mm/24h)

In this example we will extract every day that has at least one point of grid with a precipitation >= 100 mm in the Balearic Islands. This can easily be done with the dplyr() library from the tidyverese (Wickham et al., 2019).

library(synoptReg)
library(dplyr)
data(pcp)

torrential_dates <- pcp %>%
mutate(value = value /10) %>% # converting to mm
filter(value >= 100) %>%
distinct(time) %>%
pull()

The result is a vector of 83 dates that reflect those days with a precipitation >= to 100 mm at any point of the Balearic Islands.

Synoptic classification procedure

In order to compute the synoptic classification of these 83 events we need to read previously the amospheric data from the ERA-Interim dataset (Dee et al., 2011), stored in NetCDF files. To read these type of files, synoptReg does not have a specific function. In previous versions it did, but now there are more intuitive and simple alternatives for reading it. In this case, we will use the ReadNetCDF() function included in metR. The variables we will read are: mean sea level pressure (mslp), geopotential height at 500 hPa (z500) and temperature at 850 hPa (t850). Thus, our classification will be made with 3 atmospheric variables that will provide information from different layers of the troposphere (surface and height).

Note that reading with ReadNetCDF() does not return exactly the desired data.frame format. For this reason, a simple manipulation with dplyr() is required:

library(metR)
# listing NetCDF files stored in synoptReg:
slp_file <- system.file("extdata", "mslp_ei.nc", package = "synoptReg")
z500_file <- system.file("extdata", "z500_ei.nc", package = "synoptReg")
t850_file <- system.file("extdata", "t850_ei.nc", package = "synoptReg")

# Reading the data stored in the NetCDFs files as tibbles. it can take a while.
slp <- ReadNetCDF(file = slp_file,vars = "msl",out = "data.frame") %>%
setNames(c("time","lat","lon","value")) %>%
select(lon,lat,time, value) %>% mutate(time = as.Date(time))

z500 <- ReadNetCDF(file = z500_file,vars = "z",out = "data.frame") %>%
setNames(c("time","lat","lon","value")) %>%
select(lon,lat,time, value) %>% mutate(time = as.Date(time))

t850 <- ReadNetCDF(file = t850_file,vars = "t",out = "data.frame") %>%
setNames(c("time","lat","lon","value")) %>%
select(lon,lat,time, value) %>% mutate(time = as.Date(time))

Filtering and grouping the atmospheric variables using the torrential events

Now, we can proceed to: 1) grouping the three atmospheric variables in one unique tibble; 2) Subset the atmospheric variables using the torrential rainfall dates. All this process can be computed by means of tidy_nc(). A list is required to group the atmospheric variables.

vars_torrential <- tidy_nc(x = list(slp,z500,t850),
time_subset = torrential_dates,
name_vars = c("slp","z500","t850"))
vars_torrential
#> # A tibble: 35,607 x 6
#>    var     lon   lat time         value anom_value
#>    <chr> <dbl> <dbl> <date>       <dbl>      <dbl>
#>  1 slp   -15      55 2000-09-04 101343.      25.3
#>  2 slp   -12.5    55 2000-09-04 101353.     -30.0
#>  3 slp   -10      55 2000-09-04 101447.       1.31
#>  4 slp    -7.5    55 2000-09-04 101753.     274.
#>  5 slp    -5      55 2000-09-04 102044.     505.
#>  6 slp    -2.5    55 2000-09-04 102200.     644.
#>  7 slp     0      55 2000-09-04 102296.     713.
#>  8 slp     2.5    55 2000-09-04 102248.     657.
#>  9 slp     5      55 2000-09-04 102092.     496.
#> 10 slp     7.5    55 2000-09-04 101949.     347.
#> # ... with 35,597 more rows

We already have the variables grouped and filtered. In addition, tidy_nc() computes the anomaly of the atmospheric variables using the monthly average of the whole study period before filtering the dates.

Selecting the number of PCs and Synoptic classification performance

As in the case of the circulation-to-environment approach, it is important to take the decision on the number of components retain in order to carry out the synoptic classification. To help in the decision making process, pca_decision() returns a scree test, among other stats:

info <- pca_decision(vars_torrential,ncomp = 30,norm = T,matrix_mode = "T-mode")
info$screeplot info$summary
#>                       1         2         3          4          5          6
#> sdev          5.1185115 3.6034342 3.1823508 2.57912608 2.37691417 2.27236344
#> prop.variance 0.3156525 0.1564426 0.1220163 0.08014327 0.06806893 0.06221248
#> cum.variance  0.3156525 0.4720952 0.5941115 0.67425478 0.74232370 0.80453618
#>                        7          8          9         10         11         12
#> sdev          1.55600120 1.46252779 1.25624418 1.14552095 1.03678028 0.96519519
#> prop.variance 0.02917036 0.02577093 0.01901385 0.01580986 0.01295076 0.01122412
#> cum.variance  0.83370654 0.85947747 0.87849132 0.89430118 0.90725194 0.91847606
#>                        13          14          15          16          17
#> sdev          0.868246644 0.792538727 0.762162574 0.727480937 0.697695736
#> prop.variance 0.009082557 0.007567682 0.006998696 0.006376247 0.005864811
#> cum.variance  0.927558617 0.935126299 0.942124996 0.948501243 0.954366054
#>                        18          19          20          21          22
#> sdev          0.565083481 0.557921056 0.517389968 0.502554709 0.465367037
#> prop.variance 0.003847221 0.003750312 0.003225209 0.003042906 0.002609235
#> cum.variance  0.958213275 0.961963587 0.965188796 0.968231703 0.970840938
#>                        23          24          25         26          27
#> sdev          0.405150685 0.397725264 0.389126510 0.37425815 0.367115122
#> prop.variance 0.001977676 0.001905848 0.001824331 0.00168758 0.001623777
#> cum.variance  0.972818613 0.974724461 0.976548792 0.97823637 0.979860149
#>                        28          29         30
#> sdev          0.339549481 0.334820645 0.33009835
#> prop.variance 0.001389083 0.001350661 0.00131283
#> cum.variance  0.981249232 0.982599893 0.98391272

By means of the Scree plot, it seems reasonable to retain 5 PCs. These 5 PCs explain the 82% of the variance. With this information, we can already perform our synoptic classification, specifying that the matrix mode will be "T-mode":

cl <- synoptclas(vars_torrential,ncomp = 6,norm = T,matrix_mode = "T-mode")
#> Warning: The x argument of as_tibble.matrix() must have unique column names if .name_repair is omitted as of tibble 2.0.0.
#> Using compatibility .name_repair.
cl # list object containing the classification and the classified grid
#> $clas_abs #> # A tibble: 83 x 2 #> time WT #> <date> <dbl> #> 1 2000-09-04 5 #> 2 2000-10-11 2 #> 3 2000-10-22 3 #> 4 2001-02-18 1 #> 5 2001-07-16 2 #> 6 2001-09-06 5 #> 7 2001-11-10 2 #> 8 2001-11-11 2 #> 9 2001-11-14 2 #> 10 2001-11-15 2 #> # ... with 73 more rows #> #>$clas_max
#> # A tibble: 83 x 2
#>    time          WT
#>    <date>     <dbl>
#>  1 2000-09-04     5
#>  2 2000-10-11     1
#>  3 2000-10-22     4
#>  4 2001-02-18     2
#>  5 2001-07-16     5
#>  6 2001-09-06     5
#>  7 2001-11-10     2
#>  8 2001-11-11     2
#>  9 2001-11-14     2
#> 10 2001-11-15     2
#> # ... with 73 more rows
#>
#> $clas_pn #> # A tibble: 83 x 2 #> time WT #> <date> <chr> #> 1 2000-09-04 5+ #> 2 2000-10-11 2- #> 3 2000-10-22 3- #> 4 2001-02-18 1- #> 5 2001-07-16 2- #> 6 2001-09-06 5+ #> 7 2001-11-10 2+ #> 8 2001-11-11 2+ #> 9 2001-11-14 2+ #> 10 2001-11-15 2+ #> # ... with 73 more rows #> #>$grid_clas_abs
#> # A tibble: 2,574 x 9
#>    var     lon   lat time          WT mean_WT_value mean_WT_anom_val~ pval_ttest
#>    <chr> <dbl> <dbl> <date>     <dbl>         <dbl>             <dbl>      <dbl>
#>  1 slp   -15      55 2000-09-04     5       101889.              774.   0.0212
#>  2 slp   -12.5    55 2000-09-04     5       102011.              848.   0.00777
#>  3 slp   -10      55 2000-09-04     5       102119.              897.   0.00211
#>  4 slp    -7.5    55 2000-09-04     5       102178.              915.   0.000765
#>  5 slp    -5      55 2000-09-04     5       102205.              884.   0.000677
#>  6 slp    -2.5    55 2000-09-04     5       102176.              833.   0.00113
#>  7 slp     0      55 2000-09-04     5       102170.              795.   0.00138
#>  8 slp     2.5    55 2000-09-04     5       102144.              743.   0.00194
#>  9 slp     5      55 2000-09-04     5       102099.              674.   0.00292
#> 10 slp     7.5    55 2000-09-04     5       102034.              584.   0.00497
#> # ... with 2,564 more rows, and 1 more variable: cv_WT_value <dbl>
#>
#> $grid_clas_max #> # A tibble: 2,574 x 9 #> var lon lat time WT mean_WT_value mean_WT_anom_val~ pval_ttest #> <chr> <dbl> <dbl> <date> <dbl> <dbl> <dbl> <dbl> #> 1 slp -15 55 2000-09-04 5 102091. 910. 0.00239 #> 2 slp -12.5 55 2000-09-04 5 102140. 910. 0.000816 #> 3 slp -10 55 2000-09-04 5 102155. 871. 0.000382 #> 4 slp -7.5 55 2000-09-04 5 102132. 816. 0.000185 #> 5 slp -5 55 2000-09-04 5 102102. 728. 0.000177 #> 6 slp -2.5 55 2000-09-04 5 102056. 661. 0.000298 #> 7 slp 0 55 2000-09-04 5 102030. 600. 0.000738 #> 8 slp 2.5 55 2000-09-04 5 101982. 530. 0.00277 #> 9 slp 5 55 2000-09-04 5 101921. 451. 0.0109 #> 10 slp 7.5 55 2000-09-04 5 101840. 354. 0.0482 #> # ... with 2,564 more rows, and 1 more variable: cv_WT_value <dbl> #> #>$grid_clas_pn
#> # A tibble: 5,148 x 8
#>    var     lon   lat time       WT    mean_WT_value mean_WT_anom_va~ cv_WT_value
#>    <chr> <dbl> <dbl> <date>     <chr>         <dbl>            <dbl>       <dbl>
#>  1 slp   -15      55 2000-09-04 5+          102523.            1271.       0.601
#>  2 slp   -12.5    55 2000-09-04 5+          102497.            1197.       0.620
#>  3 slp   -10      55 2000-09-04 5+          102449.            1098.       0.646
#>  4 slp    -7.5    55 2000-09-04 5+          102387.            1013.       0.607
#>  5 slp    -5      55 2000-09-04 5+          102322.             892.       0.616
#>  6 slp    -2.5    55 2000-09-04 5+          102230.             784.       0.636
#>  7 slp     0      55 2000-09-04 5+          102158.             676.       0.706
#>  8 slp     2.5    55 2000-09-04 5+          102047.             550.       0.757
#>  9 slp     5      55 2000-09-04 5+          101921.             413.       0.757
#> 10 slp     7.5    55 2000-09-04 5+          101788.             274.       0.725
#> # ... with 5,138 more rows

The object returned by synoptclas differs relatively from the object returned when we apply a classification in "S-mode". This time, the output is a list with 4 objects: 2 objects are tibbles with the variables time and WT. The difference between the two objects lies in the grouping method applied to achieve the WTs. In clas_abs the association is done with the absolute correlations;In clas_max the association is done with the positive correlations; finally, in clas_pn the extreme positive and negative values are used to get the final groups. The three remaining objects are the grids classified with each of these methods. In this case, we will work with the PN approach. Let’s see the results:

library(ggplot2)
library(dplyr)
library(rnaturalearth)
library(metR)

borders <- ne_countries(continent = c("europe","africa"),
returnclass = "sf",scale = 50)

ggplot()+
geom_contour_fill(filter(cl$grid_clas_pn, var == "z500"), mapping = aes(lon,lat,z = mean_WT_anom_value/9.8), # to m binwidth = 20)+ geom_sf(data = borders, fill = "transparent")+ geom_contour2(data = filter(cl$grid_clas_pn,var == "slp"),
aes(x=lon,y=lat,z=mean_WT_value/100),
binwidth = 4, color = "black") +
geom_text_contour(data= filter(cl$grid_clas_pn, var == "slp"), aes(x=lon,y=lat,z=mean_WT_value/100), stroke = 0.15,binwidth = 4) + geom_contour2(data = filter(cl$grid_clas_pn,var == "t850"),
aes(x=lon,y=lat,z=mean_WT_anom_value),
binwidth = 2, color = "black", linetype = "dashed") +
geom_text_contour(data= filter(cl$grid_clas_pn, var == "t850"), aes(x=lon,y=lat,z=mean_WT_anom_value), stroke = 0.15,binwidth = 2) + guides(fill = guide_colourbar(barwidth = 9, barheight = 0.5))+ facet_wrap(~WT, ncol = 4) + scale_fill_divergent(name = "Z anomaly (m)") + scale_x_continuous(limits = c(-15,15), expand = c(0,0))+ scale_y_continuous(limits = c(30,55), expand = c(0,0))+ theme_bw() + theme( panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), text = element_text(size = 10), strip.background = element_rect(fill = "transparent", color = NA), axis.title = element_blank(), axis.text = element_blank(), axis.ticks = element_blank(), legend.position = "bottom") Analysing the spatial contribution of each WT to torrential rainfall in the Balearic Islands We can then calculate the maximum precipitation associated with these CT/WT. To do this we will use the function ct2env() indicating in the argument fun the function max: max_torrential_pcp <- ct2env(pcp,clas = cl$clas_pn,fun = max,out = "data.frame")

ggplot() +
geom_tile(data=max_torrential_pcp, aes(x=lon, y=lat,fill=calc/10)) +
name = "rainfall (mm/day)",na.value="transparent") +
coord_equal()+
scale_colour_gradient(guide = 'none') + facet_wrap(~WT, ncol = 5) +
guides(fill = guide_colourbar(barwidth = 9, barheight = 0.5))+
theme_bw() + theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
legend.position = "bottom")

In addition, we can assess the spatial variability of this torrential events by creating the function of the coefficient of variation (cv) in the fun argument:

cv_torrential_pcp <- ct2env(pcp,
clas = cl\$clas_pn,
fun = function(x){sd(x)/mean(x)*100},
out = "data.frame")

ggplot() +
geom_tile(data=cv_torrential_pcp, aes(x=lon, y=lat,fill=calc/10)) +
name = "coeff. var (%)",na.value="transparent") +
coord_equal()+
scale_colour_gradient(guide = 'none') + facet_wrap(~WT, ncol = 5) +
guides(fill = guide_colourbar(barwidth = 9, barheight = 0.5))+
theme_bw() + theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
legend.position = "bottom")

Method references

Synoptic classification method: