Difference between revisions of "GCM data from the IPCC data repository"

From SourceWiki
Jump to navigation Jump to search
 
Line 1: Line 1:
You can download [http://www.ipcc-data.org/ar4/gcm_data.html summary GCM data] from the [http://www.ipcc-data.org IPCC data centre] for free for research purposes. They come in NetCDF format, which you can read into R with the RNetCDF package. Here is an example of how to read the data, extract a region of interest and convert the result to a SpatialGridDataFrame:
+
You can download [http://www.ipcc-data.org/ar4/gcm_data.html summary GCM data] from the [http://www.ipcc-data.org IPCC data centre] for free for research purposes. They come in NetCDF format, which you can read into R with the RNetCDF package. However, the GCMs run at different resolutions, which  complicates the combination of them to calculate characteristics such as the predicted mean and range. The best method is to interpolate the maps to a common grid for the region of interest, and then calculate the statistics for each cell. The interpolation methods used here is the nearest neighbour.
  
region we are interested in (in degree_north and degree_east):
+
Note: some IPCC maps do not come in a perfectly regular grid. Gridcells are often smaller at the boundaries, and for some reason the grid centers are not completely aligned. This creates problems for making a SpatialGridDataFrame. Interpolating to a self-defined regular grid avoids this issue.
    region <- data.frame(N = 12.45, E = 290, S = -5.23, W = 278)
 
  
 +
First load some required packages:
 
     library(RNetCDF)
 
     library(RNetCDF)
 
     library(sp)
 
     library(sp)
  
     nc <- open.nc("HADCM3_SRA1B_1_pr-change_2080-2099.nc")
+
Define the grid we are interested in. The example is for the northern Andes region in South America:
    subset <- find.subregion(nc,region)
+
     grid <- GridTopology(c(-81.75,-5.25), c(0.5,0.5), c(24,36))
     lat <- var.get.nc(nc,"latitude")
+
 
 +
Open the NetCDF file and get the required variables. Here we are interested in the air temperature.
 +
     lat <- var.get.nc(nc,"latitude")
 
     long <- var.get.nc(nc,"longitude")
 
     long <- var.get.nc(nc,"longitude")
 +
    data <- var.get.nc(nc, "air_temperature_anomaly")
  
make a grid for the region of interest
+
To make a SpatialPointsDataFrame we need a dataframe where each row consists of an entry of the form (x,y,a) where x and y are the coordinates and a is the value. So we need to reformat the lat, long and data objects to be in the right format, We also average the temperature to yearly values by taking the mean over the 12 months.
(the resolution is given manually as it is not constant in the NetCDF files)
 
    grid <- GridTopology(c(long[subset$W],lat[subset$S]), c(3.75,2.5), c((subset$E-subset$W+1),(subset$N-subset$S+1)))
 
  
extract the relevant data from the nc connection and take subset
+
    lat  <- rep(lat,length(long))
     prec <- var.get.nc(nc, "precipitation_flux")
+
    long <- rep(long, length(lat))
     HADCM3_prec_rel <- prec[subset$E:subset$W,subset$S:subset$N,]
+
    long <- as.vector(t(matrix(longx,nrow=length(long))))
 +
     data <- data.frame(var = as.vector(t(apply(var,c(1,2),"mean"))))
 +
     map <- SpatialPointsDataFrame(cbind(long,lat), data, proj4string = CRS("+proj=latlong +datum=WGS84"))
  
averaging over all months:
+
Now interpolate using the closest grid cell of the original predictions. First we need to convert the grid to a SpatialGridDataFrame, which we initialise with NA, and then we look for the closest gridcell in map using spDistsN1().
    HADCM3_prec_rel_y <- as.data.frame(as.vector(apply(HADCM3_prec_rel,c(1,2),"mean")))
 
  
converting to SpatialGridDataFrame
+
    data <- data.frame(temperature = rep(NA,length(coordinates(grid)[,1])))
     HADCM3_prec_rel_y <- SpatialGridDataFrame(grid, HADCM3_prec_rel_y, proj4string = CRS("+proj=longlat +datum=WGS84"))
+
     grid <- SpatialGridDataFrame(grid, data, proj4string = CRS("+proj=longlat +datum=WGS84"))
 +
    GridPoints<- coordinates(grid)
 +
    MapPoints <- coordinates(map)
 +
    for(i in 1:dim(GridPoints)[1]) {
 +
        distance <- spDistsN1(MapPoints, GridPoints[i,], longlat = TRUE)
 +
        grid$var[i] <- map$var[which.min(distance)]
 +
    }
  
Here is the function that extracts the index limits of the smallest subgrid that covers the region of interest:
+
Finally we can plot the map:
  
     find.subregion <- function(nc,region) {
+
     spplot(grid, scale=list(draw=T))
        subset <- data.frame(N = NA, E = NA, S = NA, W = NA)
 
        bounds <- var.get.nc(nc, "latitude_bounds")
 
        # search for N limit:
 
            dummy <- bounds[2,] - region$N
 
            dummy[dummy < 0] <- 1000
 
            subset$N <- which.min(dummy)
 
        # search for S limit:
 
            dummy <- bounds[2,] - region$S
 
            dummy[dummy < 0] <- 1000
 
            subset$S <- which.min(dummy)
 
        bounds <- var.get.nc(nc, "longitude_bounds")
 
        # search for E limit:
 
            dummy <- bounds[2,] - region$E
 
            dummy[dummy < 0] <- 1000
 
            subset$E <- which.min(dummy)
 
        # search for W limit:
 
            dummy <- bounds[2,] - region$W
 
            dummy[dummy < 0] <- 1000
 
            subset$W <- which.min(dummy)
 
        return(subset)
 
    }
 

Revision as of 13:49, 2 April 2008

You can download summary GCM data from the IPCC data centre for free for research purposes. They come in NetCDF format, which you can read into R with the RNetCDF package. However, the GCMs run at different resolutions, which complicates the combination of them to calculate characteristics such as the predicted mean and range. The best method is to interpolate the maps to a common grid for the region of interest, and then calculate the statistics for each cell. The interpolation methods used here is the nearest neighbour.

Note: some IPCC maps do not come in a perfectly regular grid. Gridcells are often smaller at the boundaries, and for some reason the grid centers are not completely aligned. This creates problems for making a SpatialGridDataFrame. Interpolating to a self-defined regular grid avoids this issue.

First load some required packages:

   library(RNetCDF)
   library(sp)

Define the grid we are interested in. The example is for the northern Andes region in South America:

   grid <- GridTopology(c(-81.75,-5.25), c(0.5,0.5), c(24,36))

Open the NetCDF file and get the required variables. Here we are interested in the air temperature.

   lat <- var.get.nc(nc,"latitude")
   long <- var.get.nc(nc,"longitude")
   data <- var.get.nc(nc, "air_temperature_anomaly")

To make a SpatialPointsDataFrame we need a dataframe where each row consists of an entry of the form (x,y,a) where x and y are the coordinates and a is the value. So we need to reformat the lat, long and data objects to be in the right format, We also average the temperature to yearly values by taking the mean over the 12 months.

   lat  <- rep(lat,length(long))
   long <- rep(long, length(lat))
   long <- as.vector(t(matrix(longx,nrow=length(long))))
   data <- data.frame(var  = as.vector(t(apply(var,c(1,2),"mean"))))
   map <- SpatialPointsDataFrame(cbind(long,lat), data, proj4string = CRS("+proj=latlong +datum=WGS84"))

Now interpolate using the closest grid cell of the original predictions. First we need to convert the grid to a SpatialGridDataFrame, which we initialise with NA, and then we look for the closest gridcell in map using spDistsN1().

   data <- data.frame(temperature = rep(NA,length(coordinates(grid)[,1])))
   grid <- SpatialGridDataFrame(grid, data, proj4string = CRS("+proj=longlat +datum=WGS84"))
   GridPoints<- coordinates(grid)
   MapPoints <- coordinates(map)
   for(i in 1:dim(GridPoints)[1]) {
       distance <- spDistsN1(MapPoints, GridPoints[i,], longlat = TRUE)
       grid$var[i] <- map$var[which.min(distance)]
   }

Finally we can plot the map:

   spplot(grid, scale=list(draw=T))