admin管理员组

文章数量:1315832

library(SPEI)
library(terra)

Basically, given a monthly time series for two variables a and b as follows:

a = array(1:(3*4*12*64),c(3,4,12*64))
a = rast(a)

dates=seq(as.Date("1950-01-01"), as.Date("2013-12-31"), by="month")
terra::time(a)=dates
names(a) <- zoo::as.yearmon(time(a))

b = array(1:(3*4*12*64),c(3,4,12*64))
b = rast(b)

dates=seq(as.Date("1950-01-01"), as.Date("2013-12-31"), by="month")
terra::time(b) <- dates
names(b) <- zoo::as.yearmon(time(b))

Both a and b are time series with a time dimension. Now, I would like to apply the function SPEI::hargreaves to time series of each pixel in a and b and return a SpatRaster C

From the SPEI package , har <- hargreaves(TMIN,TMAX,lat=37.6475). Here, consider Tmin=a, Tmax=b and lat=latitude of each pixel which is the same in a and b. I will parallelize the process once I get an idea how to apply the function to my SpatRasters.

At the moment, I am using data.table to collapse my rasterbricks and then applying hargreaves to the table. This approach is very inefficient so far.

Here is what I have so far:

library(SPEI)
library(terra)
library(zoo)

har <- function(a, b, lat) {
  SPEI::hargreaves(as.vector(a), as.vector(b), lat,na.rm = TRUE)
} 

lat <- init(rast(a), "y")
PET1 <- terra::lapp(x=c(a, b, lat), fun = Vectorize(har))

Howvever, I get the error:

#Error: [lapp] cannot use 'fun'. The number of values returned is less 
#than the number of input cells.
#Perhaps the function is not properly vectorized

Thanks to @Robert Hijmans' answer, I experimented with terra and raster packages.

raster solution

library(SPEI)
library(raster)
library(zoo)

har <- function(Tmin, Tmax, lat) {
  SPEI::hargreaves(Tmin, Tmax, lat,na.rm = TRUE)
} 

lat <- init(raster(Tmin), "y")

PET_raster <- raster::overlay(Tmin, Tmax, lat, fun = Vectorize(har))

Using actual data, this takes more than 15 minutes to run on my laptop.

PET_raster 
class      : RasterBrick
dimensions : 68, 104, 7072, 1020  (nrow, ncol, ncell, nlayers)
resolution : 0.25, 0.25  (x, y)
extent     : -98.285, -72.285, 39.875, 56.875  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs
source     : r_tmp_2025-01-29_175851.174327_307486_90781.grd
names      :    layer.1,    layer.2,    layer.3,    layer.4,    layer.5,    layer.6,    layer.7,    layer.8,    layer.9,   layer.10,   layer.11,   layer.12,   layer.13,   layer.14,   layer.15, ...
min values :  0.0000000,  0.0000000,  0.0000000, 27.9045500, 40.7257577, 50.3362055, 41.2878256, 49.9874998, 47.6142315, 37.7349603,  7.4571957,  0.0000000,  0.0000000,  0.0000000,  0.0000000, ...
max values :   47.11585,   61.65605,   73.30580,  112.98168,  158.29348,  183.51450,  204.36278,  192.53549,  187.70001,  151.89846,   78.30323,   72.70451,   51.80651,   65.65896,   80.28027, ...

terra solution by @Robert Hijmans

library(SPEI)
library(terra)
library(zoo)


lat <- init(rast(Tmin), "y")

r <- c(Tmin, Tmax, lat)
nl <- nlyr(Tmin)
nl2 <- nl + nl

PET_terra <- app(r, \(i) apply(i, 1, 
                       \(j) SPEI::hargreaves(j[1:nl], j[(nl+1):(nl2)], lat=j[nl2+1], verbose=FALSE)))

Using actual data, this takes less than 15 seconds to run on my laptop. Terrific difference in speed but I doubt which of them (terra vs raster) is correct.

I have more than 10TB of data and will prefer terra in this case.

PET_terra 

class       : SpatRaster
dimensions  : 68, 104, 1020  (nrow, ncol, nlyr)
resolution  : 0.25, 0.25  (x, y)
extent      : -98.285, -72.285, 39.875, 56.875  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
source(s)   : memory
names       :    lyr.1,    lyr.2,    lyr.3,    lyr.4,     lyr.5,    lyr.6, ...
min values  :  0.00000,  0.00000,  0.00000, 13.75159,  31.91098,  35.7943, ...
max values  : 17.54078, 28.36545, 49.92122, 89.34686, 138.97809, 171.4730, ...

Problem

terra and raster give significantly different min and max values, implying that both packages are not doing the same calculations.

Any thoughts on the differences?

library(SPEI)
library(terra)

Basically, given a monthly time series for two variables a and b as follows:

a = array(1:(3*4*12*64),c(3,4,12*64))
a = rast(a)

dates=seq(as.Date("1950-01-01"), as.Date("2013-12-31"), by="month")
terra::time(a)=dates
names(a) <- zoo::as.yearmon(time(a))

b = array(1:(3*4*12*64),c(3,4,12*64))
b = rast(b)

dates=seq(as.Date("1950-01-01"), as.Date("2013-12-31"), by="month")
terra::time(b) <- dates
names(b) <- zoo::as.yearmon(time(b))

Both a and b are time series with a time dimension. Now, I would like to apply the function SPEI::hargreaves to time series of each pixel in a and b and return a SpatRaster C

From the SPEI package , har <- hargreaves(TMIN,TMAX,lat=37.6475). Here, consider Tmin=a, Tmax=b and lat=latitude of each pixel which is the same in a and b. I will parallelize the process once I get an idea how to apply the function to my SpatRasters.

At the moment, I am using data.table to collapse my rasterbricks and then applying hargreaves to the table. This approach is very inefficient so far.

Here is what I have so far:

library(SPEI)
library(terra)
library(zoo)

har <- function(a, b, lat) {
  SPEI::hargreaves(as.vector(a), as.vector(b), lat,na.rm = TRUE)
} 

lat <- init(rast(a), "y")
PET1 <- terra::lapp(x=c(a, b, lat), fun = Vectorize(har))

Howvever, I get the error:

#Error: [lapp] cannot use 'fun'. The number of values returned is less 
#than the number of input cells.
#Perhaps the function is not properly vectorized

Thanks to @Robert Hijmans' answer, I experimented with terra and raster packages.

raster solution

library(SPEI)
library(raster)
library(zoo)

har <- function(Tmin, Tmax, lat) {
  SPEI::hargreaves(Tmin, Tmax, lat,na.rm = TRUE)
} 

lat <- init(raster(Tmin), "y")

PET_raster <- raster::overlay(Tmin, Tmax, lat, fun = Vectorize(har))

Using actual data, this takes more than 15 minutes to run on my laptop.

PET_raster 
class      : RasterBrick
dimensions : 68, 104, 7072, 1020  (nrow, ncol, ncell, nlayers)
resolution : 0.25, 0.25  (x, y)
extent     : -98.285, -72.285, 39.875, 56.875  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs
source     : r_tmp_2025-01-29_175851.174327_307486_90781.grd
names      :    layer.1,    layer.2,    layer.3,    layer.4,    layer.5,    layer.6,    layer.7,    layer.8,    layer.9,   layer.10,   layer.11,   layer.12,   layer.13,   layer.14,   layer.15, ...
min values :  0.0000000,  0.0000000,  0.0000000, 27.9045500, 40.7257577, 50.3362055, 41.2878256, 49.9874998, 47.6142315, 37.7349603,  7.4571957,  0.0000000,  0.0000000,  0.0000000,  0.0000000, ...
max values :   47.11585,   61.65605,   73.30580,  112.98168,  158.29348,  183.51450,  204.36278,  192.53549,  187.70001,  151.89846,   78.30323,   72.70451,   51.80651,   65.65896,   80.28027, ...

terra solution by @Robert Hijmans

library(SPEI)
library(terra)
library(zoo)


lat <- init(rast(Tmin), "y")

r <- c(Tmin, Tmax, lat)
nl <- nlyr(Tmin)
nl2 <- nl + nl

PET_terra <- app(r, \(i) apply(i, 1, 
                       \(j) SPEI::hargreaves(j[1:nl], j[(nl+1):(nl2)], lat=j[nl2+1], verbose=FALSE)))

Using actual data, this takes less than 15 seconds to run on my laptop. Terrific difference in speed but I doubt which of them (terra vs raster) is correct.

I have more than 10TB of data and will prefer terra in this case.

PET_terra 

class       : SpatRaster
dimensions  : 68, 104, 1020  (nrow, ncol, nlyr)
resolution  : 0.25, 0.25  (x, y)
extent      : -98.285, -72.285, 39.875, 56.875  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
source(s)   : memory
names       :    lyr.1,    lyr.2,    lyr.3,    lyr.4,     lyr.5,    lyr.6, ...
min values  :  0.00000,  0.00000,  0.00000, 13.75159,  31.91098,  35.7943, ...
max values  : 17.54078, 28.36545, 49.92122, 89.34686, 138.97809, 171.4730, ...

Problem

terra and raster give significantly different min and max values, implying that both packages are not doing the same calculations.

Any thoughts on the differences?

Share Improve this question edited Jan 30 at 17:11 code123 asked Jan 30 at 3:17 code123code123 2,1564 gold badges31 silver badges56 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 2

I thought that would be be possible with the below but that fails and perhaps that can be fixed.

s <- sds(a, a+10, lat)
names(s) <- c("Tmin", "Tmax", "lat")
x <- lapp(s, hargreaves, usenames=TRUE, recycle=TRUE, verbose=FALSE)

The below is ugly, but seems to work:

r <- c(a, a+10, lat)
nl <- nlyr(a)
nl2 <- nl + nl
x <- app(r, \(i) apply(i, 1, 
     \(j) hargreaves(j[1:nl], j[(nl+1):(nl2)], lat=j[nl2+1], verbose=FALSE)))

x
#class       : SpatRaster 
#dimensions  : 3, 4, 768  (nrow, ncol, nlyr)
#resolution  : 1, 1  (x, y)
#extent      : 0, 4, 0, 3  (xmin, xmax, ymin, ymax)
#coord. ref. :  
#source(s)   : memory
#names       :     lyr.1,    lyr.2,    lyr.3,    lyr.4,    lyr.5,    lyr.6, ... 
#min values  :  77.02184, 109.5203, 166.0411, 197.9813, 234.9566, 256.6363, ... 
#max values  : 115.17347, 145.1712, 204.9765, 232.4847, 266.2186, 284.1987, ... 

I think this is correct because

qed <- function(cell) {
  xhg <- as.vector(t(x[cell])[,1])
  Tmin <- unlist(a[cell])
  Tmax <- Tmin + 10
  Lat <- unlist(lat[cell])
  hg <- hargreaves(Tmin, Tmax, lat=Lat, verbose=F)
  all.equal(hg, xhg)
}
qed(5)
#[1] TRUE

本文标签: