admin管理员组

文章数量:1122832

I have been processing some Sentinel 2 data using the "terra" and "RStoolbox" packages of R. I was trying to color balance two tiles of Sentinel 2 imagery using the histMatch() function, but got some very unexpected results...the maximum values for the bands histogram matched image were sometime less than the minimum values for the corresponding band in the reference image. I reported this behavior as a bug on GitHub as issue #117 on the RStoolbox repository. I dug into the source code and was able to isolate the issue. Essentially, histMatch() finds a function for each band in the reference raster that finds the raw value corresponding to a particular quantile in a cumulative distribution function. It then creates a list of functions and attempts to apply them sequentially to each corresponding band in the raster to be process. I pasted in the snippet of code from the raw histMatch() function below.

  totalFun <- function(xvals, f = layerFun) {
        if (is.vector(xvals)) 
            xvals <- as.matrix(xvals)
        app <- lapply(1:ncol(xvals), function(i) {
            f[[i]](xvals[, i])
        })
        do.call("cbind", app)
    }
    if (returnFunctions) {
        names(layerFun) <- names(x)
        return(layerFun)
    }
    .vMessage("Apply histogram match functions")
    out <- terra::app(x, fun = totalFun, ...)

Basically the problem is that terra::app() won't follow the order the functions are passed in. I am attempt to create a fix to post a proposed solution to my GitHub bug report. I have made a simpler problem that isolates the issue of passing multiple functions to terra::app() as a reproducible to demonstrate the desired output and the desire approach.

####Install and/or load terra####
if(!require(terra)){
  install.packages(terra)
}


####Example Data Setup####
set.seed(23)#For reproducibility

##Creating a Fake multiband raster##
r1a<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1b<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1c<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")

values(r1a)<-rnorm(10000, 200, 35)
values(r1b)<-runif(10000, 500, 10000)
values(r1c)<-rnorm(10000, 600, 150)

r1<-c(r1a, r1b, r1c)

##Creating a list of functions
fxn1<-function(x){log(x+1)}
fxn2<-function(x){(x)^0.5}
fxn3<-function(x){x-25}

funtime<-list(fxn1, fxn2, fxn3)

####Applying each function to its corresponding band in the fake raster####
##Desired behavior##
out<-list()
for(i in 1:nlyr(r1)){
  out[[i]]<-app(r1[[i]], funtime[[i]])
}

do.call(c, out)

####Desired approach####
##Will throw error##
totalfun<-function(x, f=funtime){
  lapply(1:length(x), function(i){
  out<-f[[i]](x[[i]])
})
  do.call(c, out)
}

result<-app(r1, fun=totalfun)

If anyone can think of a workaround to this problem I would greatly appreciate it, and it may help others attempting to histogram match other multiband imagery. Thanks!

I have been processing some Sentinel 2 data using the "terra" and "RStoolbox" packages of R. I was trying to color balance two tiles of Sentinel 2 imagery using the histMatch() function, but got some very unexpected results...the maximum values for the bands histogram matched image were sometime less than the minimum values for the corresponding band in the reference image. I reported this behavior as a bug on GitHub as issue #117 on the RStoolbox repository. I dug into the source code and was able to isolate the issue. Essentially, histMatch() finds a function for each band in the reference raster that finds the raw value corresponding to a particular quantile in a cumulative distribution function. It then creates a list of functions and attempts to apply them sequentially to each corresponding band in the raster to be process. I pasted in the snippet of code from the raw histMatch() function below.

  totalFun <- function(xvals, f = layerFun) {
        if (is.vector(xvals)) 
            xvals <- as.matrix(xvals)
        app <- lapply(1:ncol(xvals), function(i) {
            f[[i]](xvals[, i])
        })
        do.call("cbind", app)
    }
    if (returnFunctions) {
        names(layerFun) <- names(x)
        return(layerFun)
    }
    .vMessage("Apply histogram match functions")
    out <- terra::app(x, fun = totalFun, ...)

Basically the problem is that terra::app() won't follow the order the functions are passed in. I am attempt to create a fix to post a proposed solution to my GitHub bug report. I have made a simpler problem that isolates the issue of passing multiple functions to terra::app() as a reproducible to demonstrate the desired output and the desire approach.

####Install and/or load terra####
if(!require(terra)){
  install.packages(terra)
}


####Example Data Setup####
set.seed(23)#For reproducibility

##Creating a Fake multiband raster##
r1a<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1b<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1c<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")

values(r1a)<-rnorm(10000, 200, 35)
values(r1b)<-runif(10000, 500, 10000)
values(r1c)<-rnorm(10000, 600, 150)

r1<-c(r1a, r1b, r1c)

##Creating a list of functions
fxn1<-function(x){log(x+1)}
fxn2<-function(x){(x)^0.5}
fxn3<-function(x){x-25}

funtime<-list(fxn1, fxn2, fxn3)

####Applying each function to its corresponding band in the fake raster####
##Desired behavior##
out<-list()
for(i in 1:nlyr(r1)){
  out[[i]]<-app(r1[[i]], funtime[[i]])
}

do.call(c, out)

####Desired approach####
##Will throw error##
totalfun<-function(x, f=funtime){
  lapply(1:length(x), function(i){
  out<-f[[i]](x[[i]])
})
  do.call(c, out)
}

result<-app(r1, fun=totalfun)

If anyone can think of a workaround to this problem I would greatly appreciate it, and it may help others attempting to histogram match other multiband imagery. Thanks!

Share Improve this question asked Nov 22, 2024 at 18:31 Sean McKenzieSean McKenzie 8895 silver badges15 bronze badges 7
  • On desired approach, which error does it throw? – Chris Commented Nov 23, 2024 at 0:09
  • Hi Chris, the error is Error: [app] 'fun' returns a list (should be numeric or matrix). I should note, this error is limited to my simplified problem...histMatch() itself doesn't throw this error, but rather I hit this error when I tried to workaround the current bug. – Sean McKenzie Commented Nov 23, 2024 at 0:19
  • But the results of histMatch are erroneous due to failure to apply functions in order supplied... Might this be a terra issue. Did you step through the non-erroring process and ascertain that the functions were taken our of supplied order? Have you tried the ecdf approach on rast turned to matrices, just a guess. – Chris Commented Nov 23, 2024 at 0:27
  • Exactly. I think it is a terra issue, but not a bug. I don't think app() is designed to allow for more than a single function passed to the fun= argument. I think histMatch() attempts to get around the issue by creating a function to cycle through the functions. I was able to verify that the ecdfs worked correctly when applied individually to the raster bands. – Sean McKenzie Commented Nov 23, 2024 at 0:54
  • Continuing comment above. I think that when histMatch() coerces the data to a matrix, it somehow ends up applying just the first ecdf to all bands and doesn't cycle through the ecdfs. That's why I took out the coercion in the simplified example. My hope is to find a way to apply the function without the matrix coercion. – Sean McKenzie Commented Nov 23, 2024 at 0:59
 |  Show 2 more comments

2 Answers 2

Reset to default 2

Your example output:

x <- do.call(c, out)

Can be obtained with

f1 <- function(x) {
    cbind(fxn1(x[,1]), fxn2(x[,2]), fxn3(x[,3]))
}
y <- app(r1, f1)

all(values(x) == values(y))
#[1] TRUE

And with

ff <- list(fxn1, fxn2, fxn3)
f2 <- function(x) {
    sapply(1:ncol(x), \(i) ff[[i]](x[,i]))
}
z <- app(r1, f2)

all(values(x) == values(z))
#[1] TRUE

It dawned on me that perhaps just inverting when I applied the iteration with lapply() might solve the issue. Now, I have the call to terra::app() inside a named function, and I feed the vector of iteration steps to lapply() and use an anonymous function to iterate through the raster bands and the functions. Here is the solution for the toy example:

####Install and/or load terra####
if(!require(terra)){
  install.packages(terra)
}


####Example Data Setup####
set.seed(23)#For reproducibility

##Creating a Fake multiband raster##
r1a<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1b<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1c<-rast(xmin=-123.1, xmax=-122.1, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")

values(r1a)<-rnorm(10000, 200, 35)
values(r1b)<-runif(10000, 500, 10000)
values(r1c)<-rnorm(10000, 600, 150)

r1<-c(r1a, r1b, r1c)

##Creating a list of functions
fxn1<-function(x){log(x+1)}
fxn2<-function(x){(x)^0.5}
fxn3<-function(x){x-25}

funtime<-list(fxn1, fxn2, fxn3)

####Applying each function to its corresponding band in the fake raster####
##Desired approach and desired behavior##
appfun<-function(x, f){
  result<-app(x, f)
  return(result)
}

tmp<-lapply(1:nlyr(r1), function(i, x, f) appfun(x[[i]], f[[i]]), f=funtime, x=r1)

result<-do.call(c, tmp)

Applying this solution to the histMatch function, I think I have a proposed fix that I will add to my GitHub bug report:

##Load Necessary Packages##
library(terra)
library(RStoolbox)

set.seed(26) #For reproducibility

##Create Fake raster to color balance##
r1a<-rast(xmin=-123.05, xmax=-122, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1b<-rast(xmin=-123.05, xmax=-122, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r1c<-rast(xmin=-123.05, xmax=-122, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")

values(r1a)<-rnorm(10000, 2000, 150)
values(r1b)<-rnorm(10000, 5000, 100)
values(r1c)<-rnorm(10000, 7000, 250)

r1<-c(r1a, r1b, r1c)

##Create Fake reference raster##
r2a<-rast(xmin=-122.05, xmax=-121, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r2b<-rast(xmin=-122.05, xmax=-121, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")
r2c<-rast(xmin=-122.05, xmax=-121, ymin=45, ymax=45.5, nrows=100, ncols=100, crs="epsg:4326")

values(r2a)<-rnorm(10000, 5000, 300)
values(r2b)<-rnorm(10000, 4000, 300)
values(r2c)<-rnorm(10000, 6000, 100)

r2<-c(r2a, r2b, r2c)

#make the rasters a bit more realistic by making some cells NA 
blankem1<-sample(1:10000, 6500, replace=FALSE)
values(r1)[blankem1]<-NA

set.seed(45) #For a different set of NA cells in the reference raster
blankem2<-sample(1:10000, 6500, replace=FALSE)
values(r2)[blankem2]<-NA


##add some extreme values to make the raster cell distribution heteroskedastic##
set.seed(96)#For a different set of cells to be extreme values in the raster to color balance
extreme1<-sample(1:10000, 300, replace=FALSE)
values(r1)[extreme1]<-values(r1)[extreme1]-rnorm(300, 500, 50)

set.seed(53)#For a different set of cells to be extreme values in the reference raster
extreme2<-sample(1:10000, 300, replace=FALSE)
values(r2)[extreme2]<-values(r2)[extreme2]+rnorm(300, 7000, 150)


##Observe the fake raster RGB images and the distribution of values in each band# 
X11()
par(mfrow=c(4,2))
plotRGB(r1[[c(1:3)]], scale=15000, main = "Original RGB Image")
plotRGB(r2[[c(1:3)]], scale=15000, main="Reference RGB Image")
hist(r1[[1]], breaks=100, xlim=c(0, 10000), main = "Original band 1")
hist(r2[[1]], breaks=100, xlim=c(0, 10000), main="Reference band 1")
hist(r1[[2]], breaks=100, xlim=c(0, 10000), main = "Original band 2")
hist(r2[[2]], breaks=100, xlim=c(0, 10000), main="Reference band 2")
hist(r1[[3]], breaks=100, xlim=c(0, 10000), main = "Original band 3")
hist(r2[[3]], breaks=100, xlim=c(0, 10000), main="Reference band 3")

####Demonstrating RStoolbox::histMatch() bug####
##Histogram match r1 to r2##
HM<-histMatch(r1, r2, intersectOnly = FALSE, paired=FALSE)

##Compare the histogram matched RGB image and band distributions to the original and the reference##
X11()
par(mfrow=c(4,3))
plotRGB(r1[[c(1:3)]], scale=15000, main="Original")
plotRGB(r2[[c(1:3)]], scale=15000, main="Reference")
plotRGB(HM[[c(1:3)]], scale=15000, main="HistMatched")
hist(r1[[1]], breaks=100, xlim=c(0, 10000), main="Original Band 1")
hist(r2[[1]], breaks=100, xlim=c(0, 10000), main="Original Band 1")
hist(HM[[1]], breaks=100, xlim=c(0, 10000), main="HistMatched Band 1")
hist(r1[[2]], breaks=100, xlim=c(0, 10000), main="Original Band 2")
hist(r2[[3]], breaks=100, xlim=c(0, 10000), main="Reference Band 2")
hist(HM[[2]], breaks=100, xlim=c(0, 10000), main="HistMatched Band 2")
hist(r1[[3]], breaks=100, xlim=c(0, 10000), main="Original Band 3")
hist(r2[[3]], breaks=100, xlim=c(0, 10000), main="Reference Band 3")
hist(HM[[3]], breaks=100, xlim=c(0, 10000), main="HistMatched Band 3")

####Revised histMatch() function ####
histMatch2<-function (x, ref, xmask = NULL, refmask = NULL, nSamples = 1e+05, 
          intersectOnly = TRUE, paired = TRUE, forceInteger = FALSE, 
          returnFunctions = FALSE, ...) 
{
  nSamples <- min(ncell(ref), nSamples, ncell(ref))
  x <- RStoolbox:::.toTerra(x)
  ref <- RStoolbox:::.toTerra(ref)
  ext <- if (paired | intersectOnly) 
    intersect(ext(x), ext(ref))
  if (paired & is.null(ext)) {
    paired <- FALSE
    warning("Rasters do not overlap. Precise sampling disabled.", 
            call. = FALSE)
  }
  if (nlyr(x) != nlyr(ref)) 
    stop("x and ref must have the same number of layers.")
  if (!is.null(xmask)) {
    RStoolbox:::.vMessage("Apply xmask")
    xfull <- x
    x <- mask(x, xmask)
  }
  if (!is.null(refmask)) {
    .vMessage("Apply refmask")
    ref <- c(ref, refmask)
  }
  RStoolbox:::.vMessage("Extract samples")
  ref.sample <- as.matrix(spatSample(ref, size = nSamples, 
                                     na.rm = TRUE, ext = ext, xy = paired))
  if (!is.null(refmask)) 
    ref.sample <- ref.sample[, -ncol(ref.sample), drop = FALSE]
  if (paired) {
    x.sample <- extract(x, ref.sample[, c("x", "y")])
    if (is.vector(x.sample)) 
      x.sample <- as.matrix(x.sample)
    valid <- complete.cases(x.sample)
    ref.sample <- ref.sample[valid, -c(1:2), drop = FALSE]
    x.sample <- x.sample[valid, , drop = FALSE]
  }
  else {
    x.sample <- as.matrix(spatSample(x, size = nSamples, 
                                     na.rm = T, ext = ext))
  }
  RStoolbox:::.vMessage("Calculate empirical cumulative histograms")
  layerFun <- lapply(1:ncol(x.sample), function(i) {
    source.ecdf <- ecdf(x.sample[, i])
    ecdf.ref <- ecdf(ref.sample[, i])
    kn <- knots(ecdf.ref)
    y <- ecdf.ref(kn)
    minmax <- c(min(values(ref)[i]), max(values(ref)[i]))
    limits <- if (is.na(minmax[1]) || is.na(minmax[2])) {
      range(ref.sample)
    }
    else {
      minmax
    }
    inverse.ref.ecdf <- approxfun(y, kn, method = "linear", 
                                  yleft = limits[1], yright = limits[2], ties = "ordered")
    histMatchFun <- if (grepl("INT", terra::datatype(ref)[i]) | 
                        forceInteger) 
      function(values, na.rm = FALSE) {
        round(inverse.ref.ecdf(source.ecdf(values)))
      }
    else {
      function(values, na.rm = FALSE) {
        inverse.ref.ecdf(source.ecdf(values))
      }
    }
    histMatchFun
  })

  
  appfun<-function(x, f){
    result<-app(x, f)
    return(result)
  }
  
  tmp<-lapply(1:nlyr(x), function(i, x, f) appfun(x[[i]], f[[i]]), f=layerFun, x=x)
  
  out<-do.call(c, tmp)
  
  if (returnFunctions) {
    names(layerFun) <- names(x)
    return(layerFun)
  }
  RStoolbox:::.vMessage("Apply histogram match functions")
  
  if (!is.null(xmask)) 
    out <- merge(xfull, out, ..., overwrite = TRUE)
  names(out) <- names(x)
  out
}

HM2<-histMatch2(r1, r2, intersectOnly = FALSE, paired=FALSE)

##Observing performance of the revised histMatch2()##
X11()
par(mfrow=c(4,3))
plotRGB(r1[[c(1:3)]], scale=15000, main="Original")
plotRGB(r2[[c(1:3)]], scale=15000, main="Reference")
plotRGB(HM2[[c(1:3)]], scale=15000, main="Revised HistMatched")
hist(r1[[1]], breaks=100, xlim=c(0, 10000), main="Original Band 1")
hist(r2[[1]], breaks=100, xlim=c(0, 10000), main="Reference Band 1")
hist(HM2[[1]], breaks=100, xlim=c(0, 10000), main="Revised HistMatched Band 1")
hist(r1[[2]], breaks=100, xlim=c(0, 10000), main="Original Band 2")
hist(r2[[2]], breaks=100, xlim=c(0, 10000), main="Reference Band 2")
hist(HM2[[2]], breaks=100, xlim=c(0, 10000), main="Revised HistMatched Band 2")
hist(r1[[3]], breaks=100, xlim=c(0, 10000), main="Original Band 3")
hist(r2[[3]], breaks=100, xlim=c(0, 10000), main="Reference Band 3")
hist(HM2[[3]], breaks=100, xlim=c(0, 10000), main="Revised HistMatched Band 3")

本文标签: rApplying a list of functions to a multiband raster using terraapp()Stack Overflow