I am looking to run terra::as.matrix on a fairly large raster stack (70 GB) and get a std::bad_alloc memory error immediately. I have seen similar questions (e.g., #562 on the terra GitHub repo and elsewhere here) with various potential solutions, but I am uncertain which path is most appropriate for my situation. I have no experience splitting rasters into chunks or sections and processing each separately, so I am hoping there is a way to configure terra to deal with the issue.
I tried changing terraOptions(), including, e.g., terraOptions(memfrac=0.9) and terraOptions(steps=55).
The raster requires more RAM to process than I have; I boosted the memory fraction (memfrac) from 0.6 to 0.9. See:
mem_info(x)
------------------------
Memory (GB)
------------------------
check threshold : 1 (memmin)
available : 17.45
allowed (90%) : 15.7
needed (n=1) : 136.66
------------------------
proc in memory : FALSE
nr chunks : 10
------------------------
The SpatRaster object comprises 8 layers: two for coordinates, and the others hold values from 6 constituent raster layers.
class : SpatRaster
dimensions : 42700, 53693, 8 (nrow, ncol, nlyr)
resolution : 10, 10 (x, y)
extent : 228888.5, 765818.5, 4807436, 5234436 (xmin, xmax, ymin, ymax)
coord. ref. : NAD_1983_CSRS_v6_UTM_Zone_20N
source : spat_6d442d83216d_27972.tif
names : xCoord, yCoord, chm, ndvi, hwba, swba, ...
min values : 5.247202e-07, 5.247202e-07, 0.0000000, 0.0000000, 0.00000000, 0.000000, ...
max values : 1.104672e-02, 8.504140e-03, 0.1438089, 0.1625947, 0.07315826, 0.522824, ...
The issue is related to "Error std::bad_alloc using `terra::extract` on large stack and many points" and "operation on a very large raster in terra causes std::bad_alloc", but I couldn't find a solution there that solves my issue.
Here's a reprex of my code:
library(terra)
terraOptions(memfrac=0.9)
#terraOptions(steps = 55)
Trans.env.table <- terra::as.matrix(x)
mem_info(x)
I do see that proc in memory is defaulting to FALSE.
The code chunk I'm looking to adapt is as follows:
# transRasts = a raster stack of GDM-transformed layers
# put the values from the transformed layers in a table for easy analysis
Trans.env.table <- as.matrix(transRasts)
col.longs<-xFromCol(transRasts)
row.lats<-yFromRow(transRasts)
Cell_Long<-rep(col.longs, times=nrow(transRasts))
Cell_Lat<-rep(row.lats, each=ncol(transRasts), times=1)
Trans.env.table<-cbind(Cell_Long, Cell_Lat, Trans.env.table)
Trans.env.table <- Trans.env.table[complete.cases(Trans.env.table),]
# specify the number of random samples of grid cells to use in the clustering procedure
n.sub <- 500
# specify the number of community types to derive
n.cat <- 100
# Then take a random sample of grid cells from the transformed environment data
sub.Trans.env <- Trans.env.table[sample(nrow(Trans.env.table), n.sub),]
# Then loop through and determine the predicted dissimilarity between each pair of
# cells in the random set
sub.dissimilarity <- matrix(0, n.sub, n.sub)
colnames(sub.dissimilarity)<-c(1:n.sub)
rownames(sub.dissimilarity)<-c(1:n.sub)
for(i.col in 1:(n.sub-1))
{
for(i.row in (i.col+1):n.sub)
{
ecol.dist <- sum(abs(sub.Trans.env[i.col,c(3:ncol(sub.Trans.env))] -
sub.Trans.env[i.row,c(3:ncol(sub.Trans.env))]))
sub.dissimilarity[i.row,i.col] <- 1 - exp(-1 * (gdmRastMod$intercept + ecol.dist))
sub.dissimilarity[i.col,i.row] <- sub.dissimilarity[i.row,i.col]
} # end for i.row
} # end for i.col
# Now apply hierarchical clustering to the subsample dissimilarity matrix
sub.dissimilarity<-as.dist(sub.dissimilarity)
class.results<-hclust(sub.dissimilarity, method = "ward.D")
class.membership <- cutree(class.results, k = n.cat)
# Now run through all grid cells, and allocate them to the class of the
# most similar cell in the training set
# takes 5 mins with 500 samples
cell.class <- rep(1, length=nrow(Trans.env.table))
for(i.cell in 1:nrow(Trans.env.table))
{
max.similarity <- 0
i.cell.class <- 1
for(i.sub in 1:n.sub)
{
ecol.dist <- sum(abs(Trans.env.table[i.cell,c(3:ncol(Trans.env.table))] -
sub.Trans.env[i.sub,c(3:ncol(sub.Trans.env))]))
similarity <- exp(-1 * (gdmRastMod$intercept + ecol.dist))
if(similarity > max.similarity)
{
max.similarity <- similarity
i.cell.class <- class.membership[i.sub]
} # end if
} # end for i.sub
cell.class[i.cell] <- i.cell.class
} # end for i.cell
# Convert the results to a raster
gdm.class.ras <- raster(transRasts,layer=1)
gdm.class.ras <- rasterize(Trans.env.table[,c(1:2)],
gdm.class.ras,
field=cell.class)
# Plot the community classes ~~~~~~~~~~~~~
plot(gdm.class.ras)
Asked by Sean Basquill
1 Answer
mem_info(x) suggests that you have 17.5 GB of memory available, but that you need 137 GB to read the entire file into memory (the file size on disk differs because of compression). So you cannot do that.
The need is computed as the number of cells times the number of layers times 8 bytes (for a double-precision numeric value):
42700 * 53693 * 8 * 8 / 2^30
#[1] 136.655
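The same arithmetic can be wrapped in a small helper to check other rasters before calling as.matrix. This is only a sketch: raster_gb is a hypothetical name, not a terra function, and it assumes every value is held as an 8-byte double with no overhead for copies made by cbind or complete.cases.

```r
# Hypothetical helper (not part of terra): GiB needed to hold all values
# of a raster in memory as doubles, ignoring any intermediate copies.
raster_gb <- function(nrow, ncol, nlyr, bytes_per_value = 8) {
  nrow * ncol * nlyr * bytes_per_value / 2^30
}

# Dimensions taken from the SpatRaster in the question
need <- raster_gb(42700, 53693, 8)
round(need, 3)
#[1] 136.655

# Compare against the memory terra reported as allowed (15.7 GB)
need > 15.7
```

In practice the peak is higher than this figure, because binding two coordinate columns onto the matrix and then subsetting it each create another full-size copy.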
Perhaps the more important question is why you think you need as.matrix at all.
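One possible way to avoid it, sketched under assumptions rather than taken from the question: draw the random cells with spatSample, cluster that small sample, and let app assign every cell to the class of its nearest sample chunk by chunk. classify_chunk and the intercept placeholder are hypothetical names, the demo raster is synthetic, and the sketch relies on the fact that the most similar sample under exp(-(intercept + distance)) is simply the one with the smallest Manhattan distance.

```r
# Hypothetical helper: for each row of `cells` (cells x layers), return the
# class of the nearest row of `ref` under Manhattan distance. Rows containing
# NA get NA. Only one distance vector is held at a time.
classify_chunk <- function(cells, ref, ref_class) {
  out <- rep(NA_integer_, nrow(cells))
  ok <- stats::complete.cases(cells)
  if (!any(ok)) return(out)
  v <- cells[ok, , drop = FALSE]
  best_d <- rep(Inf, nrow(v))
  best_j <- rep(NA_integer_, nrow(v))
  for (j in seq_len(nrow(ref))) {
    d <- rowSums(abs(sweep(v, 2, ref[j, ])))  # distance to sample j
    better <- d < best_d                       # strict: first sample wins ties
    best_d[better] <- d[better]
    best_j[better] <- j
  }
  out[ok] <- as.integer(ref_class[best_j])
  out
}

if (requireNamespace("terra", quietly = TRUE)) {
  set.seed(1)
  # Small synthetic stand-in for the transformed raster stack
  r <- terra::rast(nrows = 20, ncols = 30, nlyrs = 3,
                   vals = runif(20 * 30 * 3))
  # Random sample of cells, read without loading the whole raster
  ref <- as.matrix(terra::spatSample(r, size = 50, method = "random",
                                     na.rm = TRUE))
  intercept <- 0  # placeholder for gdmRastMod$intercept
  diss <- 1 - exp(-(intercept + as.matrix(dist(ref, method = "manhattan"))))
  cl <- cutree(hclust(as.dist(diss), method = "ward.D"), k = 5)
  # app() reads, classifies and writes the raster in chunks
  classes <- terra::app(r, classify_chunk, ref = ref, ref_class = cl,
                        filename = tempfile(fileext = ".tif"))
}
```

With this layout the only objects held fully in memory are the sample and its dissimilarity matrix; the result is already a raster, so the final rasterize step is not needed.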