r - Error std::bad_alloc using `terra::as.matrix` on larger SpatRaster - Stack Overflow


Updated: 2025-01-12


I want to run terra::as.matrix on a fairly large raster stack (70 GB), but I get a std::bad_alloc memory error immediately. I have seen similar questions (e.g., #562 on the terra GitHub repo and elsewhere here) with various potential solutions, but I am uncertain which path is most appropriate for my situation. I have no experience splitting rasters into chunks or sections and processing each separately, so I am hoping there is a way to configure terra to deal with the issue.

I tried changing terraOptions(), e.g., terraOptions(memfrac=0.9) and terraOptions(steps=55).

The raster requires more RAM to process than I have, even after I boosted the allowed proportion (memfrac) from 0.6 to 0.9. See:

mem_info(x)

------------------------
Memory (GB) 
------------------------
check threshold : 1 (memmin)
available       : 17.45
allowed (90%)   : 15.7
needed (n=1)    : 136.66
------------------------
proc in memory  : FALSE
nr chunks       : 10
------------------------

The SpatRaster object comprises 8 layers: two hold coordinates, and the other six hold values from the constituent raster layers.

class       : SpatRaster 
dimensions  : 42700, 53693, 8  (nrow, ncol, nlyr)
resolution  : 10, 10  (x, y)
extent      : 228888.5, 765818.5, 4807436, 5234436  (xmin, xmax, ymin, ymax)
coord. ref. : NAD_1983_CSRS_v6_UTM_Zone_20N 
source      : spat_6d442d83216d_27972.tif 
names       :       xCoord,       yCoord,       chm,      ndvi,       hwba,     swba, ... 
min values  : 5.247202e-07, 5.247202e-07, 0.0000000, 0.0000000, 0.00000000, 0.000000, ... 
max values  : 1.104672e-02, 8.504140e-03, 0.1438089, 0.1625947, 0.07315826, 0.522824, ...

The issue is related to "Error std::bad_alloc using `terra::extract` on large stack and many points" and "operation on a very large raster in terra causes std::bad_alloc", but I couldn't find a solution to my problem in either.

Here's a reprex of my code:

library(terra)

terraOptions(memfrac=0.9)
#terraOptions(steps = 55)

Trans.env.table <- terra::as.matrix(x)

mem_info(x)

I do see that proc in memory is defaulting to FALSE.

The code chunk I'm looking to adapt is as follows:

# transRasts = a raster stack of GDM-transformed layers
# put the values from the transformed layers in a table for easy analysis  
Trans.env.table <- as.matrix(transRasts)  
col.longs<-xFromCol(transRasts)  
row.lats<-yFromRow(transRasts)  
Cell_Long<-rep(col.longs, times=nrow(transRasts))  
Cell_Lat<-rep(row.lats, each=ncol(transRasts), times=1)  
Trans.env.table<-cbind(Cell_Long, Cell_Lat, Trans.env.table)  
Trans.env.table <- Trans.env.table[complete.cases(Trans.env.table),]

# specify the number of random samples of grid cells to use in the clustering procedure
n.sub <- 500
# specify the number of community types to derive
n.cat <- 100

# Then take a random sample of grid cells from the transformed environment data
sub.Trans.env <- Trans.env.table[sample(nrow(Trans.env.table), n.sub),]
# Then loop through and determine the predicted dissimilarity between each pair of
# cells in the random set

sub.dissimilarity <- matrix(0, n.sub, n.sub)
colnames(sub.dissimilarity) <- c(1:n.sub)
rownames(sub.dissimilarity) <- c(1:n.sub)
for(i.col in 1:(n.sub-1))
{
  for(i.row in (i.col+1):n.sub)
  {
    ecol.dist <- sum(abs(sub.Trans.env[i.col,c(3:ncol(sub.Trans.env))] -
                         sub.Trans.env[i.row,c(3:ncol(sub.Trans.env))]))
    sub.dissimilarity[i.row,i.col] <- 1 - exp(-1 * (gdmRastMod$intercept + ecol.dist))
    sub.dissimilarity[i.col,i.row] <- sub.dissimilarity[i.row,i.col]
  } # end for i.row
} # end for i.col
# Now apply hierarchical clustering to the subsample dissimilarity matrix
sub.dissimilarity <- as.dist(sub.dissimilarity)
class.results <- hclust(sub.dissimilarity, method = "ward.D")
class.membership <- cutree(class.results, k = n.cat)
# Now run through all grid cells, and allocate them to the class of the
# most similar cell in the training set
# takes 5 mins with 500 samples
cell.class <- rep(1, length=nrow(Trans.env.table))
for(i.cell in 1:nrow(Trans.env.table))
{
  max.similarity <- 0
  i.cell.class <- 1
  for(i.sub in 1:n.sub)
  {
    ecol.dist <- sum(abs(Trans.env.table[i.cell,c(3:ncol(Trans.env.table))] -
                         sub.Trans.env[i.sub,c(3:ncol(sub.Trans.env))]))
    similarity <- exp(-1 * (gdmRastMod$intercept + ecol.dist))
    if(similarity > max.similarity)
    {
      max.similarity <- similarity
      i.cell.class <- class.membership[i.sub]
    } # end if
  } # end for i.sub
  cell.class[i.cell] <- i.cell.class
} # end for i.cell

# Convert the results to a raster
gdm.class.ras <- raster(transRasts,layer=1)
gdm.class.ras <- rasterize(Trans.env.table[,c(1:2)],
gdm.class.ras,
field=cell.class)
# Plot the community classes ~~~~~~~~~~~~~
plot(gdm.class.ras)


Asked 2 days ago by Sean Basquill; edited yesterday.


1 Answer


mem_info(x) suggests that you have 17.5 GB of memory available, but that you need 137 GB to read the entire file into memory (the file size on disk is smaller because of compression). So you cannot do that.

The need is computed as the number of cells (rows times columns) times the number of layers (8), times 8 bytes for each double-precision numeric value, divided by 2^30 to express it in GB:

42700 * 53693 * 8 * 8 / 2^30
#[1] 136.655
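
The same arithmetic can be wrapped in a small helper to see roughly how many raster rows would fit in a given memory budget. This is only a sketch: `raster_mem_gb` and `rows_that_fit` are hypothetical names, not terra functions, and they count a single copy of the values (the `n=1` case in the mem_info output), so real processing may need more.

```r
# Hypothetical helpers (not part of terra): estimate the memory needed to hold
# raster values as doubles, and how many full rows fit in a memory budget.
raster_mem_gb <- function(nrow, ncol, nlyr, bytes_per_value = 8) {
  # as.numeric() avoids integer overflow for large rasters
  as.numeric(nrow) * ncol * nlyr * bytes_per_value / 2^30
}

rows_that_fit <- function(ncol, nlyr, budget_gb, bytes_per_value = 8) {
  bytes_per_row <- as.numeric(ncol) * nlyr * bytes_per_value
  floor(budget_gb * 2^30 / bytes_per_row)
}

# Dimensions taken from the question: 42700 rows, 53693 columns, 8 layers
needed <- raster_mem_gb(42700, 53693, 8)
round(needed, 3)
#[1] 136.655

# Rows that would fit in the 15.7 GB that mem_info reports as allowed
rows_that_fit(53693, 8, budget_gb = 15.7)
```

The second number is a small fraction of the 42700 rows, which is why the values have to be handled in blocks rather than all at once.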

Perhaps the more important question is why you think you need as.matrix at all.
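
For the workflow in the question, one way to avoid as.matrix might be to cluster only the small random sample and then assign classes to the remaining cells one block at a time. The sketch below illustrates the block step in base R; it is an assumption about how this could be done, not a tested solution for the 70 GB file. `assign_nearest_class` is a hypothetical helper, and the toy matrices stand in for real raster values with the two coordinate layers already dropped.

```r
# Hypothetical helper (not part of terra): for each row of `block`, return the
# class of the reference row with the smallest Manhattan distance. Because
# exp(-(intercept + d)) decreases as d grows, the nearest reference cell is
# also the most similar one, so the intercept is not needed here.
assign_nearest_class <- function(block, ref, ref_class) {
  out <- rep(NA_integer_, nrow(block))
  ok <- stats::complete.cases(block)      # cells with NA values stay NA
  if (!any(ok)) return(out)
  b <- block[ok, , drop = FALSE]
  best_d <- rep(Inf, nrow(b))             # smallest distance found so far
  best_j <- rep(1L, nrow(b))              # index of the nearest reference row
  for (j in seq_len(nrow(ref))) {
    # Manhattan distance from every cell in the block to reference row j
    d <- rowSums(abs(sweep(b, 2, ref[j, ])))
    closer <- d < best_d
    best_d[closer] <- d[closer]
    best_j[closer] <- j
  }
  out[ok] <- ref_class[best_j]
  out
}

# Toy data: 5 sampled reference cells with 3 layers, clustered into 2 classes
set.seed(1)
ref <- matrix(runif(5 * 3), nrow = 5)
ref_class <- cutree(hclust(dist(ref, method = "manhattan"), method = "ward.D"), k = 2)

# A block of 3 cells: a copy of reference 2, a near-copy of reference 4,
# and a cell with a missing value
cells <- rbind(ref[2, ], ref[4, ] + 0.001, c(NA, 0.5, 0.5))
res <- assign_nearest_class(cells, ref, ref_class)
res
```

The helper keeps only a running minimum per cell, so its memory use grows with the block size rather than with block size times sample size. With terra, the reference sample could come from `spatSample()`, and the blocks could be read with `readValues()` and written with `writeValues()` inside a `writeStart()`/`writeStop()` loop, so the full matrix never has to exist in memory.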
