admin管理员组文章数量:1123931
I would like to run iterations of a single model, substituting one of a set of 34 different response variables in each iteration, and organize the results (from summary()
) of all of those models into a table to be exported with write.csv()
, with the results arranged row-wise, naming each row with the character name (not the column index) of each response variable.
I think that "all" I need to do is use apply()
to achieve this, but how can I implement it?
Example model
mod1 <- gam(response.1+1 ~ s(predictor), bs = "cs", family = Gamma(link = "log"), method = "REML", data = ex.dat)
summary(mod1)
Example data: `
ex.dat <-
structure(list(id = 1:20, site = c(1L, 1L, 1L, 1L, 1L, 1L,
1L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), predictor = c(3660.0384,
3660.0384, 3660.0384, 3660.0384, 3660.0384, 3660.0384, 3660.0384,
3499.7136, 3499.7136, 3499.7136, 3499.7136, 3477.1584, 3477.1584,
3477.1584, 3477.1584, 3477.1584, 3477.1584, 3477.1584, 3477.1584,
3477.1584), response.1 = c(65012L, 688717L, 0L, 0L, 338092L,
216134L, 0L, 1719602L, 692439L, 0L, 0L, 432584L, 131228L, 113546L,
0L, 928748L, 111161L, 393458L, 272371L, 2650333L), response.2 = c(114937L,
0L, 0L, 0L, 736729L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 291939L, 232576L,
0L, 0L, 0L, 0L, 0L, 0L), response.3 = c(709637L, 2487151L, 197847L,
346499L, 2018978L, 617447L, 2467347L, 0L, 0L, 839880L, 637807L,
3345034L, 3315138L, 739640L, 3739090L, 9092537L, 5014285L, 4383957L,
4723850L, 6943785L)), row.names = c(NA, 20L), class = "data.frame")
I would like to run iterations of a single model, substituting one of a set of 34 different response variables in each iteration, and organize the results (from summary()
) of all of those models into a table to be exported with write.csv()
, with the results arranged row-wise, naming each row with the character name (not the column index) of each response variable.
I think that "all" I need to do is use apply()
to achieve this, but how can I implement it?
Example model
mod1 <- gam(response.1+1 ~ s(predictor), bs = "cs", family = Gamma(link = "log"), method = "REML", data = ex.dat)
summary(mod1)
Example data: `
ex.dat <-
structure(list(id = 1:20, site = c(1L, 1L, 1L, 1L, 1L, 1L,
1L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), predictor = c(3660.0384,
3660.0384, 3660.0384, 3660.0384, 3660.0384, 3660.0384, 3660.0384,
3499.7136, 3499.7136, 3499.7136, 3499.7136, 3477.1584, 3477.1584,
3477.1584, 3477.1584, 3477.1584, 3477.1584, 3477.1584, 3477.1584,
3477.1584), response.1 = c(65012L, 688717L, 0L, 0L, 338092L,
216134L, 0L, 1719602L, 692439L, 0L, 0L, 432584L, 131228L, 113546L,
0L, 928748L, 111161L, 393458L, 272371L, 2650333L), response.2 = c(114937L,
0L, 0L, 0L, 736729L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 291939L, 232576L,
0L, 0L, 0L, 0L, 0L, 0L), response.3 = c(709637L, 2487151L, 197847L,
346499L, 2018978L, 617447L, 2467347L, 0L, 0L, 839880L, 637807L,
3345034L, 3315138L, 739640L, 3739090L, 9092537L, 5014285L, 4383957L,
4723850L, 6943785L)), row.names = c(NA, 20L), class = "data.frame")
Share
Improve this question
edited yesterday
JKO
asked yesterday
JKOJKO
3071 silver badge13 bronze badges
1
|
1 Answer
Reset to default 2You could inject the symbol for each response variable into the formula using rlang::expr()
:
fit_gam <- function(response_var) {
# inject (using !!) the response_var into the formula:
formula <- rlang::eval_tidy(
rlang::expr(!!response_var + 1 ~ s(predictor, k = 3))
)
# pass the formula to `gam()`
mod1 <- mgcv::gam(formula, bs = "cs", family = Gamma(link = "log"),
method = "REML", data = ex.dat)
summary(mod1)
}
df <-
# generate a list of symbols for your response variables
lapply(paste0("response.", 1:3), as.symbol) |>
# map this to `fit_gam()`
lapply(fit_gam) |>
# row-bind the summary objects to make a data frame
do.call(rbind, args = _) |> as.data.frame()
Some notes:
- Since
summary.gam()
dumps so much info, the resulting data frame is very wide and has some columns for which each cell is a table in itself. You won't be able to write this to a csv. You could write a wrapper to use in place ofsummary()
that just gets the info you want, for example:
extract_fit_info <- function(fit) {
with(summary(fit), c(
"R^2" = r.sq,
"Intercept" = p.coeff[["(Intercept)"]]
))
}
When we drop this in in place of summary()
, we get:
df
#> R^2 Intercept
#> 1 0.02320420 12.869125
#> 2 -0.04244942 9.045422
#> 3 0.50855255 14.399754
- If your variable names don't all follow a pattern, you could input them using
alist()
. For example, instead oflapply(paste0("response.", 1:3), as.symbol)
, do:
alist(response.1, response.2, response.3)
- You can also do this in base R using
base::call()
. Instead offormula <- rlang::eval_tidy( ... )
, use:
formula <- as.formula(
call("~", call("+", response_var, 1), quote(s(predictor, k = 3)))
)
I think it is more readable with rlang, though.
We use
lapply()
instead ofapply()
sincelapply()
works well for lists whileapply()
is mostly useful for matrices.purrr::map()
is another good option here.Note that I added
k = 3
to the formula so it wouldn't throw an error.
本文标签: rRunning parallel models and compiling resultsStack Overflow
版权声明:本文标题:r - Running parallel models and compiling results - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1736610480a1945431.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
response.1
andpredictor
are in your model formula but not your example data. – Eonema Commented yesterday