I want to manually set the number of predictors for lasso to 5, including the intercept term if any, so that I can compare the predictors lasso selects with those of a different model. With real data and about a hundred potential predictors, I set dfmax = 4 in glmnet. The solution has 7 non-zero coefficients, not 4. Is clunky iterative fudging of lambda my best option?
Example:
library(glmnet)
set.seed(42)
n.obs <- 100
n.predictors <- 10
y <- rnorm(n.obs)
x <- matrix(rnorm(n.obs * n.predictors), ncol = n.predictors)
# Add some pattern to the data and give predictors different strengths
for (i in 1:n.predictors) {x[ , i] <- x[ , i] + y * i}
m.lambda <- cv.glmnet(x = x, y = y, alpha = 1)$lambda.min
m.lasso <- glmnet(x = x, y = y, alpha = 1, lambda = m.lambda, dfmax = 2)
coef(m.lasso)
# All predictors are retained
Help for glmnet says dfmax is to “Limit the maximum number of variables in the model. Useful for very large nvars, if a partial path is desired.” Part of the problem may be that I don’t understand what a partial path is in this context and Googling hasn’t helped. (For example, in Shortest Partial Path in a Graph, “partial path - is a path that doesn't have to visit every node”).
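For context, a minimal sketch of what a "path" looks like in glmnet, reusing the example data above. When lambda is not supplied, glmnet fits a decreasing sequence of lambda values and records the number of non-zero coefficients (excluding the intercept) for each one in the df component. The reading of "partial path" as that sequence stopped early is an assumption here, not something the help page spells out, and the names fit.full, fit.partial, target and s.target are illustrative only.

```r
library(glmnet)

# Same simulated data as in the example above
set.seed(42)
n.obs <- 100
n.predictors <- 10
y <- rnorm(n.obs)
x <- matrix(rnorm(n.obs * n.predictors), ncol = n.predictors)
for (i in 1:n.predictors) {x[ , i] <- x[ , i] + y * i}

# Full path: no lambda supplied, so glmnet chooses its own sequence
fit.full <- glmnet(x = x, y = y, alpha = 1)

# Non-zero predictor count (intercept not counted) at each lambda
print(data.frame(lambda = fit.full$lambda, df = fit.full$df))

# Assumption: with dfmax, the path is cut short rather than each fit being capped
fit.partial <- glmnet(x = x, y = y, alpha = 1, dfmax = 4)
print(data.frame(lambda = fit.partial$lambda, df = fit.partial$df))

# Pick a lambda from the full path with exactly the desired count, if one exists
target <- 4
idx <- which(fit.full$df == target)
if (length(idx) > 0) {
  s.target <- fit.full$lambda[max(idx)]  # smallest lambda with that count
  print(coef(fit.full, s = s.target))
}
```

Reading a lambda off the path this way only works when some lambda in the sequence happens to give the target count; a finer sequence (larger nlambda) may be needed otherwise.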
This question is the same as one of the questions in Behaviour of dfmax in glmnet, which looked perfect but has no answers. Another question is relevant to me but was closed as off-topic for CrossValidated. Its only comment and only answer seem to contradict each other. They do suggest the possibility that dfmax limits lasso to a specific set of predictors, not a specific count.
Follow-up: I ended up coding an iteration to find the multiplier for m.lambda that would yield the predictor count I wanted. A simple split-the-difference approach that updated floor and ceiling values for the multiplier converged readily for each of my half-dozen equations. I'll leave this question open in case someone does know and can share how to use dfmax, or failing that, for someone who wants a reminder that a homemade workaround may be easy.
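The split-the-difference search described in the follow-up might look roughly like the sketch below. This is a reconstruction under assumptions, not the code actually used: the helper count.nonzero, the variables floor.mult, ceiling.mult and found.mult, the target of 4 predictors (5 coefficients counting the intercept) and the cap of 50 iterations are all made up for illustration. It also assumes the predictor count shrinks as lambda grows, which usually holds but is not guaranteed, so the search can finish without hitting the target exactly.

```r
library(glmnet)

# Same simulated data as in the example above
set.seed(42)
n.obs <- 100
n.predictors <- 10
y <- rnorm(n.obs)
x <- matrix(rnorm(n.obs * n.predictors), ncol = n.predictors)
for (i in 1:n.predictors) {x[ , i] <- x[ , i] + y * i}

m.lambda <- cv.glmnet(x = x, y = y, alpha = 1)$lambda.min

# Hypothetical helper: non-zero predictor count (intercept excluded)
# for a single fit at lambda = m.lambda * multiplier
count.nonzero <- function(multiplier) {
  fit <- glmnet(x = x, y = y, alpha = 1, lambda = m.lambda * multiplier)
  fit$df
}

target <- 4        # predictors wanted, not counting the intercept
floor.mult <- 1    # multiplier giving at least `target` predictors
ceiling.mult <- 1  # multiplier giving at most `target` predictors

# Widen the bracket until it straddles the target count
while (count.nonzero(floor.mult) < target) floor.mult <- floor.mult / 2
while (count.nonzero(ceiling.mult) > target) ceiling.mult <- ceiling.mult * 2

# Split the difference until the count matches or the iteration cap is reached
found.mult <- NA
for (iter in 1:50) {
  mid.mult <- (floor.mult + ceiling.mult) / 2
  k <- count.nonzero(mid.mult)
  if (k == target) {
    found.mult <- mid.mult
    break
  }
  if (k > target) floor.mult <- mid.mult else ceiling.mult <- mid.mult
}

if (!is.na(found.mult)) {
  m.lasso <- glmnet(x = x, y = y, alpha = 1, lambda = m.lambda * found.mult)
  print(coef(m.lasso))
}
```

If found.mult is still NA after the loop, the count probably jumps past the target between two neighbouring lambda values, and the nearest achievable count has to do.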