For a study, I need to generate five complete data sets for each of 100 incomplete data sets using the mice package in R.
This code works correctly (given a data set df1):
df1_imp <- mice(df1, m = 5, method = 'logreg', print = F)
The five completed data sets can then be accessed as follows:
dataset1 <- complete(df1_imp, 1)
dataset2 <- complete(df1_imp, 2)
dataset3 <- complete(df1_imp, 3)
dataset4 <- complete(df1_imp, 4)
dataset5 <- complete(df1_imp, 5)
That works. However, I have 100 incomplete data sets, and each will yield 5 complete data sets (500 in total). How can I access all 500 data sets so that I can analyze them?
My list of data sets, dfs (a sample of three; each must produce 5 complete data sets, 3 x 5 = 15):
list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1,
1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA,
NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 0, 1,
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA,
1, 0, 0, 0, 1, 1, 0), dim = 6:5))
asked Jan 30 at 7:04 by MetehanGungor
1 Answer
In complete(), select action='all' and include=FALSE to exclude the un-imputed dataset. For simulation studies you may want to specify a seed.
> library(mice)
> seed. <- 42
> lapply(raw_data, mice, m=5, method='pmm', seed=seed., printFlag=FALSE) |>
+ lapply(complete, action='all', include=FALSE)
[[1]]
$`1`
V1 V2 V3 V4 V5
1 1 1 0 0 0
2 0 0 0 1 1
3 0 1 1 0 1
4 1 1 0 1 1
5 0 0 1 1 1
6 0 0 1 0 1
$`2`
V1 V2 V3 V4 V5
1 1 1 0 0 0
2 0 0 0 1 1
3 0 1 1 0 1
4 1 1 0 1 1
5 0 0 1 1 1
6 0 0 1 0 1
$`3`
V1 V2 V3 V4 V5
1 1 1 0 0 0
2 0 0 0 1 1
3 0 1 1 0 1
4 1 1 0 1 1
5 0 0 1 1 1
6 0 0 1 0 1
$`4`
V1 V2 V3 V4 V5
1 1 1 0 0 0
2 0 0 0 1 1
3 0 1 1 0 1
4 1 1 0 1 1
5 0 0 1 1 1
6 0 0 1 0 1
$`5`
V1 V2 V3 V4 V5
1 1 1 0 0 0
2 0 0 0 1 1
3 0 1 1 0 1
4 1 1 0 1 1
5 0 0 1 0 1
6 0 0 1 0 1
attr(,"class")
[1] "mild" "list"
[[2]]
$`1`
V1 V2 V3 V4 V5
1 1 0 0 1 0
2 1 0 0 0 1
3 0 0 1 1 1
4 1 0 1 0 1
5 1 1 0 0 1
6 0 0 1 1 1
$`2`
V1 V2 V3 V4 V5
1 1 0 0 1 0
2 1 0 0 0 1
3 0 0 1 1 1
4 1 0 1 0 1
5 1 1 0 0 1
6 0 0 1 1 1
$`3`
V1 V2 V3 V4 V5
1 1 0 0 1 0
2 1 0 0 0 1
3 0 0 1 1 1
4 1 0 1 0 1
5 1 1 0 0 1
6 0 0 1 1 1
$`4`
V1 V2 V3 V4 V5
1 1 0 0 1 0
2 1 0 0 0 1
3 0 0 1 1 1
4 1 0 1 0 1
5 1 1 0 0 1
6 0 0 1 1 1
$`5`
V1 V2 V3 V4 V5
1 1 0 0 1 0
2 1 0 0 0 1
3 0 0 1 1 1
4 1 0 1 1 1
5 1 1 0 0 1
6 0 0 1 1 1
attr(,"class")
[1] "mild" "list"
[[3]]
$`1`
V1 V2 V3 V4 V5
1 1 1 0 NA 0
2 0 0 0 1 0
3 1 1 1 0 0
4 0 0 1 1 1
5 0 0 1 NA 1
6 0 0 0 1 0
$`2`
V1 V2 V3 V4 V5
1 1 1 0 NA 0
2 0 0 0 1 0
3 1 1 1 0 0
4 0 0 1 1 1
5 0 0 1 NA 1
6 0 0 0 1 0
$`3`
V1 V2 V3 V4 V5
1 1 1 0 NA 0
2 0 0 0 1 0
3 1 1 1 0 0
4 0 0 1 1 1
5 0 0 1 NA 1
6 0 0 0 1 0
$`4`
V1 V2 V3 V4 V5
1 1 1 0 NA 0
2 0 0 0 1 0
3 1 1 1 0 0
4 0 0 1 1 1
5 0 0 1 NA 1
6 0 0 0 1 0
$`5`
V1 V2 V3 V4 V5
1 1 1 0 NA 0
2 0 0 0 1 0
3 1 1 1 0 0
4 0 0 1 1 1
5 0 0 1 NA 1
6 0 0 0 1 0
attr(,"class")
[1] "mild" "list"
Warning messages:
1: Number of logged events: 30
2: Number of logged events: 30
3: Number of logged events: 2
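The nested result can also be flattened into one named list, which is often easier to loop over when analyzing every completed data set. The following is only a sketch under assumptions: the labels df1, df2, ... and the object names imp_nested, imp_flat and col_means are made up for illustration, and the column means stand in for whatever analysis is actually planned.

```r
library(mice)

# Example data from the post: three 6 x 5 matrices with a few NAs
raw_data <- list(
  structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
              0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
              1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
              1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)
)
names(raw_data) <- paste0("df", seq_along(raw_data))  # hypothetical labels

# Impute each data set (m = 5) and extract all completed versions
imp_nested <- lapply(raw_data, mice, m = 5, method = "pmm", seed = 42,
                     printFlag = FALSE) |>
  lapply(complete, action = "all", include = FALSE)

# Flatten one level: 3 x 5 = 15 data frames named df1.1, df1.2, ..., df3.5
imp_flat <- unlist(imp_nested, recursive = FALSE)
length(imp_flat)
names(imp_flat)[1:5]

# Apply an analysis to every completed data set, here simply the column means
# (one row per completed data set, one column per variable)
col_means <- t(sapply(imp_flat, colMeans))
```

With 100 incomplete data sets the same code would give a flat list of 500 data frames, and a single data set can be picked by name, for example imp_flat[["df2.3"]].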
Notes
- For a serious simulation study, you probably need to set m= somewhat higher, see an earlier answer.
- In your example, imputation of the third dataset fails due to collinearities. You can investigate by setting printFlag=TRUE and not piping into complete.
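One possible way to check for failed imputations programmatically is to keep the mids objects and look at them before calling complete(). This is a rough sketch, not a full diagnostic: the names imps and still_missing are illustrative, and loggedEvents is the component of the mids object where mice records the events counted in the warnings.

```r
library(mice)

# Example data from the post: three 6 x 5 matrices with a few NAs
raw_data <- list(
  structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
              0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
              1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
              1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)
)

# Keep the mids objects instead of piping straight into complete()
imps <- lapply(raw_data, mice, m = 5, method = "pmm", seed = 42,
               printFlag = FALSE)

# Which data sets still contain NA in their first completed version?
still_missing <- sapply(imps, function(imp) anyNA(complete(imp, 1)))
still_missing

# Events logged by mice for the third data set (NULL if nothing was logged)
imps[[3]]$loggedEvents
```

In a simulation with 100 data sets, a vector like still_missing makes it easy to count or exclude the replications where imputation did not fill every cell.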
Data:
> dput(raw_data)
list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1,
1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA,
NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 0, 1,
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA,
1, 0, 0, 0, 1, 1, 0), dim = 6:5))
Use lapply to process each of your 100 incomplete datasets. Something like dfs_imp_all <- lapply(dfs, mice, m = 5, method = 'logreg', print = FALSE) [untested code]. dfs_imp_all will be a list of 100 elements. Each element will contain the 5 imputed datasets for the corresponding element of dfs. – Limey, commented Jan 30 at 8:27
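A small sketch of how such a list of imputation objects might be indexed afterwards. It assumes the three sample matrices from the question stand in for the 100 data sets, swaps in method = "pmm" (as in the answer) instead of 'logreg', and uses a hypothetical helper get_completed that is not part of mice.

```r
library(mice)

# Sample list from the question: three 6 x 5 matrices with a few NAs
dfs <- list(
  structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
              0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
              1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
              1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)
)

# One mids object per incomplete data set, each holding m = 5 imputations
dfs_imp_all <- lapply(dfs, mice, m = 5, method = "pmm", seed = 42,
                      printFlag = FALSE)

# Hypothetical helper: completed version k of data set j
get_completed <- function(imps, j, k) complete(imps[[j]], k)

# Third completed version of the second data set
d23 <- get_completed(dfs_imp_all, 2, 3)
dim(d23)
```

Keeping the mids objects rather than only the completed data frames leaves the imputation settings and diagnostics available for later inspection.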