For a study, I need to generate five complete data sets for each of 100 incomplete data sets, using the mice package in R.

This code works correctly for a single data set, df1:

df1_imp <- mice(df1, m = 5, method = 'logreg', print = FALSE)

The five completed data sets can then be accessed as follows:

dataset1 <- complete(df1_imp, 1)
dataset2 <- complete(df1_imp, 2)
dataset3 <- complete(df1_imp, 3)
dataset4 <- complete(df1_imp, 4)
dataset5 <- complete(df1_imp, 5)

Fine. However, I have 100 incomplete data sets. Each will yield 5 complete data sets (500 in total). How can I access all 500 data sets so that I can analyze them?

My data set list, dfs (a sample of 3 data sets; each must produce 5 complete data sets, so 3 x 5 = 15):

list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 
0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 
1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA, 
NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 0, 1, 
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA, 
1, 0, 0, 0, 1, 1, 0), dim = 6:5))

Asked Jan 30 at 7:04 by MetehanGungor
  • 1 There is no need to view the data, apart from checking imputation qc. Run analysis on all datasets, then use pool to get summary result over all the datasets. See these links: rmisstasticlify.app/tutorials/… and stackoverflow/questions/51370292/… – zx8754 Commented Jan 30 at 8:10
  • 4 Use (eg) lapply to process each of your 100 incomplete datasets. Something like dfs_imp_all <- lapply(dfs, mice, m = 5, method = 'logreg', print = FALSE) [untested code]. dfs_imp_all will be a list of 100 elements. Each element will contain the 5 imputed datasets for the corresponding element of dfs. – Limey Commented Jan 30 at 8:27
  • Thank you @Limey. If you post this as an answer, I can accept it as the correct answer. – MetehanGungor Commented Jan 31 at 8:35
1 Answer

Call complete() with action='all' to get all imputed datasets at once; include=FALSE (the default) excludes the original, un-imputed dataset. For simulation studies you may also want to specify a seed so the imputations are reproducible.

> library(mice)
> seed. <- 42
> lapply(raw_data, mice, m=5, method='pmm', seed=seed., printFlag=FALSE) |> 
+   lapply(complete, action='all', include=FALSE)
[[1]]
$`1`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`2`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`3`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`4`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  1  1
6  0  0  1  0  1

$`5`
  V1 V2 V3 V4 V5
1  1  1  0  0  0
2  0  0  0  1  1
3  0  1  1  0  1
4  1  1  0  1  1
5  0  0  1  0  1
6  0  0  1  0  1

attr(,"class")
[1] "mild" "list"

[[2]]
$`1`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`2`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`3`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`4`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  0  1
5  1  1  0  0  1
6  0  0  1  1  1

$`5`
  V1 V2 V3 V4 V5
1  1  0  0  1  0
2  1  0  0  0  1
3  0  0  1  1  1
4  1  0  1  1  1
5  1  1  0  0  1
6  0  0  1  1  1

attr(,"class")
[1] "mild" "list"

[[3]]
$`1`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`2`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`3`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`4`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

$`5`
  V1 V2 V3 V4 V5
1  1  1  0 NA  0
2  0  0  0  1  0
3  1  1  1  0  0
4  0  0  1  1  1
5  0  0  1 NA  1
6  0  0  0  1  0

attr(,"class")
[1] "mild" "list"

Warning messages:
1: Number of logged events: 30 
2: Number of logged events: 30 
3: Number of logged events: 2 
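
To pick out individual completed datasets, or to treat all of them as one flat collection, it may help to store the result instead of printing it. The following is only a sketch under a few assumptions: the labels set1..set3 and imp1..imp5 are made up for illustration, and the column means at the end are a stand-in for whatever analysis you actually need.

```r
library(mice)

# The three sample matrices from the Data section of this answer.
raw_data <- list(
  structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
              0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
              1, 0, 1, 1, 0, NA, NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5),
  structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
              1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0), dim = 6:5)
)
names(raw_data) <- paste0("set", seq_along(raw_data))  # hypothetical labels

# One mids object per incomplete dataset (same call as above).
# The "Number of logged events" warnings are expected with this toy data.
imp_all <- lapply(raw_data, mice, m = 5, method = "pmm", seed = 42,
                  printFlag = FALSE)

# Nested list: completed[[i]][[j]] is imputation j of dataset i.
completed <- lapply(imp_all, function(imp) {
  out <- lapply(seq_len(imp$m), function(j) complete(imp, j))
  setNames(out, paste0("imp", seq_len(imp$m)))  # hypothetical labels
})

# A single completed dataset, by name or by position.
one <- completed[["set3"]][["imp2"]]  # same as completed[[3]][[2]]

# Flatten one level to get a single list of 3 x 5 = 15 data frames.
flat <- unlist(completed, recursive = FALSE)
length(flat)       # 15
names(flat)[1:2]   # "set1.imp1" "set1.imp2"

# Placeholder analysis: column means of every completed dataset,
# one row per dataset (na.rm because the third dataset keeps its NAs).
col_means <- t(sapply(flat, colMeans, na.rm = TRUE))
```

With your 100 datasets the same code should give a flat list of 500 data frames. If the end goal is a model fit rather than descriptive statistics, it is usually better to keep the mids objects in imp_all and work from those, as suggested in the comments under the question.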

Notes

  1. For a serious simulation study, you probably need to set m somewhat higher than 5; see an earlier answer.
  2. In your example, imputation of the third dataset fails due to collinearities, which is why the NA values in V4 are still present in its output above. You can investigate by setting printFlag=TRUE and not piping into complete.
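
As a rough illustration of note 2, the sketch below looks at the third dataset on its own. It assumes only that the mids object returned by mice carries a loggedEvents component, which is where the "Number of logged events" warnings come from; the variable names third and imp3 are made up here.

```r
library(mice)

# Third sample matrix from the Data section.
third <- structure(c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
                     1, 1, 0, NA, 1, 0, 1, NA, 1, 0, 0, 0, 1, 1, 0),
                   dim = 6:5)

# Base R check: the first two columns are exact copies of each other,
# which is one source of collinearity among the predictors.
identical(third[, 1], third[, 2])  # TRUE

# Impute this dataset alone and keep the mids object.
imp3 <- mice(third, m = 5, method = "pmm", seed = 42, printFlag = FALSE)

# Table of events that mice logged while setting up and running
# the imputation model for this dataset.
imp3$loggedEvents

# Count the missing values left in the first completed dataset.
colSums(is.na(complete(imp3, 1)))
```

If the logged events point to duplicated or constant columns, dropping or recoding them before calling mice is one thing to try. Whether that is appropriate depends on how your 100 datasets were generated.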

Data:

> dput(raw_data)
list(structure(c(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 
0, 1, 1, 0, 1, NA, 1, NA, 0, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 
1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA, 
NA, 0, 1, 0, 1, 1, 1, 1, 1), dim = 6:5), structure(c(1, 0, 1, 
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, NA, 1, 0, 1, NA, 
1, 0, 0, 0, 1, 1, 0), dim = 6:5))
