I am trying to report counts of diagnoses and subdiagnoses using gtsummary. After some searching, the closest thing I've found is this thread on reporting subcategories, but it falls short of what I need in two ways:

  1. It reports counts only for the subcategories, not for each category.
  2. It does not account for subcategories being mutually exclusive across categories, so subcategories that are irrelevant to a category are not excluded from the table.

For example, diagnosis data is stored in two variables. Not all diagnoses or subdiagnoses are present in the data, but I want to report all of them while excluding subcategories that are irrelevant to a given diagnosis.

df = data.frame(id=1:12,
                diag1=factor(c("A","A","A","A","B","B","C","C","C","C","C","C"),
                             levels=c("A","B","C","D")),
                diag2=factor(c("a1","a2","a1","a2","b1","b2","c1","c2","c3","c1","c1","c1"),
                             levels=c("a1","a2","a3","b1","b2","b3","c1","c2","c3"))
               )
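
As a quick sanity check of this structure (a minimal base R sketch, not part of the gtsummary attempt; the name `xt` is just illustrative), a cross-tabulation shows that the factors keep the unused levels D, a3 and b3, and that each subdiagnosis occurs under only one diagnosis:

```r
# Same example data as above, repeated so the snippet runs on its own
df <- data.frame(
  id = 1:12,
  diag1 = factor(c("A","A","A","A","B","B","C","C","C","C","C","C"),
                 levels = c("A","B","C","D")),
  diag2 = factor(c("a1","a2","a1","a2","b1","b2","c1","c2","c3","c1","c1","c1"),
                 levels = c("a1","a2","a3","b1","b2","b3","c1","c2","c3"))
)

# Cross-tabulate diagnosis by subdiagnosis; unused factor levels are kept,
# so the table has 4 rows (A-D) and 9 columns (a1-c3)
xt <- table(df$diag1, df$diag2)
xt

# Per-diagnosis totals: A = 4, B = 2, C = 6, D = 0
rowSums(xt)
```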

The previous example suggests something like:

library(gtsummary)
library(dplyr)  # provides mutate(), across() and %>% used below
tbl <- 
  tbl_strata(
    data = df,
    strata = diag1,
    \(df_subset) {
      tbl_summary(
        data = df_subset,
        include = diag2,
        missing = "no",
        label = list(diag1 = "Diagnosis", diag2= "Subdiagnosis")
      ) |> 
        modify_table_body(
          ~.x %>% mutate(across(all_stat_cols(),~gsub("^0.*", "-", .)))
        ) |> 
        remove_row_type(type = "header") |> 
        modify_header(all_stat_cols() ~ "**{level}**")
    },
    .combine_with = "tbl_stack"
  )

But here diagnosis D is omitted, and I have no control over excluding irrelevant subdiagnoses within each category. Additionally, no counts are reported for the diagnoses themselves. Ideally, the output would contain counts like the following, where each diagnosis reports "n (% of N)" and each subdiagnosis reports "m (% of n)":

N 12
A 4 (33.3%)
a1 2 (50.0%)
a2 2 (50.0%)
a3 0 (0.0%)
B 2 (16.7%)
b1 1 (50.0%)
b2 1 (50.0%)
b3 0 (0.0%)
C 6 (50.0%)
c1 4 (66.7%)
c2 1 (16.7%)
c3 1 (16.7%)
D 0 (0.0%)
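
To make the target unambiguous, here is a rough base R sketch that computes these numbers directly. It is not a gtsummary solution, only a reference for what the table should contain. It assumes that a subdiagnosis belongs to the diagnosis whose letter it starts with (a1 belongs to A, and so on); that mapping, and the names `sub_parent`, `fmt` and `target`, are made up for illustration.

```r
# Same example data as above, repeated so the snippet runs on its own
df <- data.frame(
  id = 1:12,
  diag1 = factor(c("A","A","A","A","B","B","C","C","C","C","C","C"),
                 levels = c("A","B","C","D")),
  diag2 = factor(c("a1","a2","a1","a2","b1","b2","c1","c2","c3","c1","c1","c1"),
                 levels = c("a1","a2","a3","b1","b2","b3","c1","c2","c3"))
)

# ASSUMPTION: the parent diagnosis of a subdiagnosis is its first letter, upper-cased
sub_parent <- toupper(substr(levels(df$diag2), 1, 1))

# Format "count (percent%)"; a zero denominator is reported as 0.0%
fmt <- function(n, denom) {
  pct <- if (denom > 0) 100 * n / denom else 0
  sprintf("%d (%.1f%%)", n, pct)
}

N <- nrow(df)
rows <- list(data.frame(label = "N", stat = as.character(N)))

for (d in levels(df$diag1)) {
  # Diagnosis row: n out of the overall N
  n_d <- sum(df$diag1 == d)
  rows[[length(rows) + 1]] <- data.frame(label = d, stat = fmt(n_d, N))

  # Subdiagnosis rows: only the levels that belong to this diagnosis,
  # with the diagnosis count n as the denominator
  for (s in levels(df$diag2)[sub_parent == d]) {
    n_s <- sum(df$diag1 == d & df$diag2 == s)
    rows[[length(rows) + 1]] <- data.frame(label = s, stat = fmt(n_s, n_d))
  }
}

target <- do.call(rbind, rows)
target
```

Here `target` has one row per line of the desired output above, including the empty levels a3, b3 and D. What I am missing is a way to get the same structure out of gtsummary.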
