

I was wondering if there is a way to add confidence intervals for the row percentages created using gtsummary.

Example code:

# Load required packages
library(gtsummary)
library(survey)

# Set seed for reproducibility (any fixed value works)
set.seed(123)

# Create a reproducible dataset
n <- 300
data <- data.frame(
  treatment = sample(c("Control", "Intervention"), n, replace = TRUE),
  sex = sample(c("Male", "Female", NA), n, replace = TRUE, prob = c(0.48, 0.48, 0.04)),
  age = round(rnorm(n, mean = 55, sd = 12), 0),
  bmi = round(rnorm(n, mean = 28, sd = 5), 1),
  smoker = sample(c("Yes", "No", NA), n, replace = TRUE, prob = c(0.2, 0.75, 0.05))
)

# Create a survey design object (required for tbl_svysummary)
des <- svydesign(ids = ~1, weights = ~1, data = data)  # equal weights stated explicitly to avoid the "no weights" warning

# Define the variables to include in the summary table
shared_variables <- c("sex", "age", "bmi", "smoker")

# Optionally, create a custom label function for better table display
create_labels <- function() {
  list(
    sex    = "Sex",
    age    = "Age (years)",
    bmi    = "BMI",
    smoker = "Smoking Status"
  )
}

# Create the survey summary table with row percentages

tbl <- tbl_svysummary(
  data = des,
  by = treatment,  # Grouping variable (can be binary or categorical)
  include = shared_variables,
  missing = "always",
  percent = "row",  # Row percentages
  missing_text = "Missing/Refused",
  digits = list(
    all_categorical() ~ c(0, 0, 3),
    all_continuous()  ~ 1
  ),
  label = create_labels(),
  statistic = list(
    all_categorical() ~ "{n} ({p}%) {p.std.error} {N_unweighted}"
  )
)

add_ci() only computes confidence intervals for column percentages, not row percentages. Notes:

  • The variable passed to the by argument differs from problem to problem (it can be binary or categorical).
  • shared_variables is a character vector of variable names (sex, age, etc.), and those variables contain missing values.
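To make explicit what a row-percentage interval means here, the following is a minimal sketch that bypasses gtsummary and asks the survey package directly for the distribution of treatment within each level of sex, then takes Wald-type intervals with confint(). The names toy and toy_des and the seed are made up for illustration, and the toy data deliberately has no missing values, so this is only an approximation of the setup in the question.

```r
library(survey)

set.seed(1)  # arbitrary seed, only so the toy example is repeatable
n <- 200
toy <- data.frame(
  treatment = factor(sample(c("Control", "Intervention"), n, replace = TRUE)),
  sex       = factor(sample(c("Male", "Female"), n, replace = TRUE))
)

# Equal weights, mirroring the unweighted design used in the question
toy_des <- svydesign(ids = ~1, weights = ~1, data = toy)

# Row percentages: proportion in each treatment group within each level of sex
row_pct <- svyby(~treatment, ~sex, toy_des, svymean)

# Wald-type 95% confidence intervals for those proportions
row_ci <- confint(row_pct, level = 0.95)

print(row_pct)
print(row_ci)
```

Within each level of sex the estimated proportions sum to 1, which is what distinguishes them from the column percentages. Getting these numbers into the gtsummary table would still need a separate step.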

I tried building the intervals myself from the {p} and {p.std.error} statistics. Specifically:

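confidence_intervals <- function(data, variable, by, tbl, ...) {
  # NOTE: `.x` is not defined in this snippet. It stands for the formatted cell
  # text produced by the statistic template above: "{n} ({p}%) {p.std.error} {N_unweighted}"
  p_value_raw <- stringr::str_extract(.x, "(?<=\\()\\d+(?=%\\))")  # digits inside "(..%)"
  se_value_raw <- stringr::str_extract(.x, "(?<=\\)\\s)0\\.\\d+")   # decimal right after ") "
  p_value <- suppressWarnings(as.numeric(trimws(p_value_raw)))
  se_value <- suppressWarnings(as.numeric(trimws(se_value_raw)))
  # Wald interval on the proportion scale: p +/- 1.96 * SE
  confidence_interval_upper <- (p_value / 100) + 1.96 * se_value
  confidence_interval_lower <- (p_value / 100) - 1.96 * se_value
  paste(confidence_interval_lower, confidence_interval_upper)
}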

The idea was then to apply this function with add_stat().
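For reference, the calculation the function is aiming for is the usual Wald interval, p ± z × SE, on the proportion scale. A minimal base-R sketch of just that arithmetic is below. wald_ci is a hypothetical helper name, not a gtsummary function, and clamping the bounds to [0, 1] is an added assumption.

```r
# Wald interval for a proportion: p +/- z * SE, clamped to [0, 1].
# `wald_ci` is a hypothetical helper, not part of gtsummary or survey.
wald_ci <- function(p, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  data.frame(
    lower = pmax(p - z * se, 0),
    upper = pmin(p + z * se, 1)
  )
}

# Example: a row percentage of 48% with a standard error of 0.031
ci <- wald_ci(p = 48 / 100, se = 0.031)
print(round(ci, 3))
```

Working from the numeric estimate and standard error, rather than parsing them back out of formatted strings, avoids losing precision to rounding in the displayed percentage.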
