admin管理员组

文章数量:1312783

I want to include participants as a "random effect" into a log linear model (poisson model), to account for the variance between participants. My data comes from a contingency table and I am predicting expected from observed, and don't have a variable y. As I don't have a variable y (that I would predict from x1, x2,...), I cant used normal mixed effects models (lme4, etc. in R)

Here is some simplified R code (with just two participants): There are three variables:" awareness" and "narrative" - where I want to test the interaction of, while treating participant as random effect.

observed <- c( 7, 2, 8, 22,0, 0, 14, 2,8, 22,0, 0)
awareness <- 
factor(c("aware","aware","aware","unaware","unaware","unaware","aware","aware","aware","unaware","unaware","unaware"))
narrative <- factor(c("M1","MU2","MU3","M1","MU2","MU3","M1","MU2","MU3","M1","MU2","MU3"))
#participant <- factor(c("1", "1", "1","1", "1", "1","2", "2", "2","2", "2", "2")
model1 <- glm(observedawareness*narrative,poisson)
model2 <- glm(observedawareness+narrative,poisson)
anova(model1,model2,test="Chi")
summary(model2)

Right now, the model does not include participant as a random effect, how would I do that?

Or alternatively, with 4 dimensions:

y <-structure(c(7.33, 13.5, 7.5, 8.3, 3.16, 2, 3.3, 5.3, 9.5, 4.2, 9.2, 6.5, 8.33, 14.5, 8.5, 9.3, 4.16, 3, 4.3, 6.3, 10.5, 5.2, 10.2, 7.5), dim = c(2L, 2L, 2L, 3L), dimnames = structure(list(c("ParticipantOne", "ParticipantTwo"), c("above threshold", "below threshold"), c("Healthy controls", "Psychotic observers"), c("M1", "MU2", "MU3")), names = c("Awareness", "Group", "Narrative")), class = "table")
xy <- aperm(y, c(2, 2, 1, 3))
names(dimnames(xy)) <- c("Participant" ,"Group", "Awareness", "Narrative")
ftable(xy)
fourfoldplot(xy, margin = 2)

narrative <- gl(3,4)
group <- gl(2,1,12)
awareness <- gl(2,2,12)
participant <-gl(???)
model1 <- glm(as.vector(xy) ~narrative*group*awareness,poisson)
model2 <- update(model1, ~. -narrative:group:awareness)
anova(model1,model2,test="Chi")

The question is how I would include participants as a random effect into the model.

I want to include participants as a "random effect" into a log linear model (poisson model), to account for the variance between participants. My data comes from a contingency table and I am predicting expected from observed, and don't have a variable y. As I don't have a variable y (that I would predict from x1, x2,...), I cant used normal mixed effects models (lme4, etc. in R)

Here is some simplified R code (with just two participants): There are three variables:" awareness" and "narrative" - where I want to test the interaction of, while treating participant as random effect.

observed <- c( 7, 2, 8, 22,0, 0, 14, 2,8, 22,0, 0)
awareness <- 
factor(c("aware","aware","aware","unaware","unaware","unaware","aware","aware","aware","unaware","unaware","unaware"))
narrative <- factor(c("M1","MU2","MU3","M1","MU2","MU3","M1","MU2","MU3","M1","MU2","MU3"))
#participant <- factor(c("1", "1", "1","1", "1", "1","2", "2", "2","2", "2", "2")
model1 <- glm(observedawareness*narrative,poisson)
model2 <- glm(observedawareness+narrative,poisson)
anova(model1,model2,test="Chi")
summary(model2)

Right now, the model does not include participant as a random effect, how would I do that?

Or alternatively, with 4 dimensions:

y <-structure(c(7.33, 13.5, 7.5, 8.3, 3.16, 2, 3.3, 5.3, 9.5, 4.2, 9.2, 6.5, 8.33, 14.5, 8.5, 9.3, 4.16, 3, 4.3, 6.3, 10.5, 5.2, 10.2, 7.5), dim = c(2L, 2L, 2L, 3L), dimnames = structure(list(c("ParticipantOne", "ParticipantTwo"), c("above threshold", "below threshold"), c("Healthy controls", "Psychotic observers"), c("M1", "MU2", "MU3")), names = c("Awareness", "Group", "Narrative")), class = "table")
xy <- aperm(y, c(2, 2, 1, 3))
names(dimnames(xy)) <- c("Participant" ,"Group", "Awareness", "Narrative")
ftable(xy)
fourfoldplot(xy, margin = 2)

narrative <- gl(3,4)
group <- gl(2,1,12)
awareness <- gl(2,2,12)
participant <-gl(???)
model1 <- glm(as.vector(xy) ~narrative*group*awareness,poisson)
model2 <- update(model1, ~. -narrative:group:awareness)
anova(model1,model2,test="Chi")

The question is how I would include participants as a random effect into the model.

Share Improve this question edited Feb 1 at 17:06 jay.sf 73.8k8 gold badges63 silver badges125 bronze badges asked Feb 1 at 14:26 Marianne BroekerMarianne Broeker 112 bronze badges 1
  • I don't really follow your question. You first say you're predicting expected from observed, which would suggest that expected is the y variable, but then you say you don't have a y variable. In the model, it looks as though observed is the outcome variable, but the formula is missing the ~ in model 1 and model 2. I am not sure how your hypothetical example matches up to your real-world data. – DaveArmstrong Commented Feb 1 at 15:24
Add a comment  | 

1 Answer 1

Reset to default 3

I couldn't reproduce how you transpose with aperm, but as.data.frame.table worked for me.

## make data frame
df <- as.data.frame.table(y) |> 
  setNames(c("awareness", "group", "participant", "narrative", "y"))

It would then also be a good idea to use the data frame in the models.

To get numbers as factor levels, do:

## integerize levels
fac <- sapply(df, is.factor)
df[fac] <- lapply(df[fac], \(x) as.integer(x) |> as.factor())

Now, turning to the actual problem, for your glms, since they look poisson-ish but are continuous, you probably want quasipoisson (and the warnings will vanish).

model1 <- glm(y ~ narrative*group*awareness, data=df, quasipoisson)
model2 <- update(model1, ~ . -narrative:group:awareness)

> anova(model1, model2, test="Chi")
Analysis of Deviance Table

Model 1: y ~ narrative * group * awareness
Model 2: y ~ narrative + group + awareness + narrative:group + narrative:awareness + 
    group:awareness
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1        12     26.229                     
2        14     27.265 -2  -1.0364   0.7766

I'm not aware of quasipoissons with lme4 or glmmTMB random effects, but glmmTMB with family=Gamma(link='log') could be suitable; I think it fits well the y which is positive and right-skewed.

library(glmmTMB)
model3 <- glmmTMB(y ~ narrative*group*awareness + (1 | participant), 
               data=df, family=Gamma(link='log')) 

Gives:

> summary(model3)
 Family: Gamma  ( log )
Formula:          y ~ narrative * group * awareness + (1 | participant)
Data: df

     AIC      BIC   logLik deviance df.resid 
   143.5    160.0    -57.8    115.5       10 

Random effects:

Conditional model:
 Groups      Name        Variance  Std.Dev. 
 participant (Intercept) 7.231e-11 8.503e-06
Number of obs: 24, groups:  participant, 2

Dispersion estimate for Gamma family (sigma^2): 0.167 

Conditional model:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   1.657276   0.288652   5.741 9.39e-09 ***
narrative2                    0.530459   0.408216   1.299   0.1938    
narrative3                    0.334699   0.408216   0.820   0.4123    
group2                        0.029123   0.408216   0.071   0.9431    
awareness2                    0.390416   0.408216   0.956   0.3389    
narrative2:group2            -0.036440   0.577305  -0.063   0.9497    
narrative3:group2            -0.040096   0.577305  -0.069   0.9446    
narrative2:awareness2        -0.342775   0.577305  -0.594   0.5527    
narrative3:awareness2        -0.971404   0.577305  -1.683   0.0924 .  
group2:awareness2            -0.159892   0.577305  -0.277   0.7818    
narrative2:group2:awareness2 -0.001305   0.816432  -0.002   0.9987    
narrative3:group2:awareness2  0.691400   0.816432   0.847   0.3971    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

To be sure if a gamma distribution fits best, you can still ask on Cross Validated.


Data (from OP):

y <- structure(c(7.33, 13.5, 7.5, 8.3, 3.16, 2, 3.3, 5.3, 9.5, 4.2, 
9.2, 6.5, 8.33, 14.5, 8.5, 9.3, 4.16, 3, 4.3, 6.3, 10.5, 5.2, 
10.2, 7.5), dim = c(2L, 2L, 2L, 3L), dimnames = structure(list(
    c("ParticipantOne", "ParticipantTwo"), c("above threshold", 
    "below threshold"), c("Healthy controls", "Psychotic observers"
    ), c("M1", "MU2", "MU3")), names = c("Awareness", "Group", 
"Narrative", NA)), class = "table")

本文标签: Integrating participant as a random effect into log linearpoisson modelin RStack Overflow