I have this data

library(ggeffects) # added lib 
library(tidyverse) # added lib 
data(efc, package = "ggeffects")
efc<-efc %>% na.omit()

and I run this regression

efc <- datawizard::to_factor(efc, c("c161sex", "c172code"))
mod <- lm(barthtot ~ c12hour + c161sex * c172code , efc)

Then I run the following commands

mydf <- predict_response(mod, terms = c( "c161sex","c172code"))
plot(mydf) 

I want to replicate this graph using ggplot, but I do not want to do that using the predicted values from mydf.

So I am doing something like that

library(marginaleffects) # added lib for "predictions"
efc$pred1 <- predictions(mod)[,2]
ggplot(efc) + geom_point(aes(x = c161sex, y = barthtot, color =c172code))  + facet_wrap(vars(c172code))+
  geom_line(aes(x =c161sex, y = pred1, color= c172code))  

But the resulting plot is not the same as that obtained from mydf.

Asked Mar 9 at 20:35 by mariann; edited Mar 9 at 21:09 by Tim G.
  • Where does the function predictions come from? predictions is not a base R function. When using functions that are not base R functions, please start the script with a call to library(pkgname) to load the packages needed. – Rui Barradas, commented Mar 9 at 20:49
  • @RuiBarradas marginaleffects::predictions, but we don't need to load an additional package. efc$pred1 <- stats::predict(mod) would give the same results. – M--, commented Mar 9 at 21:13
  • Hi guys, thank you for your comments. @RuiBarradas I am already familiar with ggpredict, but as @M-- said, this produces the same result as predictions. – mariann, commented Mar 9 at 21:28

2 Answers


When you call this:

predictions(mod)

The marginaleffects function will return one prediction for every row in the original dataset. Clearly, that is not what you want to plot.

If you look at the documentation for ggeffects, you'll see that your command computes a predicted value for every unique combination of c161sex and c172code while holding all other predictors at their means or modes.
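
To make the difference concrete, here is a minimal base R sketch. It uses the built-in mtcars data as a stand-in for efc (an assumption, so that the snippet runs without extra packages); mod_demo, dat, and grid are illustrative names, not objects from the question. It contrasts one prediction per row with one prediction per factor combination.

```r
# Stand-in data: built-in mtcars instead of efc (assumption for illustration only).
dat <- transform(mtcars, am = factor(am), cyl = factor(cyl))
mod_demo <- lm(mpg ~ hp + am * cyl, data = dat)

# Row-wise predictions: one fitted value per observation.
row_pred <- predict(mod_demo)
length(row_pred) == nrow(dat)

# Within a single am x cyl cell the fitted values still differ,
# because the covariate hp varies from row to row.
cell_counts <- tapply(
  row_pred, list(dat$am, dat$cyl),
  function(p) length(unique(round(p, 8)))
)
cell_counts

# Grid predictions: one value per am x cyl combination, hp fixed at its mean.
grid <- expand.grid(
  hp  = mean(dat$hp),
  am  = levels(dat$am),
  cyl = levels(dat$cyl)
)
grid$estimate <- predict(mod_demo, newdata = grid)
grid
```

Plotting row_pred against a factor gives several y values per x position, which is why a line drawn through them does not match a plot built from the grid.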

You can achieve the same result in predictions() using the newdata argument and the datagrid() helper function.

library(tidyverse)
library(marginaleffects)
data(efc, package = "ggeffects")
efc <- efc %>% na.omit()
efc <- datawizard::to_factor(efc, c("c161sex", "c172code"))
mod <- lm(barthtot ~ c12hour + c161sex * c172code, efc)

pred <- predictions(mod, newdata = datagrid(
  c161sex = unique, c172code = unique))

ggplot(pred) +
  geom_pointrange(
    aes(
      x = c161sex,
      y = estimate,
      ymax = conf.high,
      ymin = conf.low,
      color = c172code),
    position = position_dodge(width = 2)) +
  facet_wrap(~c161sex)

You can replicate this in vanilla ggplot as follows.

First create your model

mod <- lm(barthtot ~ c12hour + c161sex * c172code , efc)

Now use expand.grid to create a little data frame with all levels of c161sex and c172code, with c12hour held at its mean value:

pred_df <- expand.grid(c12hour = mean(efc$c12hour),
                       c161sex = unique(efc$c161sex),
                       c172code = unique(efc$c172code))

We can now get predictions for these factor levels:

preds <- predict(mod, pred_df, se.fit = TRUE)

Next, we add these predictions to the little data frame, using the se.fit to get the lower and upper confidence intervals:

pred_df$barthtot <- preds$fit
pred_df$upper <- preds$fit + preds$se.fit * qnorm(0.975)
pred_df$lower <- preds$fit + preds$se.fit * qnorm(0.025)
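
The intervals here are built from normal quantiles. As a possible alternative, predict() for lm objects can return t-based confidence intervals directly via interval = "confidence". The sketch below again uses the built-in mtcars data as a stand-in for efc (an assumption for illustration; mod_demo and grid are hypothetical names) and checks that the result matches a manual calculation with qt().

```r
# Stand-in data and model (assumption: mtcars replaces efc for a runnable demo).
dat <- transform(mtcars, am = factor(am), cyl = factor(cyl))
mod_demo <- lm(mpg ~ hp + am * cyl, data = dat)
grid <- expand.grid(
  hp  = mean(dat$hp),
  am  = levels(dat$am),
  cyl = levels(dat$cyl)
)

# t-based 95% confidence intervals straight from predict():
# returns a matrix with columns fit, lwr, upr.
ci <- predict(mod_demo, newdata = grid, interval = "confidence", level = 0.95)

# The same intervals by hand, using the residual degrees of freedom.
p <- predict(mod_demo, newdata = grid, se.fit = TRUE)
t_crit <- qt(0.975, df = p$df)
manual <- cbind(
  fit = p$fit,
  lwr = p$fit - t_crit * p$se.fit,
  upr = p$fit + t_crit * p$se.fit
)

all.equal(unname(ci), unname(manual))
```

Because qt(0.975, df) exceeds qnorm(0.975) for any finite df, the t-based intervals are slightly wider than the normal-quantile ones; with a large sample the difference is small.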

Finally, we can draw our plot with this data frame using ggplot with geom_linerange and geom_point. I have styled the plot to look similar to the output of predict_response.

ggplot(pred_df, aes(c161sex, barthtot, colour = c172code)) +
  geom_linerange(aes(ymin = lower, ymax = upper), 
                 position = position_dodge(0.2)) +
  geom_point(position = position_dodge(0.2)) +
  scale_color_brewer(palette = "Set1") +
  scale_x_discrete(expand = c(0, 0.2)) +
  ggtitle("Predicted values of barthtot") +
  theme_minimal() +
  theme(axis.line = element_line(linewidth = 0.3, colour = "gray"))
