I'm attempting to mutate three numeric columns into percentages based on the sum of each column, using mutate(across(.cols = c(...))). My usual method of doing this for one column works very well:

mutate(`Percentage`= ((`Count`/sum(.$`Count`))*100))

When I apply a similar principle to a multi-column mutate call using .x or . to stand in for all called values, it instead divides each value by itself. Where am I going wrong?

# Sample Code

library(dplyr)

test<-data.frame(fruit=c("Apples","Pears","Bananas"),
                 `John`=c(1,13,34),
                 `Jacob`=c(5,9,2))%>%
  group_by(`fruit`)%>%
  mutate(`Total`=sum(`John`,`Jacob`))

# A tibble: 3 × 4
# Groups:   fruit [3]
  fruit    John Jacob Total
  <chr>   <dbl> <dbl> <dbl>
1 Apples      1     5     6
2 Pears      13     9    22
3 Bananas    34     2    36

fruiteaten<-test%>%
  mutate(across(.cols=c(`John`,
                        `Jacob`,
                        `Total`), .fns = ~ ((.x/sum(.x))*100)))

# Output

# A tibble: 3 × 4
# Groups:   fruit [3]
  fruit    John Jacob Total
  <chr>   <dbl> <dbl> <dbl>
1 Apples    100   100   100
2 Pears     100   100   100
3 Bananas   100   100   100

# Desired Output (shown as column proportions; multiply by 100 for percentages)

  fruit    John Jacob Total
1 Apples   0.02  0.31  0.09
2 Pears    0.27  0.56  0.34
3 Bananas  0.70  0.12  0.56

Asked Feb 4 at 20:01 by Mary Rachel
Comment (Michael Dewar, Feb 5 at 5:47): I suggest you use mutate(Total = John + Jacob). Then you don't need to group by fruit.
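
The suggestion in the comment above can be sketched as follows. This is a minimal illustration using the sample data from the question; it assumes that `Total` is meant to be the per-row sum of `John` and `Jacob`, which vectorised addition gives without any grouping.

```r
library(dplyr)

# Same sample data as in the question, but never grouped
test <- data.frame(fruit = c("Apples", "Pears", "Bananas"),
                   John  = c(1, 13, 34),
                   Jacob = c(5, 9, 2)) %>%
  mutate(Total = John + Jacob)  # element-wise addition gives one total per row

# With no groups, sum(.x) is the sum of the whole column
fruiteaten <- test %>%
  mutate(across(c(John, Jacob, Total), ~ .x / sum(.x) * 100))
```

Because the data frame is never grouped, each of the three columns in `fruiteaten` sums to 100.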

1 Answer

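You should not be grouping. Because test is grouped by fruit and each group holds a single row, sum(x) is evaluated within each group and equals the value itself, so every result is 100. Ungroup first: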

test |> 
    ungroup() |> 
    mutate(across(-fruit, \(x) x / sum(x) * 100))

# A tibble: 3 × 4
  fruit    John Jacob Total
  <chr>   <dbl> <dbl> <dbl>
1 Apples   2.08  31.2  9.38
2 Pears   27.1   56.2 34.4 
3 Bananas 70.8   12.5 56.2 
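
To see where the 100s come from, it may help to compare the denominator in three situations. This is a small sketch on the question's sample data; the column name `denominator` and the object names are only illustrative. It also suggests why the single-column version in the question works on grouped data: with `%>%`, the placeholder `.` is the whole piped data frame, so `sum(.$John)` ignores the groups.

```r
library(dplyr)

# Sample data from the question, grouped by fruit as in the original code
grouped <- data.frame(fruit = c("Apples", "Pears", "Bananas"),
                      John  = c(1, 13, 34),
                      Jacob = c(5, 9, 2)) %>%
  group_by(fruit)

# Grouped: sum(John) is computed per one-row group, so it equals John itself
per_group <- grouped %>% mutate(denominator = sum(John))

# `.` is the entire piped data frame, so this sums the full column despite the groups
whole_frame <- grouped %>% mutate(denominator = sum(.$John))

# Ungrouped: sum(John) is the column total
ungrouped <- grouped %>% ungroup() %>% mutate(denominator = sum(John))
```

Here `per_group$denominator` repeats the values of `John`, while the other two hold the column total of 48 in every row.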

P.S. You only need `backticks` around names that are not syntactically valid in R, such as names containing spaces or other special characters.
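
As a brief illustration of the backtick rule, the sketch below uses a made-up data frame (the columns `Count` and `Fruit eaten` are hypothetical, not from the question). A syntactically valid name can be written bare, while a name containing a space has to be quoted with backticks.

```r
library(dplyr)

# Hypothetical data: one syntactic name and one name containing a space
df <- data.frame(Count = c(1, 3))
df[["Fruit eaten"]] <- c(2, 6)

result <- df %>%
  mutate(Percentage    = Count / sum(Count) * 100,                  # no backticks needed
         `Fruit share` = `Fruit eaten` / sum(`Fruit eaten`) * 100)  # backticks required
```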
