I have data that looks like this:
expected_data
| | resp_migration_status | kmcluster | percentage | expected |
|---|---|---|---|---|
| 1 | Non-migrant | 1 | 21.9 | 30.5 |
| 2 | Non-migrant | 2 | 30.1 | 27.4 |
| 3 | Non-migrant | 3 | 24.7 | 19.9 |
| 4 | Non-migrant | 4 | 23.3 | 22.3 |
| 5 | Migrant | 1 | 41.9 | 30.5 |
| 6 | Migrant | 2 | 22.6 | 27.4 |
| 7 | Migrant | 3 | 19.4 | 19.9 |
| 8 | Migrant | 4 | 16.1 | 22.3 |
| 9 | Displaced | 1 | 36.9 | 30.5 |
| 10 | Displaced | 2 | 26.2 | 27.4 |
| 11 | Displaced | 3 | 11.9 | 19.9 |
| 12 | Displaced | 4 | 25 | 22.3 |
I'd like to construct a bar graph that shows percentage for each kmcluster within each resp_migration_status. I've done this successfully using this code:
ggplot(expected_data, aes(x = resp_migration_status, y = percentage, fill = kmcluster)) +
  geom_bar(stat = "identity", position = "dodge") + # Use stat = "identity" for pre-computed values
  labs(
    title = "Percentage distribution of network cluster by migration status",
    x = "Migration Status",
    y = "Percentage",
    fill = "Cluster"
  ) +
  scale_y_continuous(labels = scales::percent_format(scale = 1)) + # Format y-axis as percentages
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
Overlaid on this bar graph, I'd like a second graph with black outlines for the bars, showing the expected percentage for each kmcluster within each resp_migration_status. Essentially, it's a graphical representation of a chi-square test: what the distribution of clusters would be by migration type if it were perfectly random, compared to the 'actual' distribution, where some migration types are disproportionately in one cluster.
How do I overlay a very basic (black-outlined) bar graph on the original graph to represent this? I have this code:
ggplot(expected_data, aes(x = resp_migration_status, y = expected, fill = kmcluster)) +
  geom_bar(stat = "identity", position = "dodge", color = "black", fill = NA) + # Use stat = "identity" for pre-computed values, bars with black outlines
  labs(
    title = "Expected percentage distribution of network cluster by migration status",
    x = "Migration Status",
    y = "Percentage"
  ) +
  scale_y_continuous(labels = scales::percent_format(scale = 1)) + # Format y-axis as percentages
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
But setting fill = NA inside geom_bar() overrides the fill = kmcluster mapping in aes(), so the data are no longer split by cluster and the result is a strange stacked-looking bar for each migration status.
So the first question is:
- How do I divide the data by migration type and cluster, without coloring in each bar and instead just outlining them in black?
Secondly:
- How do I overlay this bar graph on top of the original one?
1 Answer
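To draw your second set of bars on top of the first, you have to explicitly map kmcluster to the group aesthetic so that the outlined bars are still dodged. Setting fill = NA as a constant removes the fill mapping from that layer, and with it the grouping that position = "dodge" relies on.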
library(ggplot2)
ggplot(expected_data, aes(
  x = resp_migration_status,
  y = percentage, fill = factor(kmcluster)
)) +
  geom_col(position = "dodge") +
  geom_col(aes(y = expected, group = factor(kmcluster)),
    color = "black", fill = NA, position = "dodge"
  ) +
  labs(
    title = "Percentage distribution of network cluster by migration status",
    x = "Migration Status",
    y = "Percentage",
    fill = "Cluster"
  ) +
  scale_y_continuous(labels = scales::percent_format(scale = 1)) + # Format y-axis as percentages
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
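One way to check that the explicit group mapping restores the dodging is to inspect the data each layer actually draws, using ggplot2's layer_data(). The sketch below is only a sanity check under the assumption that both layers use the default bar width. The object names p, filled and outlined are introduced here for illustration. It rebuilds the same data so it can be run on its own.

```r
library(ggplot2)

# Same values as the question's data, rebuilt so the sketch is self-contained
expected_data <- data.frame(
  resp_migration_status = rep(c("Non-migrant", "Migrant", "Displaced"), each = 4),
  kmcluster = rep(1:4, times = 3),
  percentage = c(21.9, 30.1, 24.7, 23.3, 41.9, 22.6, 19.4, 16.1, 36.9, 26.2, 11.9, 25),
  expected = rep(c(30.5, 27.4, 19.9, 22.3), times = 3)
)

# Layer 1: filled observed bars; layer 2: outlined expected bars with explicit group
p <- ggplot(
  expected_data,
  aes(x = resp_migration_status, y = percentage, fill = factor(kmcluster))
) +
  geom_col(position = "dodge") +
  geom_col(
    aes(y = expected, group = factor(kmcluster)),
    colour = "black", fill = NA, position = "dodge"
  )

filled <- layer_data(p, 1)   # computed data for the filled bars
outlined <- layer_data(p, 2) # computed data for the outlined bars

# Number of groups the outlined layer is split into
length(unique(outlined$group))

# Do the left edges of the outlined bars line up with the filled bars?
all.equal(sort(as.numeric(filled$xmin)), sort(as.numeric(outlined$xmin)))
```

If the left edges match, each outlined bar sits directly over the filled bar for the same cluster and migration status.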
DATA
expected_data <- data.frame(
  resp_migration_status = c(
    "Non-migrant", "Non-migrant", "Non-migrant", "Non-migrant",
    "Migrant", "Migrant", "Migrant", "Migrant",
    "Displaced", "Displaced", "Displaced", "Displaced"
  ),
  kmcluster = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
  percentage = c(
    21.9, 30.1, 24.7, 23.3,
    41.9, 22.6, 19.4, 16.1,
    36.9, 26.2, 11.9, 25
  ),
  expected = c(
    30.5, 27.4, 19.9, 22.3,
    30.5, 27.4, 19.9, 22.3,
    30.5, 27.4, 19.9, 22.3
  )
)
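Note that the expected column repeats the same four values for every migration status. That is what independence implies: the expected row percentage reduces to the overall share of each cluster. Below is a minimal sketch of how such a column could be derived from a table of counts with base R's chisq.test(). The counts are invented purely for illustration and are not the asker's data. The names counts, observed_pct, expected_pct and long are likewise hypothetical.

```r
# Hypothetical counts (rows: migration status, columns: cluster), for illustration only
counts <- matrix(
  c(20, 30, 25, 25,
    40, 20, 20, 20,
    30, 25, 15, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    resp_migration_status = c("Non-migrant", "Migrant", "Displaced"),
    kmcluster = c("1", "2", "3", "4")
  )
)

test <- chisq.test(counts)

# Row percentages: observed, and expected under independence
observed_pct <- 100 * prop.table(counts, margin = 1)
expected_pct <- 100 * prop.table(test$expected, margin = 1)

# Long format with one row per status/cluster pair, matching expected_data's columns
long <- merge(
  as.data.frame(as.table(observed_pct), responseName = "percentage"),
  as.data.frame(as.table(expected_pct), responseName = "expected")
)
long
```

The resulting data frame has the same shape as expected_data, so it could be passed to the plotting code above in its place.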
Tags: r · How can I layer an outlined bar graph on top of a colored bar graph in ggplot? · Stack Overflow
dput()
so we can copy/paste it directly into R for testing. See how to create a reproducible example. – MrFlick Commented Mar 19 at 14:57