I have a table like this (below).
Gene_name | Sample_name | Gene_fraction |
---|---|---|
IGHV1-11 | sample_1 | 0.00057491 |
IGHV1-12 | sample_2 | 0.0044843 |
IGHV1-15 | sample_3 | 0.01253306 |
IGHV1-18 | sample_4 | 0.00942854 |
IGHV1-19 | sample_5 | 0.01747729 |
IGHV1-2 | sample_6 | 0.00034495 |
IGHV1-11 | sample_7 | 0.00103484 |
IGHV1-13 | sample_8 | 0.01517765 |
IGHV1-16 | sample_9 | 0.00758882 |
IGHV1-18 | sample_10 | 0.00827872 |
How can I transform the table above into a table like the one below in R?
Sample_name | IGHV1-11 | IGHV1-12 | IGHV1-15 | IGHV1-18 | IGHV1-19 | IGHV1-2 | IGHV1-13 | IGHV1-16 |
---|---|---|---|---|---|---|---|---|
sample_1WT | 0.00057491 | 0.0044843 | 0 | 0 | 0 | 0 | 0 | 0 |
sample_2WT | 0 | 0.0044843 | 0 | 0 | 0 | 0 | 0 | 0 |
sample_3WT | 0 | 0 | 0.01253306 | 0 | 0 | 0 | 0 | 0 |
sample_4MT | 0 | 0 | 0 | 0.00942854 | 0 | 0 | 0 | 0 |
sample_5WT | 0 | 0 | 0 | 0 | 0.01747729 | 0 | 0 | 0 |
sample_6WT | 0 | 0 | 0 | 0 | 0 | 0.00034495 | 0 | 0 |
sample_7MT | 0.00103484 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
sample_8WT | 0 | 0 | 0 | 0 | 0 | 0 | 0.01517765 | 0 |
sample_9MT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00758882 |
sample_10MT | 0 | 0 | 0 | 0.00827872 | 0 | 0 | 0 | 0 |
sample_11MT | 0 | 0 | 0 | 0 | 0.04679775 | 0 | 0 | 0 |
Should I iterate over each row and append the values into a new dataframe?
Thanks
asked Feb 4 at 19:16 by user5029313
2 Answers
You can simply use xtabs if you don't mind getting a table rather than a data frame:
> t(xtabs(Gene_fraction ~ ., df))
Gene_name
Sample_name IGHV1-11 IGHV1-12 IGHV1-13 IGHV1-15 IGHV1-16 IGHV1-18
sample_1 0.00057491 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
sample_10 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00827872
sample_2 0.00000000 0.00448430 0.00000000 0.00000000 0.00000000 0.00000000
sample_3 0.00000000 0.00000000 0.00000000 0.01253306 0.00000000 0.00000000
sample_4 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00942854
sample_5 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
sample_6 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
sample_7 0.00103484 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
sample_8 0.00000000 0.00000000 0.01517765 0.00000000 0.00000000 0.00000000
sample_9 0.00000000 0.00000000 0.00000000 0.00000000 0.00758882 0.00000000
Gene_name
Sample_name IGHV1-19 IGHV1-2
sample_1 0.00000000 0.00000000
sample_10 0.00000000 0.00000000
sample_2 0.00000000 0.00000000
sample_3 0.00000000 0.00000000
sample_4 0.00000000 0.00000000
sample_5 0.01747729 0.00000000
sample_6 0.00000000 0.00034495
sample_7 0.00000000 0.00000000
sample_8 0.00000000 0.00000000
sample_9 0.00000000 0.00000000
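If a data frame is needed after all, the table can be converted. The following is a minimal sketch, assuming the column names from the question and a shortened three-row version of its data; the object names `xt` and `wide` are only illustrative.

```r
# Shortened, illustrative version of the question's data
df <- data.frame(
  Gene_name     = c("IGHV1-11", "IGHV1-12", "IGHV1-11"),
  Sample_name   = c("sample_1", "sample_2", "sample_7"),
  Gene_fraction = c(0.00057491, 0.0044843, 0.00103484)
)

# Cross-tabulate: sums Gene_fraction for every Gene_name x Sample_name pair,
# then transpose so that samples become rows
xt <- t(xtabs(Gene_fraction ~ Gene_name + Sample_name, data = df))

# as.data.frame() on a table would return the long format again, so call
# the matrix method directly to keep the wide layout
wide <- as.data.frame.matrix(xt)

# Move the sample names from the row names into a regular column
wide <- cbind(Sample_name = rownames(wide), wide)
rownames(wide) <- NULL

print(wide)
```

Two things to keep in mind: rows come out in alphabetical order (so sample_10 sorts before sample_2, as in the output above), and if a gene/sample pair occurred more than once, xtabs would sum its values.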
You can use tidyverse's pivot_wider, base R's reshape, or dcast from reshape2:
# reshape2
df3 <- reshape2::dcast(df, Sample_name ~ Gene_name, value.var = "Gene_fraction", fill = 0)
# tidyverse (pivot_wider comes from tidyr)
library(tidyverse)
df1 <- df %>%
  pivot_wider(names_from = Gene_name, values_from = Gene_fraction, values_fill = 0)
# Base R
df2 <- reshape(df, idvar = "Sample_name", timevar = "Gene_name", direction = "wide") # pivot to wide format
colnames(df2) <- sub("^Gene_fraction\\.", "", colnames(df2)) # strip the "Gene_fraction." prefix (dot escaped so it matches literally)
df2[is.na(df2)] <- 0 # gene/sample pairs that were absent become 0
Test data
df <- data.frame(
Gene_name = c("IGHV1-11", "IGHV1-12", "IGHV1-15", "IGHV1-18", "IGHV1-19",
"IGHV1-2", "IGHV1-11", "IGHV1-13", "IGHV1-16", "IGHV1-18"),
Sample_name = c("sample_1", "sample_2", "sample_3", "sample_4", "sample_5",
"sample_6", "sample_7", "sample_8", "sample_9", "sample_10"),
Gene_fraction = c(0.00057491, 0.0044843, 0.01253306, 0.00942854, 0.01747729,
0.00034495, 0.00103484, 0.01517765, 0.00758882, 0.00827872)
)
With the tidyr package, this will get you some of the way: df |> tidyr::pivot_wider(id_cols = Sample_name, names_from = "Gene_name", values_from = "Gene_fraction", values_fill = 0). Note that "-" is a special character, so you will either have to wrap your column names in backticks (`` ` ``) when referring to them or, if possible, replace the "-" with underscores. – L Tyrone, commented Feb 4 at 19:34
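Both options from the comment above can be sketched in base R. This is only an illustration: the two-row data frame `wide` is made up from values in the question, and `check.names = FALSE` is used so that `data.frame()` does not rewrite the hyphenated names itself.

```r
# Illustrative wide data frame with hyphenated column names
wide <- data.frame(
  Sample_name = c("sample_1", "sample_2"),
  `IGHV1-11`  = c(0.00057491, 0),
  `IGHV1-12`  = c(0, 0.0044843),
  check.names = FALSE
)

# Option 1: keep the hyphens and quote the names with backticks
wide$`IGHV1-11`
with(wide, `IGHV1-11` + `IGHV1-12`)

# Option 2: replace the hyphens with underscores so no quoting is needed
names(wide) <- gsub("-", "_", names(wide), fixed = TRUE)
wide$IGHV1_11
```

Without the backticks, R would parse an expression such as wide$IGHV1-11 as a subtraction rather than a column lookup.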