dplyr - Take data from one dataframe and make another dataframe in R - Stack Overflow

I have a table like this (below).

Gene_name	Sample_name	Gene_fraction
IGHV1-11	sample_1	0.00057491
IGHV1-12	sample_2	0.0044843
IGHV1-15	sample_3	0.01253306
IGHV1-18	sample_4	0.00942854
IGHV1-19	sample_5	0.01747729
IGHV1-2	sample_6	0.00034495
IGHV1-11	sample_7	0.00103484
IGHV1-13	sample_8	0.01517765
IGHV1-16	sample_9	0.00758882
IGHV1-18	sample_10	0.00827872

How can I transform the table above into a table like the one below in R?

Sample_name	IGHV1-11	IGHV1-12	IGHV1-15	IGHV1-18	IGHV1-19	IGHV1-2	IGHV1-13	IGHV1-16
sample_1WT	0.00057491	0.0044843	0	0	0	0	0	0
sample_2WT	0	0.0044843	0	0	0	0	0	0
sample_3WT	0	0	0.01253306	0	0	0	0	0
sample_4MT	0	0	0	0.00942854	0	0	0	0
sample_5WT	0	0	0	0	0.01747729	0	0	0
sample_6WT	0	0	0	0	0	0.00034495	0	0
sample_7MT	0.00103484	0	0	0	0	0	0	0
sample_8WT	0	0	0	0	0	0	0.01517765	0
sample_9MT	0	0	0	0	0	0	0	0.00758882
sample_10MT	0	0	0	0.00827872	0	0	0	0
sample_11MT	0	0	0	0	0.04679775	0	0	0

Should I iterate over each row and append the values into a new dataframe?

Thanks

asked Feb 4 at 19:16 by user5029313

What is the rule for appending either MT or WT to the sample_* values? Using the tidyr package, this will get you some of the way: df |> tidyr::pivot_wider(id_cols = Sample_name, names_from = "Gene_name", values_from = "Gene_fraction", values_fill = 0). Note that "-" is a special character, so you will either have to wrap your column names in backticks (`) when referring to them or, if possible, replace the "-" with underscores. – L Tyrone Commented Feb 4 at 19:34
This worked. Thank you. I added the WT and MT suffixes to the main df manually, so I don't need to add them programmatically. – user5029313 Commented Feb 4 at 20:34
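
To illustrate the point about "-" in column names, here is a minimal base R sketch. The small `wide` data frame is made up for illustration (it only mimics two columns of the pivoted result), and `renamed`, `first_gene` and `second_access` are hypothetical names, not part of the original post.

```r
# Small wide table with hyphenated gene names as column names (illustrative data only).
# check.names = FALSE keeps "IGHV1-11" as is instead of converting it to "IGHV1.11".
wide <- data.frame(
  Sample_name = c("sample_1", "sample_7"),
  `IGHV1-11` = c(0.00057491, 0.00103484),
  `IGHV1-12` = c(0, 0),
  check.names = FALSE
)

# Option 1: quote the non-syntactic column name with backticks
first_gene <- wide$`IGHV1-11`

# Option 2: replace the hyphens with underscores so no quoting is needed
renamed <- wide
names(renamed) <- gsub("-", "_", names(renamed), fixed = TRUE)
second_access <- renamed$IGHV1_11
```

Either option should work; renaming is probably more convenient if the columns are referenced often in later code.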

2 Answers

You can simply use xtabs if you don't mind getting a table rather than a data frame:

> t(xtabs(Gene_fraction ~ ., df))
           Gene_name
Sample_name   IGHV1-11   IGHV1-12   IGHV1-13   IGHV1-15   IGHV1-16   IGHV1-18
  sample_1  0.00057491 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
  sample_10 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00827872
  sample_2  0.00000000 0.00448430 0.00000000 0.00000000 0.00000000 0.00000000
  sample_3  0.00000000 0.00000000 0.00000000 0.01253306 0.00000000 0.00000000
  sample_4  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00942854
  sample_5  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
  sample_6  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
  sample_7  0.00103484 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
  sample_8  0.00000000 0.00000000 0.01517765 0.00000000 0.00000000 0.00000000
  sample_9  0.00000000 0.00000000 0.00000000 0.00000000 0.00758882 0.00000000
           Gene_name
Sample_name   IGHV1-19    IGHV1-2
  sample_1  0.00000000 0.00000000
  sample_10 0.00000000 0.00000000
  sample_2  0.00000000 0.00000000
  sample_3  0.00000000 0.00000000
  sample_4  0.00000000 0.00000000
  sample_5  0.01747729 0.00000000
  sample_6  0.00000000 0.00034495
  sample_7  0.00000000 0.00000000
  sample_8  0.00000000 0.00000000
  sample_9  0.00000000 0.00000000
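
If a data frame is needed after all, the table can probably be converted afterwards. The following is only a sketch on a three-row subset of the data; `tab` and `wide` are names chosen here, and `as.data.frame.matrix()` is used so that the table keeps its wide layout instead of being turned back into long format.

```r
# Three-row subset of the question's data (for illustration only)
df <- data.frame(
  Gene_name = c("IGHV1-11", "IGHV1-12", "IGHV1-11"),
  Sample_name = c("sample_1", "sample_2", "sample_7"),
  Gene_fraction = c(0.00057491, 0.0044843, 0.00103484)
)

# Cross-tabulate, then transpose so that samples are rows and genes are columns
tab <- t(xtabs(Gene_fraction ~ ., df))

# Convert the 2-d table to a data frame while keeping the wide layout
wide <- as.data.frame.matrix(tab)

# Move the sample names from the row names into a regular column
wide <- cbind(Sample_name = rownames(wide), wide)
rownames(wide) <- NULL
```

Note that xtabs sums the values when a sample/gene pair occurs more than once, which may or may not be what you want.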

You can use tidyverse's pivot_wider, base R's reshape, or dcast from reshape2:

# reshape2
df3 <- reshape2::dcast(df, Sample_name ~ Gene_name, value.var = "Gene_fraction", fill = 0)

# tidyverse
library(tidyverse)

df1 <- df %>%
  pivot_wider(names_from = Gene_name, values_from = Gene_fraction, values_fill = 0)

# Base R
df2 <- reshape(df, idvar = "Sample_name", timevar = "Gene_name", direction = "wide") # pivot to wide format
colnames(df2) <- sub("^Gene_fraction\\.", "", colnames(df2)) # remove the "Gene_fraction." prefix (the dot is escaped so it matches literally)
df2[is.na(df2)] <- 0 # sample/gene combinations without a value become 0

Test data

df <- data.frame(
  Gene_name = c("IGHV1-11", "IGHV1-12", "IGHV1-15", "IGHV1-18", "IGHV1-19", 
                "IGHV1-2", "IGHV1-11", "IGHV1-13", "IGHV1-16", "IGHV1-18"),
  Sample_name = c("sample_1", "sample_2", "sample_3", "sample_4", "sample_5", 
                  "sample_6", "sample_7", "sample_8", "sample_9", "sample_10"),
  Gene_fraction = c(0.00057491, 0.0044843, 0.01253306, 0.00942854, 0.01747729, 
                    0.00034495, 0.00103484, 0.01517765, 0.00758882, 0.00827872)
)
