admin管理员组文章数量:1313272
I have a very messy date column like this in excel:
For example, the row which the row_id=23 in excel is input as dd/mm/yy format, so it should be read as 2023-09-07 when I import this worksheet into R, but now, when I use rio::read to import this dataset into R, R automatically convert it as numeric format as "45116", and when I use this code :
df_dl2 <- df_dl %>%
mutate(across(c(`specimen taken`, `specimen received at NHL`),
function(x) as_datetime(ifelse(grepl("^[0-9\\.]+$", x),
as.numeric(x) |> as_date(origin = "1899-12-30") |> as_datetime(),
as_datetime(x, format = "%d-%m-%Y"))),
.names = "cleaned_{.col}"))
to convert it back to the dd/mm/yy format, it shows as 2023-07-09 instead of the correct one:2023-09-07.
It happens on all those numerical stored date.They are all input in excel as 1/8/23, 1/9/23... they should be all read as dd/mm/yy but excel and r both think they are mm/dd/yy format and convert it as this rule.
I am confused from where I can tell either excel or R that this is not U.S. date format mm/dd/yy. It looks like when the dataset was read by r, it has been converted as numerical format date as mm/dd/yy. How can I get the correct convert in R?
I have a very messy date column like this in excel:
For example, the row which the row_id=23 in excel is input as dd/mm/yy format, so it should be read as 2023-09-07 when I import this worksheet into R, but now, when I use rio::read to import this dataset into R, R automatically convert it as numeric format as "45116", and when I use this code :
df_dl2 <- df_dl %>%
mutate(across(c(`specimen taken`, `specimen received at NHL`),
function(x) as_datetime(ifelse(grepl("^[0-9\\.]+$", x),
as.numeric(x) |> as_date(origin = "1899-12-30") |> as_datetime(),
as_datetime(x, format = "%d-%m-%Y"))),
.names = "cleaned_{.col}"))
to convert it back to the dd/mm/yy format, it shows as 2023-07-09 instead of the correct one:2023-09-07.
It happens on all those numerical stored date.They are all input in excel as 1/8/23, 1/9/23... they should be all read as dd/mm/yy but excel and r both think they are mm/dd/yy format and convert it as this rule.
I am confused from where I can tell either excel or R that this is not U.S. date format mm/dd/yy. It looks like when the dataset was read by r, it has been converted as numerical format date as mm/dd/yy. How can I get the correct convert in R?
Share Improve this question edited Jan 31 at 15:00 ThomasIsCoding 103k9 gold badges36 silver badges101 bronze badges asked Jan 31 at 14:33 RstudyerRstudyer 4772 silver badges9 bronze badges 2- 1 Could you edit your question so others can reproduce this issue? Images of data are highly discouraged for these reasons. Good luck! – jpsmith Commented Jan 31 at 14:43
- 1 it would be better if you could provide a simplistic example, where you can provide a string array with different expressions of dates, and provide the corresponding desired output as well – ThomasIsCoding Commented Jan 31 at 14:45
1 Answer
Reset to default 1Probably you can find some interesting clues here
> as.Date(rowSums(sapply(fmts, as.Date, x = s), TRUE), origin = "1970-01-01")
[1] "2023-08-14" "2023-09-01" "2023-08-17" "2023-09-07"
or using coalesce
> do.call(dplyr::coalesce, lapply(fmts, as.Date, x = s))
[1] "2023-08-14" "2023-09-01" "2023-08-17" "2023-09-07"
Data
> s <- c("14-08-2023", "1/9/23", "17-08-2023", "7/9/23")
> fmts <- c("%d/%m/%y", "%d-%m-%Y")
本文标签: rmixed date converted from default mmddyy in excel to ddmmyyStack Overflow
版权声明:本文标题:r - mixed date converted from default mmddyy in excel to ddmmyy - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741913582a2404588.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论