I am trying to write a function that will take a column from a dataframe, and replace some commas with semicolons. Taking the example of a value below (i.e., 1 cell from a dataframe's column)
Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)
Turned into
Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)
I've created the function below, but 1) it does not like the current regex (I keep getting errors) and 2) it replaces all values with NA (some values are already NA, and I'd like to keep them as such, replacing commas only in cells that contain data).
semicolon <- function(x) {
  as.double(gsub(",(?=[A-Z])",";", x))
}
asked Mar 27 at 18:25 by Kayla Schou
1 Answer
You need to use a lookbehind to find commas after close parens ("(?<=\\))"); your current approach ("(?=[A-Z])") uses a lookahead to find commas immediately before a capital letter, and in your example every comma is followed by a space, so it would match nothing. You also need to set perl = TRUE to use lookarounds. And it's not clear why you're including as.double(), which will try to coerce the character string to a number and return NA. So:
semicolon <- function(x) {
  gsub("(?<=\\)),", ";", x, perl = TRUE)
}
x <- "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)"
semicolon(x)
# "Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)"
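The three problems with the original attempt can each be seen in isolation. This is only a sketch using the example string from the question; the variable name res is just for illustration.

```r
x <- "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)"

# 1) Without perl = TRUE, the default regex engine rejects lookarounds,
#    so the call fails with an "invalid regular expression" error.
res <- suppressWarnings(try(gsub(",(?=[A-Z])", ";", x), silent = TRUE))
inherits(res, "try-error")
# [1] TRUE

# 2) With perl = TRUE the pattern compiles, but it matches nothing here:
#    every comma is followed by a space, never directly by a capital letter.
identical(gsub(",(?=[A-Z])", ";", x, perl = TRUE), x)
# [1] TRUE

# 3) as.double() cannot parse this text as a number, so it returns NA
#    (the coercion warning is suppressed here).
suppressWarnings(as.double(x))
# [1] NA
```

If you prefer to keep a lookahead, allowing for the space, as in ",(?= [A-Z])" with perl = TRUE, also gives the desired result on this example.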
Edit: As @TimG pointed out in a comment, you don't need the lookbehind if you include the close paren in the replacement:
semicolon <- function(x) gsub("\\),", ");", x)
semicolon(x)
# "Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)"
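On the NA concern from the question: gsub() leaves missing values as NA, so the function can be applied to a whole column directly. A minimal sketch, assuming a hypothetical data frame df with a character column named colors (neither name comes from the question):

```r
semicolon <- function(x) gsub("\\),", ");", x)

# Hypothetical example data; `df` and `colors` are made-up names.
df <- data.frame(
  colors = c("Orange (carrot, orange), Blue (sky, ball)",
             NA,
             "Yellow (lemon, boots)"),
  stringsAsFactors = FALSE
)

# Overwrite the column in place; NA entries pass through untouched.
df$colors <- semicolon(df$colors)

df$colors[1]
# [1] "Orange (carrot, orange); Blue (sky, ball)"
is.na(df$colors)
# [1] FALSE  TRUE FALSE
```

The third value has no "),", so it comes back unchanged.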
as.double()? Also in this case, no negative lookaheads are needed, you can also use gsub(")\\, ", ")\\; ", "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)")
– Tim G, commented Mar 27 at 18:59