r - Rstudio: binary list - check probability of previous value changed (or remained the same) - Stack Overflow

I am new to R. The vector I am analyzing is a sequence of items (a, a, b, a, b, b, b, a, etc.). I need to compute the probability of a (previous element) becoming b (next element), a becoming a, b becoming a, and b becoming b. The four results should form a matrix. Can you please advise how to do it? I read about the str_detect, duplicate_count, and str_count functions, but I do not know how to apply them to my task.

Thank you in advance for help and reply!

asked Nov 22, 2024 at 21:06 by Mike_R; edited Nov 23, 2024 at 5:24 by jpsmith

1 Answer

I'll use the following vector as an example:

(seed <- sample(.Machine$integer.max, 1))
#> [1] 2041751758
set.seed(seed) # for reproducibility

(x <- sample(c("a", "b"), 10, TRUE))
#>  [1] "b" "b" "a" "a" "a" "b" "b" "b" "a" "a"

The easiest way is to use table:

table(data.frame(from = x[-length(x)], to = x[-1]))
#>     to
#> from a b
#>    a 3 1
#>    b 2 3
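
The question asks for probabilities rather than counts. One way to get them, sketched here under the assumption that each row should sum to 1 (the probability of the next value given the previous one), is to divide every row of the count table by its total with prop.table. The vector is copied from the example output above, and the factor step is an addition of mine so the table stays 2 x 2 even if one value never appears.

```r
# Example vector copied from the output shown above
x <- c("b", "b", "a", "a", "a", "b", "b", "b", "a", "a")

# Fix the levels so the table stays 2 x 2 even if a state never occurs
states <- c("a", "b")
from <- factor(x[-length(x)], levels = states)
to   <- factor(x[-1], levels = states)

counts <- table(from = from, to = to)

# margin = 1 divides each row by its row total
probs <- prop.table(counts, margin = 1)
probs
# row a: 3/4 = 0.75 (to a), 1/4 = 0.25 (to b)
# row b: 2/5 = 0.40 (to a), 3/5 = 0.60 (to b)
```

If a value never occurs as a "from" element, its row total is 0 and that row comes out as NaN, so such rows would need separate handling.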

To do the same thing manually, use matrix, tabulate, and Boolean arithmetic. First, encode the four possible transitions as integers. In a vectorized solution, x[-length(x)] holds the "from" value of each transition and x[-1] holds the "to" value. If the "from" value is "a", add 0; if it is "b", add 1. If the "to" value is "a", add 0; if it is "b", add 2. Then add 1 to each sum so the codes run from 1 to 4.

tabulate counts the number of each integer value in a vector. There are four possibilities, so set tabulate's nbins argument to 4L.

Finally, put the results of tabulate in a matrix with 2 rows and 2 columns and set the names as desired. I set mine so that "a->" means a transition started with "a", and "->a" means the transition ended with "a".

matrix(tabulate((x[-length(x)] == "b") + 2L*(x[-1] == "b") + 1L, nbins = 4L),
       nrow = 2, ncol = 2,
       dimnames = list(c("a->", "b->"), c("->a", "->b")))
#>     ->a ->b
#> a->   3   1
#> b->   2   3
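
To make the encoding easier to follow, the one-liner can be unpacked into its intermediate steps. This is only an illustrative breakdown using the example vector above; the names from_b, to_b, code, and counts are mine, not part of the original answer.

```r
x <- c("b", "b", "a", "a", "a", "b", "b", "b", "a", "a")

from_b <- x[-length(x)] == "b"   # TRUE where a transition starts at "b"
to_b   <- x[-1] == "b"           # TRUE where a transition ends at "b"

# TRUE/FALSE are coerced to 1/0, so:
# a->a is 1, b->a is 2, a->b is 3, b->b is 4
code <- from_b + 2L * to_b + 1L
code
#> [1] 4 2 1 1 3 4 4 2 1

# Count how often each of the four codes occurs
counts <- tabulate(code, nbins = 4L)
counts
#> [1] 3 2 1 3
```

Because matrix fills column by column, these four counts land in the order a->a, b->a (first column), then a->b, b->b (second column), which matches the row and column names used above.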

Compare the performance of the two approaches with a larger vector:

x <- sample(c("a", "b"), 1e6, TRUE)

trans1 <- function(x) table(data.frame(from = x[-length(x)], to = x[-1]))

trans2 <- function(x) {
  x <- x == "b"
  matrix(tabulate(x[-length(x)] + 2L*x[-1] + 1L, nbins = 4L),
         nrow = 2, ncol = 2,
         dimnames = list(c("a->", "b->"), c("->a", "->b")))
}

microbenchmark::microbenchmark(
  table = trans1(x),
  tabulate = trans2(x)
)
#> Unit: milliseconds
#>      expr     min      lq      mean   median        uq      max neval cld
#>     table 84.2272 88.4737 102.27207 91.48075 123.30125 133.5508   100  a 
#>  tabulate 25.1709 26.7004  32.58225 27.66895  30.02525  68.6166   100   b
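
Timings like these depend on the machine and R version, so it can be worth confirming that both functions return the same counts before comparing their speed. A minimal check, assuming the two definitions above (repeated here so the snippet runs on its own) and an arbitrary seed and vector length of my choosing:

```r
trans1 <- function(x) table(data.frame(from = x[-length(x)], to = x[-1]))

trans2 <- function(x) {
  x <- x == "b"
  matrix(tabulate(x[-length(x)] + 2L*x[-1] + 1L, nbins = 4L),
         nrow = 2, ncol = 2,
         dimnames = list(c("a->", "b->"), c("->a", "->b")))
}

set.seed(1)  # arbitrary seed, only for repeatability
y <- sample(c("a", "b"), 1e4, TRUE)

# Both results are stored column by column, so compare the raw count vectors
same <- all(as.vector(trans1(y)) == as.vector(trans2(y)))
same  # should be TRUE if the two approaches agree
```

The comparison drops the dimnames on purpose, since the two functions label their rows and columns differently.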
