In a new variable row2, how can I repeat a sequential numbering (here the sequence 3 to 6) across groups of duplicated row1 values, starting from a given value (here from row1 = 3), even if the last sequence is incomplete (here 3 to 5, for example)?
Thanks for your help.
Desired output:
> dat1
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3 # start the sequence
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3 # repeat the sequence
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3 # and repeat again...
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5 # ...even if incomplete
Initial data:
row1 <- c(1,1,2,
3,4,4,4,5,5,6,6,6,
7,7,8,8,9,9,9,10,
11,11,11,12,13,13)
dat1 <- data.frame(row1)
asked Feb 23 at 8:52 by denis
3 Answers
You could use if_else (from dplyr) to apply a modulo to the values >= 3:
- (row1 - 3) %% 4 cycles through 0, 1, 2, 3, effectively mapping the row1 values onto 3, 4, 5, 6 repeatedly.
- + 3 shifts the sequence so that it starts at 3.
- Values of row1 < 3 are left untouched.
library(dplyr)  # provides if_else()
dat1$row2 <- if_else(dat1$row1 >= 3, (dat1$row1 - 3) %% 4 + 3, dat1$row1)
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
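To make the arithmetic easier to check, here is a minimal base R sketch of the same mapping. It swaps in ifelse() for dplyr's if_else(), and the intermediate name offset is only illustrative. Note that the modulo is applied to the row1 values themselves, so this assumes they are consecutive integers, as in the example data.

```r
# Example data from the question
row1 <- c(1,1,2,
          3,4,4,4,5,5,6,6,6,
          7,7,8,8,9,9,9,10,
          11,11,11,12,13,13)
dat1 <- data.frame(row1)

# (row1 - 3) %% 4 gives 0, 1, 2, 3, 0, 1, ... for row1 = 3, 4, 5, 6, 7, 8, ...
offset <- (dat1$row1 - 3) %% 4

# Keep values below 3; otherwise shift the cycle so that it starts at 3
dat1$row2 <- ifelse(dat1$row1 < 3, dat1$row1, offset + 3)

# One line per distinct row1 value, to inspect the mapping
unique(dat1)
```

If row1 could skip values (say 3, 5, 8), the cycle would follow the values rather than the groups, so a position-based approach would be safer in that case.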
We can do it like this:
- For each row, if the value in row1 is less than 3, row2 is set equal to row1.
- For rows where row1 is 3 or greater, we compute a cyclic sequence:
- Extracting and filtering unique values: sort(unique(row1)) %>% .[. >= 3] keeps only the sorted unique values that are greater than or equal to 3.
- Matching to get the group position: match(row1, ...) returns the position of each row's value within that filtered vector.
- Zero-indexing the position: ( ... - 1)
- Cycling the sequence with the modulo operator: %% 4
- Finally, shifting to the desired start value: + 3
library(dplyr)
dat1 %>%
  mutate(row2 = if_else(
    row1 < 3,
    row1,
    (match(row1, sort(unique(row1)) %>% .[. >= 3]) - 1) %% 4 + 3
  ))
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
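The steps of this answer can be unrolled into a small base R helper, which may make each intermediate result easier to inspect. This is only a sketch: cycle_by_position and its arguments start and n are hypothetical names, not part of the answer, and ifelse() stands in for if_else(). Here start doubles as the threshold on row1 and as the first value of the cycle, as in the question.

```r
# Hypothetical helper: cycle start, start + 1, ..., start + n - 1 over the
# distinct values of x that are >= start; smaller values are kept unchanged.
cycle_by_position <- function(x, start = 3, n = 4) {
  # Step 1: sorted distinct values that take part in the cycle
  vals <- sort(unique(x))
  vals <- vals[vals >= start]
  # Step 2: position of each element's value within vals (NA if x < start)
  pos <- match(x, vals)
  # Steps 3-5: zero-index, wrap every n groups, shift to the start value
  ifelse(x < start, x, (pos - 1) %% n + start)
}

row1 <- c(1,1,2,
          3,4,4,4,5,5,6,6,6,
          7,7,8,8,9,9,9,10,
          11,11,11,12,13,13)
row2 <- cycle_by_position(row1)

# Because positions are used instead of raw values, gaps do not break the cycle
gappy <- cycle_by_position(c(1, 3, 3, 5, 8, 8, 9, 20))
```

With the gappy input, the distinct values 3, 5, 8, 9, 20 sit at positions 1 to 5, so they are labelled 3, 4, 5, 6, 3.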
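You might want to write a more concise version, starting from this base R approach with rle():
dat1 |>
  transform(row2 = {
    i = row1 < 3  # rows that keep their row1 value
    # label each run of equal values >= 3 with 3:6 (recycled), then expand by run length
    c(row1[i], with(rle(row1[!i]), rep(rep(3:6, length.out = length(lengths)), lengths)))
  })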
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
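For readers unfamiliar with rle(), the intermediate objects behind this one-liner can be looked at one by one. The sketch below uses illustrative names (r, labels) that are not in the answer. It assumes, as the one-liner does, that the rows with row1 < 3 come first and that equal values are adjacent.

```r
row1 <- c(1,1,2,
          3,4,4,4,5,5,6,6,6,
          7,7,8,8,9,9,9,10,
          11,11,11,12,13,13)

i <- row1 < 3            # rows that keep their row1 value
r <- rle(row1[!i])       # run-length encoding of the remaining rows

r$values                 # 3 4 5 6 7 8 9 10 11 12 13
r$lengths                # 1 3 2 3 2 2 3 1 3 1 2

# One label per run: 3:6 recycled to the number of runs (the last cycle may be incomplete)
labels <- rep(3:6, length.out = length(r$lengths))

# Expand each label to the length of its run and prepend the untouched values
row2 <- c(row1[i], rep(labels, r$lengths))
```

Since rle() counts runs rather than values, this variant also cycles by group position, but it would start a new group if the same value reappeared later in the vector.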
If you would like to apply this to the data from your previous question, we can wrap the operations that depend on the group pdf in a single tapply() (or by()) call, e.g.
tapply(dat0, ~pdf, \(x) {
x$row1 = with(rle(x$row0), rep(seq_along(values), lengths))
x$row2 = c(x$row1[x$row1 < 3], with(rle(x$row1[!x$row1 < 3]), rep(rep(3:6, length.out=length(lengths)), lengths)))
x
}) |> do.call(what='rbind') |> `row.names<-`(NULL) # cosmetics
if this
pdf page row0 row1 row2
1 x 3 5 1 1
2 x 3 5 1 1
3 x 3 5 1 1
4 x 3 5 1 1
5 x 3 6 2 2
6 x 3 6 2 2
7 x 3 6 2 2
8 x 3 7 3 3
9 x 3 7 3 3
10 x 4 1 4 4
11 x 4 1 4 4
12 x 4 1 4 4
13 x 4 2 5 5
14 x 4 2 5 5
15 x 4 2 5 5
16 x 4 2 5 5
17 x 4 3 6 6
18 y 6 2 1 1
19 y 6 2 1 1
20 y 6 3 2 2
21 y 6 3 2 2
22 y 6 3 2 2
23 y 6 4 3 3
24 y 6 4 3 3
25 y 7 1 4 4
26 y 7 1 4 4
27 y 7 1 4 4
28 y 7 1 4 4
29 y 7 2 5 5
30 y 7 2 5 5
31 y 7 2 5 5
32 y 7 3 6 6
33 y 8 1 7 3
34 y 8 1 7 3
35 y 8 2 8 4
is the desired result. Have a look at rows 32-35. (It might be better to rename row0-3 to col0-3.)
The first anonymous function is very useful. We can wrap it in a custom function:
consecutive_id = \(x) with(rle(x), rep(seq_along(values), lengths))
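As a quick usage sketch, here is the helper applied to the row0 values of group x from the table above; it should reproduce the row1 column for that group. The vector is copied from the table, and the \(x) shorthand needs R 4.1 or later. Recent dplyr versions (1.1.0 and later, as far as I know) also export a function called consecutive_id(), so a different name may be preferable when dplyr is attached.

```r
# Number consecutive runs of equal values 1, 2, 3, ...
consecutive_id = \(x) with(rle(x), rep(seq_along(values), lengths))

# row0 values for pdf == "x", taken from the table above
row0 <- c(5, 5, 5, 5, 6, 6, 6, 7, 7, 1, 1, 1, 2, 2, 2, 2, 3)

row1 <- consecutive_id(row0)
row1
# 1 1 1 1 2 2 2 3 3 4 4 4 5 5 5 5 6
```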