I have sorted start indices (included) and end indices (excluded) of intervals (obtained by using `searchsorted`), for instance:
import numpy as np
# Both arrays are of same size, and sorted.
# Size of arrays is number of intervals.
# Intervals do not overlap.
# interval indices: 0 1 2 3 4 5
interval_start_idxs = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])
An empty interval is identified when
interval_start_idxs[interval_idx] == interval_end_excl_idxs[interval_idx]-1
I would like to identify the start and end of each region where intervals are empty. A region consists of one or several consecutive intervals sharing the same start index and the same excluded end index.
With the data above, the expected result would be:
empty_interval_starts = [1, 4] # start is included
empty_intervals_ends_excl = [4, 5] # end is excluded
This result is to be understood as:
- the intervals at indices 1 to 3 form a single region of empty intervals
- the interval at index 4 is a separate region on its own
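To make the expected result concrete, the definition above can be spelled out as a plain loop. This is only a minimal reference sketch: the helper name `empty_regions_reference` is hypothetical and not part of the original question, and it groups consecutive intervals by their (start, excluded end) pair.

```python
from itertools import groupby

import numpy as np


def empty_regions_reference(starts, ends_excl):
    """Loop-based reference; hypothetical helper, not from the original post."""
    region_starts, region_ends_excl = [], []
    # Group consecutive intervals that share the same (start, end_excl) pair.
    pairs = zip(starts.tolist(), ends_excl.tolist())
    idx = 0
    for (start, end_excl), group in groupby(pairs):
        size = len(list(group))
        # "Empty" as defined in the question: start == end_excl - 1.
        if start == end_excl - 1:
            region_starts.append(idx)
            region_ends_excl.append(idx + size)
        idx += size
    return region_starts, region_ends_excl


interval_start_idxs = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])
print(empty_regions_reference(interval_start_idxs, interval_end_excl_idxs))
# ([1, 4], [4, 5])
```

A loop like this is slow for large arrays, but it can serve as a check for a vectorized solution.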
1 Answer
import numpy as np
interval_start_idxs = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])
is_region_start = np.r_[True, np.diff(interval_start_idxs) != 0]
is_region_end = np.roll(is_region_start, -1)
is_empty = (interval_start_idxs == interval_end_excl_idxs - 1)
empty_interval_starts = np.nonzero(is_region_start & is_empty)[0]
empty_interval_ends_excl = np.nonzero(is_region_end & is_empty)[0] + 1
Explanation:
- `is_region_start` marks the starts of all potential regions, i.e. indices where the current start index differs from its predecessor
- the end of a potential region is right before the start of a new region, which is why we roll back all markers in `is_region_start` by one to get `is_region_end`; the rollover in the roll-back from index 0 to index -1 works in our favor here: the marker previously at index 0, which is always `True`, used to mark the start of the first potential region in `is_region_start` and now marks the end of the last potential region in `is_region_end`
- `is_empty` marks all indices that are actually empty, according to your definition
- `empty_interval_starts` is the combination of two criteria: start of a potential region and actually being empty (since `np.nonzero()` returns a tuple of arrays, we need to extract the first element, `…[0]`, to get to the actual array of indices)
- `empty_interval_ends_excl`, likewise, is the combination of two criteria: end of a potential region and actually being empty; however, since `empty_interval_ends_excl` should be exclusive, we need to add 1 to get the final result
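To see how the masks line up, it may help to print the intermediate arrays for the example data. The sketch below only repeats the computation from the answer and shows each boolean mask as 0/1:

```python
import numpy as np

interval_start_idxs = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])

is_region_start = np.r_[True, np.diff(interval_start_idxs) != 0]
is_region_end = np.roll(is_region_start, -1)
is_empty = interval_start_idxs == interval_end_excl_idxs - 1

# interval indices:                                 0 1 2 3 4 5
print(is_region_start.astype(int))               # [1 1 0 0 1 1]
print(is_region_end.astype(int))                 # [1 0 0 1 1 1]
print(is_empty.astype(int))                      # [0 1 1 1 1 0]
print((is_region_start & is_empty).astype(int))  # [0 1 0 0 1 0]
print((is_region_end & is_empty).astype(int))    # [0 0 0 1 1 0]
```

The set positions in the last two masks are 1, 4 and 3, 4; adding 1 to the latter gives the exclusive ends 4, 5.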
At present, the results (`empty_interval_starts` and `empty_interval_ends_excl`) are NumPy arrays. If you prefer them as lists, as written in the question, you might want to convert them with `empty_interval_starts.tolist()` and `empty_interval_ends_excl.tolist()`, respectively.
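If the computation is needed in several places, it could be wrapped in a small function that returns lists directly. The following is a sketch under a few assumptions: the name `find_empty_regions` is hypothetical, the early return for empty input is an addition, and `np.r_[is_region_start[1:], True]` is used as an equivalent to `np.roll(is_region_start, -1)` that avoids the wrap-around. As in the code above, regions are detected from changes in the start indices only, so intervals with equal starts are assumed to have equal ends.

```python
import numpy as np


def find_empty_regions(starts, ends_excl):
    """Hypothetical wrapper around the snippet above; returns plain lists."""
    starts = np.asarray(starts)
    ends_excl = np.asarray(ends_excl)
    if starts.size == 0:
        return [], []
    is_region_start = np.r_[True, np.diff(starts) != 0]
    # Equivalent to np.roll(is_region_start, -1), since element 0 is always True.
    is_region_end = np.r_[is_region_start[1:], True]
    # "Empty" as defined in the question: start == end_excl - 1.
    is_empty = starts == ends_excl - 1
    region_starts = np.nonzero(is_region_start & is_empty)[0]
    region_ends_excl = np.nonzero(is_region_end & is_empty)[0] + 1
    return region_starts.tolist(), region_ends_excl.tolist()


print(find_empty_regions([0, 3, 3, 3, 6, 7], [2, 4, 4, 4, 7, 9]))
# ([1, 4], [4, 5])
```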
start == end. I expect a "3 to 4" range to be a range including 3 and excluding 4, as you seem to indicate just before in the sentence "start indices (included) and end indices (excluded)". Besides, can you explain more precisely how you arrive at the final result for the example? – Jérôme Richard Commented Jan 24 at 8:59