python - How to get start indices of regions of empty intervals? - Stack Overflow

IT技术

更新时间：2025-02-021

admin管理员组
文章数量:1192555

I have sorted start indices (included) and end indices (excluded) of intervals (obtained by using seachsorted), for instance:

import numpy as np

# Both arrays are of same size, and sorted.
# Size of arrays is number of intervals.
# Intervals do not overlap.

# interval indices:                0  1  2  3  4  5
interval_start_idxs    = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])

An empty interval is identified when

interval_start_idxs[interval_idx] == interval_end_excl_idxs[interval_idx]-1

I would like to identify the starts and ends of each region where intervals are empty. A region is made with one or several intervals sharing the same start indices and end excluded indices.

With previous data, expected result would then be:

empty_interval_starts     = [1, 4] # start is included
empty_intervals_ends_excl = [4, 5] # end is excluded

This result is to be understood as:

intervals from index 1 to 3, these intervals are a same region of empty intervals
and interval at index 4 is a separate region on its own

I have sorted start indices (included) and end indices (excluded) of intervals (obtained by using seachsorted), for instance:

import numpy as np

# Both arrays are of same size, and sorted.
# Size of arrays is number of intervals.
# Intervals do not overlap.

# interval indices:                0  1  2  3  4  5
interval_start_idxs    = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])

An empty interval is identified when

interval_start_idxs[interval_idx] == interval_end_excl_idxs[interval_idx]-1

I would like to identify the starts and ends of each region where intervals are empty. A region is made with one or several intervals sharing the same start indices and end excluded indices.

With previous data, expected result would then be:

empty_interval_starts     = [1, 4] # start is included
empty_intervals_ends_excl = [4, 5] # end is excluded

This result is to be understood as:

intervals from index 1 to 3, these intervals are a same region of empty intervals
and interval at index 4 is a separate region on its own

Share Improve this question edited Jan 24 at 12:57 asked Jan 24 at 8:34 pierre_j 9832 gold badges15 silver badges33 bronze badges

1 Considering the start index is included and the other is excluded, this is a weird definition of "empty" intervals. It is usually start == end. I expect a "3 to 4" range to be a range including 3 and excluding 4 as you seems to indicate just before in the sentence "start indices (included) and end indices (excluded)". Besides, can you explain more precisely how you end up to the final result for the example? – Jérôme Richard Commented Jan 24 at 8:59
Hi Jérôme Richard, I am not sure to understand your question. The final result is the one I am looking for. Because intervals 1, 2, 3 are empty, and they share the same start (included & end (excluded), they are a same region, noted in the expected results from indices 1 to 4 (4 is excluded). If there is a single empty interval (like interval at index 4), then it is a region in itself and its end (excluded as well) is 5. – pierre_j Commented Jan 24 at 12:53
1 Jérôme Richard, I have tried to clarify this in the main question: - identifying clearly what interval indices are in the input data, and adding explanations about the results. – pierre_j Commented Jan 24 at 12:59

Add a comment |

1 Answer 1

Sorted by: Reset to default 2

import numpy as np

interval_start_idxs    = np.array([0, 3, 3, 3, 6, 7])
interval_end_excl_idxs = np.array([2, 4, 4, 4, 7, 9])

is_region_start = np.r_[True, np.diff(interval_start_idxs) != 0]
is_region_end = np.roll(is_region_start, -1)
is_empty = (interval_start_idxs == interval_end_excl_idxs - 1)

empty_interval_starts = np.nonzero(is_region_start & is_empty)[0]
empty_interval_ends_excl = np.nonzero(is_region_end & is_empty)[0] + 1

Explanation:

is_region_start marks the starts of all potential regions, i.e. indices where the current index differs from its predecessor
the index of the end of a potential region is right before the start of a new region, which is why we roll back all markers in is_region_start by one to get is_region_end; the rollover in the roll-back from index 0 to index -1 works in our favor here: the marker, previously at index 0, which is always True, used to mark the start of the first potential region in is_region_start and now marks the end of the last potential region in is_region_end
is_empty marks all indices that are actually empty, according to your definition
empty_interval_starts is the combination of two criteria: start of a potential region and actually being empty (since np.nonzero() returns tuples, we need to extract the first element, …[0], to get to the actual array of indices)
empty_interval_ends_excl, likewise, is the combination of two criteria: end of a potential region and actually being empty; however, since empty_interval_ends_excl should be exclusive, we need to add 1 to get the final result

At present, the results (empty_interval_starts and empty_interval_ends_excl) are Numpy arrays. If you prefer them as lists, as written in the question, you might want to convert them with empty_interval_starts.tolist() and empty_interval_ends_excl.tolist(), respectively.

本文标签： pythonHow to get start indices of regions of empty intervalsStack Overflow

版权声明：本文标题：python - How to get start indices of regions of empty intervals? - Stack Overflow 内容由网友自发贡献，该文观点仅代表作者本人，转载请联系作者并注明出处：http://www.betaflare.com/web/1738436984a2086732.html，本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容，一经查实，本站将立刻删除。

编程频道|软件玩家 - 软件改变生活！

python - How to get start indices of regions of empty intervals? - Stack Overflow

1 Answer 1

更多相关文章