Given a NumPy array such as [1,2,3,4,5,6], I'm trying to replace every 4 with 2,2 and every 6 with 2,3, so that the result is [1,2,3,2,2,5,2,3]. Is there an efficient way to do this?

I tried np.select without much success and couldn't find much else.

Asked Feb 13 at 10:25 by Matias Bay Rodriguez; edited Feb 13 at 11:05 by ThomasIsCoding.

Comments:
  • Efficiency is a bit subjective. How many items do you have? How many unique items? How many items to replace? Is the replacement size always 2 items? How many groups of consecutive non-match / match do you expect? – mozway, Feb 13 at 10:34
  • Do you always replace with 2 elements? Can it be more than 3? – ken, Feb 13 at 12:30

8 Answers

With NumPy, a simple if/else loop can collect the pieces in a list, and np.concatenate can then join them into a new array (see the np.concatenate documentation):

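import numpy as np

original_array = np.array([1, 2, 3, 4, 5, 6])
expanded_list = []  # one small array per input element

for element in original_array:
    if element == 4:
        expanded_list.append(np.array([2, 2]))  # 4 -> 2, 2
    elif element == 6:
        expanded_list.append(np.array([2, 3]))  # 6 -> 2, 3
    else:
        expanded_list.append(np.array([element]))  # keep as is

new_array = np.concatenate(expanded_list)
# array([1, 2, 3, 2, 2, 5, 2, 3])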

There is no straightforward way to achieve this in pure numpy.

Assuming the replacements are always of the same size, you could identify the positions to replace, expand the array with np.repeat, and assign the new values through a mask:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 6, 8, 4])

d = {4: [20, 20], 6: [21, 31]}
m = np.isin(a, list(d))
# array([False, False, False,  True, False,  True, False,  True, False,  True])

uniq, idx = np.unique(a[m], return_inverse=True)
# array([4, 6]), array([0, 1, 1, 0])

rep = np.ones_like(a)
rep[m] = 2
# array([1, 1, 1, 2, 1, 2, 1, 2, 1, 2])

vals = np.array([d[x] for x in uniq])[idx].ravel()
# array([20, 20, 21, 31, 21, 31, 20, 20])

out = np.repeat(a, rep)
out[np.repeat(m, rep)] = vals

NB. you could generalize this approach to handle a different number of replacement values per original value, but this would become a bit more complex, and probably no longer efficient.

Output (for the example above, which differs slightly from the one in the question):

array([ 1,  2,  3, 20, 20,  5, 21, 31,  7, 21, 31,  8, 20, 20])
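The generalization mentioned in the NB can be sketched as follows. This is only one possible way to do it, not necessarily what the answer had in mind: the function name replace_variable is made up for illustration, and the replacement values are still gathered with a Python loop over the matched items, so it is only partly vectorized.

```python
import numpy as np

def replace_variable(a, d):
    """Replace each value of `a` found in `d` by d[value].

    Hypothetical helper: replacement lists may have different lengths.
    """
    a = np.asarray(a)
    keys = np.array(sorted(d))                  # sorted, so searchsorted is valid
    lens = np.array([len(d[k]) for k in keys.tolist()])
    m = np.isin(a, keys)                        # True where a value is replaced
    if not m.any():
        return a.copy()
    idx = np.searchsorted(keys, a[m])           # index of each match in `keys`
    rep = np.ones(a.shape, dtype=np.intp)       # 1 slot for untouched items
    rep[m] = lens[idx]                          # n slots for replaced items
    out = np.repeat(a, rep)
    # Not vectorized: concatenate the replacement of every matched item in order
    vals = np.concatenate([d[k] for k in a[m].tolist()])
    out[np.repeat(m, rep)] = vals
    return out

a = np.array([1, 2, 3, 4, 5, 6, 7, 6, 8, 4])
d = {4: [20, 20], 6: [21, 31, 41]}
out = replace_variable(a, d)
# array([ 1,  2,  3, 20, 20,  5, 21, 31, 41,  7, 21, 31, 41,  8, 20, 20])
```

The only change from the fixed-size version is that rep holds a per-item length instead of a constant 2.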

For a pure Python approach, which is easier to understand and can handle an arbitrary number of items as replacement (but is potentially slower):

import numpy as np
from itertools import chain

a = np.array([1, 2, 3, 4, 5, 6, 7, 6, 8, 4])

d = {4: [20, 20], 6: [21, 31]}

# unchanged items are wrapped in a 1-tuple so everything can be chained
out = np.fromiter(chain.from_iterable(d.get(x, (x,)) for x in a),
                  dtype=a.dtype)

Replacing one element with two changes the length of the array (and shifts every later index), so np.select cannot do it. Instead, I recommend building a dict whose keys are the numbers you want to change; for the example you provided that is replacement = {4: [2, 2], 6: [2, 3]}. Then iterate over the array and map each number to its replacement list, wrapping unchanged numbers in a one-element list. This gives a list of lists, [[1], [2], [3], [2, 2], [5], [2, 3]], which np.concatenate then flattens.

The final result is:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
replacement = {4: [2, 2], 6: [2, 3]}
result = np.concatenate([replacement.get(x, [x]) for x in arr])

This outputs [1 2 3 2 2 5 2 3].
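To illustrate why np.select falls short here, the small sketch below (variable names are illustrative) shows that its result always has the same shape as the input, so each element can be swapped for at most one value:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# np.select picks one value per position, so 4 and 6 can each
# become a single number but never a pair of numbers.
selected = np.select([arr == 4, arr == 6], [2, 3], default=arr)

print(selected)        # [1 2 3 2 5 3]
print(selected.shape)  # (6,), same length as arr
```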

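If efficiency is not a concern, you can exploit integer division (//) inside a list comprehension: since 4 // 2 == 2 and 6 // 2 == 3, both replacements have the form [2, v // 2]. Note that this trick only works for these particular values.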

import itertools
x = [1, 2, 3, 4, 5, 6]
list(itertools.chain(*[[2, v//2] if v in [4,6] else [v] for v in x]))

which gives

[1, 2, 3, 2, 2, 5, 2, 3]

Another approach is to leverage numba:

import numpy as np
import numba as nb
from numba import types
from numba.typed import Dict

a = np.array([1, 2, 3, 4, 5, 6], dtype=np.int64)

# Create a numba-typed dictionary
d_typed = Dict.empty(
    key_type=types.int64,
    value_type=types.Array(types.int64, 1, 'C')
)

d_typed[4] = np.array([2, 2], dtype=np.int64)
d_typed[6] = np.array([2, 3], dtype=np.int64)

@nb.njit()
def replace_nums(b, d):
    x = []
    for i in range(len(b)):
        key = b[i]
        if key in d:
            x.append(d[key])
        else:
            x.append(np.array([key], dtype=np.int64)) 
    return x


np.concatenate(replace_nums(a, d_typed))

The steps are:

  • It initializes a numba-typed dictionary d_typed with integer keys and values that are arrays of integers.

  • The dictionary d_typed is populated with arrays for specific keys.

  • A function replace_nums is defined using the njit decorator from Numba for just-in-time compilation.

  • The function iterates through each element in b, checking if the element exists as a key in d. If it does, the corresponding array from d is appended to the list x; otherwise, a new array containing the element is appended.

  • Finally, the function returns the list x, and np.concatenate is used to flatten the list of arrays into a single array.

NB: Preallocating the final array leads to an extra performance improvement, although the above code is already fast.

Output:

array([1, 2, 3, 2, 2, 5, 2, 3])
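The preallocation mentioned in the NB could look roughly like the two-pass sketch below. It is written in plain Python and NumPy so that it runs without numba, and the name replace_prealloc is hypothetical. Whether this loop compiles unchanged under njit with a typed dictionary, and how much time it saves, has not been checked here.

```python
import numpy as np

def replace_prealloc(a, d):
    """Two-pass replacement that writes into a preallocated output array."""
    values = a.tolist()  # plain Python ints, usable as dict keys

    # Pass 1: compute the final length
    total = sum(len(d[v]) if v in d else 1 for v in values)
    out = np.empty(total, dtype=a.dtype)

    # Pass 2: fill the output in place
    pos = 0
    for v in values:
        if v in d:
            r = d[v]
            out[pos:pos + len(r)] = r
            pos += len(r)
        else:
            out[pos] = v
            pos += 1
    return out

a = np.array([1, 2, 3, 4, 5, 6], dtype=np.int64)
d = {4: np.array([2, 2], dtype=np.int64), 6: np.array([2, 3], dtype=np.int64)}
result = replace_prealloc(a, d)
# array([1, 2, 3, 2, 2, 5, 2, 3])
```

Compared with appending small arrays to a list, this avoids creating one temporary array per element and the final np.concatenate.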

There are easy ways to do this with list comprehension:

def replace_vals(list_to_use, replacements):
    return [element for x in list_to_use for element in (replacements[x] if x in replacements else [x])]

# example
list_to_use = [1, 2, 3, 4, 5, 6]
replacements = {4: [2, 2], 6: [2, 3]}

result = replace_vals(list_to_use, replacements)
print(result)  # [1, 2, 3, 2, 2, 5, 2, 3]

With this approach you can extend the replacements dict to cover whatever other values you need.

While all the answers work, the problem you are describing seems to be prime decomposition. So to make this more general, you would ideally have a function that extracts prime factors.

One example could be:

import numpy as np

def prime_factors(n):
    if n == 0 or n == 1:
        return [n]

    factors = []
    while n % 2 == 0:  # Factor out 2s
        factors.append(2)
        n //= 2

    for i in range(3, int(np.sqrt(n)) + 1, 2): # Factor out odd numbers from 3 onwards
        while n % i == 0:
            factors.append(i)
            n //= i

    if n > 2: # If it is a prime
        factors.append(n)
    return factors

This is by no means optimized, but it could be a nice starting point. If efficiency is a concern, a prime sieve may help.

The result for the integers from 1 to 6 would then be:

print(np.array(sum([prime_factors(n) for n in range(1, 7)], [])))  # [1 2 3 2 2 5 2 3]
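As one way to follow up on the sieve idea, here is a minimal sketch of a smallest-prime-factor table. The names spf_sieve and factorize are made up for illustration. It assumes all inputs are non-negative integers no larger than a known limit, and it keeps the convention above that 0 and 1 are returned unchanged.

```python
import numpy as np

def spf_sieve(limit):
    """Return an array where spf[n] is the smallest prime factor of n (n >= 2)."""
    spf = np.arange(limit + 1)
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:  # p is prime
            multiples = np.arange(p * p, limit + 1, p)
            unset = spf[multiples] == multiples  # no smaller factor found yet
            spf[multiples[unset]] = p
    return spf

def factorize(n, spf):
    """Prime factors of n in ascending order, read off the sieve table."""
    if n < 2:
        return [n]
    factors = []
    while n > 1:
        p = int(spf[n])
        factors.append(p)
        n //= p
    return factors

spf = spf_sieve(6)
flat = np.array([f for n in range(1, 7) for f in factorize(n, spf)])
print(flat)  # [1 2 3 2 2 5 2 3]
```

The table is built once, after which each factorization only needs as many lookups as the number has prime factors.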

Here's a way using only numpy functions, with no explicit iteration. However, functions like delete and insert each allocate a new array of the right size and copy the values into their slots, so I doubt it has any speed advantage over the more Python-list-like answers.

In [237]: x = np.array([1,2,3,4,5,6])

Find the elements that need to be replaced:

In [238]: mask = np.nonzero((x[:,None]==[4,6]).any(axis=1))[0]; mask
Out[238]: array([3, 5], dtype=int64)

Delete those elements, and shift the indices in mask to account for the deletions:

In [239]: mask1=mask-np.arange(len(mask)); mask1
Out[239]: array([3, 4], dtype=int64)

In [240]: x1=np.delete(x,mask);x1
Out[240]: array([1, 2, 3, 5])

Now we can use insert:

In [241]: np.insert(x1,mask1.repeat(2), [2,2,2,3])
Out[241]: array([1, 2, 3, 2, 2, 5, 2, 3])

So far I've assumed that each value is replaced by 2 values. If we want to replace the 6 with [1,2,3] instead, we can pass a per-element repeat count:

In [243]: np.insert(x1,mask1.repeat([2,3]), [2,2,1,2,3])
Out[243]: array([1, 2, 3, 2, 2, 5, 1, 2, 3])
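The delete/insert steps above can be collected into one function. This is a sketch under assumptions: the name replace_with_insert and the dict-based interface are not part of the session above, and the per-match lookups use a short Python loop over the matched positions only.

```python
import numpy as np

def replace_with_insert(x, d):
    """Replace values of `x` found in dict `d` using np.delete and np.insert."""
    x = np.asarray(x)
    idx = np.nonzero(np.isin(x, list(d)))[0]   # positions to replace
    if idx.size == 0:
        return x.copy()
    matched = x[idx].tolist()
    counts = [len(d[v]) for v in matched]      # replacement length per match
    values = np.concatenate([d[v] for v in matched])
    x1 = np.delete(x, idx)                     # drop the matched items
    idx1 = idx - np.arange(len(idx))           # shift indices for the deletions
    return np.insert(x1, idx1.repeat(counts), values)

x = np.array([1, 2, 3, 4, 5, 6])
res_a = replace_with_insert(x, {4: [2, 2], 6: [2, 3]})
res_b = replace_with_insert(x, {4: [2, 2], 6: [1, 2, 3]})
print(res_a)  # [1 2 3 2 2 5 2 3]
print(res_b)  # [1 2 3 2 2 5 1 2 3]
```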

Tags: in Python, how can I replace a number from a numpy array with 2 others (Stack Overflow)