I have an array of shape say [6000,3] and I want to find the mean of each [3,3] section and output an array of size [5998]?
If I can do this without looping through the array that would be fantastic!

Asked Feb 24 at 11:24 by Polly Gill; edited Feb 28 at 16:50 by Fff.
  • I think the shape will be (5998,). The same formula as for convolution applies here: output_shape = input_shape - kernel_shape + 1 = 6000 - 3 + 1 = 5998. – berinaniesh Commented Feb 24 at 11:49
  • @berinaniesh In fact, a convolution could be used to solve the problem. – simon Commented Feb 24 at 11:50
  • @simon a convolution with a kernel of all ones? That's neat! – berinaniesh Commented Feb 24 at 11:55
  • @berinaniesh, exactly; or rather: a 3×3 kernel with 1/9 everywhere – simon Commented Feb 24 at 11:58
  • @berinaniesh You would need scipy.signal.convolve2d in that case – simon Commented Feb 24 at 12:02

5 Answers

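Just use convolve and mean:

import numpy as np

# x is the input array of shape (6000, 3)
out = np.convolve(
    np.mean(x, axis=1),            # mean of each row
    np.ones(3)/3, mode='valid',    # 3-row moving average, no padding
)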

P.S. Maybe [5998], not [5997]?
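
This works because every row has the same number of columns, so the mean of three row means equals the mean of all nine values in the block. Here is a minimal sketch that checks this against an explicit loop. The random test data and the names `fast` and `slow` are assumptions for illustration, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)   # assumed test data
x = rng.random((6000, 3))

# Row means first, then a length-3 moving average over them.
fast = np.convolve(np.mean(x, axis=1), np.ones(3) / 3, mode="valid")

# Reference: explicit loop over every 3x3 block (slow, for checking only).
slow = np.array([x[i:i + 3].mean() for i in range(x.shape[0] - 2)])

print(fast.shape)                # one value per window position
print(np.allclose(fast, slow))
```

The loop is only there as a reference; for real data the vectorised line is the one to use.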

Three more options:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy.signal import convolve2d
from timeit import Timer

a = np.random.normal(size=(6000, 3))

def conv(): return np.convolve(np.sum(a, axis=-1),
                               np.full(3, fill_value=1/9), mode="valid")
def conv2d(): return convolve2d(a, np.full((3, 3), fill_value=1/9),
                                mode="valid").ravel()
def winview(): return np.mean(sliding_window_view(np.mean(a, axis=-1),
                                                  window_shape=3), axis=-1)

assert np.allclose(conv(), conv2d())
assert np.allclose(conv(), winview())

for fct in (conv, conv2d, winview):
    print(fct.__name__, Timer(fct).timeit(1000))

The 1-d convolution option (conv()) seems to be the fastest:

conv 0.09130873999993128
conv2d 0.2098175920000358
winview 0.17042801200000213

Thanks to @Nin17 for the hints!

Given example data x:

import numpy as np

np.random.seed(0)
x = np.random.random((6000,3))
  • Option 1: using sliding_window_view
out = np.mean(np.lib.stride_tricks.sliding_window_view(x, (3,), axis = 0), axis=(-2,-1))
  • Option 2: using torch.nn.Conv2d
import torch
h = torch.nn.Conv2d(1, 1, 3, bias=False)
h.weight.data = torch.ones((1,1, 3, 3))/9
out = h(torch.tensor(x[None,None,:,:],dtype=torch.float)).detach().numpy().squeeze()
  • Option 3: using zip
idx, wndSize = np.arange(x.shape[0]), 3
out = np.mean(list(zip(*[x[idx[k:],:] for k in range(wndSize)])), axis = (1,2))

such that

print(f'out=\n {out}\n')
print(f'out shape =\n {out.shape}\n')

shows

out=
 [0.6415802  0.62350184 0.6179735  ... 0.4392221  0.50177455 0.57834566]

out shape =
 (5998,)

Pandas has some nice tools for working with rolling aggregation functions if you're willing to use that instead.

import numpy as np
import pandas as pd

np.random.seed(0)
x = np.random.random((6000, 3))
df = pd.DataFrame(x)
print(df.rolling(window=3).mean())

That gives a rolling mean per column. To average over each full 3×3 block, sum each row first, take the rolling sum, and then divide by 9:

print(df.sum(axis=1).rolling(window=3).sum() / 9)

Output

0            NaN
1            NaN
2       0.641580
3       0.623502
4       0.617974
          ...   
5995    0.529773
5996    0.435670
5997    0.439222
5998    0.501775
5999    0.578346
Length: 6000, dtype: float64
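
The pandas result keeps the original length, with NaN in the first two positions where the window is incomplete. If a plain NumPy array of shape (5998,) is wanted, something like the following sketch should do it, assuming the input itself contains no NaN values (`dropna` would otherwise also remove windows that touch them). The variable names are illustrative:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
x = np.random.random((6000, 3))

# Row sums, rolling sum over 3 rows, divided by the 9 values per block.
rolling = pd.DataFrame(x).sum(axis=1).rolling(window=3).sum() / 9

# Drop the two leading NaN entries and convert to a NumPy array.
out = rolling.dropna().to_numpy()

print(out.shape)
```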

You can also work with a prefix sum using np.cumsum:

import numpy as np

a = np.random.normal(size=(6000, 3))

width, n_channels = 3, a.shape[1]
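# Prepend 0 so the first window (rows 0..width-1) is included; result has shape (5998,)
a_cumsum = np.concatenate(([0.0], np.cumsum(np.sum(a, axis=-1), axis=0)))
result = (a_cumsum[width:] - a_cumsum[:-width]) / (width * n_channels)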

But this should scale similarly to native convolution for small kernels, so I do not expect any performance improvement over the other solutions.
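
As a sanity check, here is a sketch of the prefix-sum idea with a leading zero in the prefix array, so that the first window is included, compared against the 1-D convolution. The names `prefix`, `via_cumsum` and `via_conv` are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)   # assumed test data
a = rng.normal(size=(6000, 3))

width, n_channels = 3, a.shape[1]
row_sums = a.sum(axis=-1)

# prefix[i] is the sum of the first i rows, so prefix[0] == 0.
prefix = np.concatenate(([0.0], np.cumsum(row_sums)))
via_cumsum = (prefix[width:] - prefix[:-width]) / (width * n_channels)

# Same quantity via a 1-D convolution with a constant kernel.
kernel = np.full(width, 1 / (width * n_channels))
via_conv = np.convolve(row_sums, kernel, mode="valid")

print(via_cumsum.shape)
print(np.allclose(via_cumsum, via_conv))
```

One thing to keep in mind: prefix sums accumulate floating-point rounding error over long arrays, so for very long or large-magnitude data the convolution may be the more accurate choice.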

Maybe that helps!

Tags: python · How do I perform a rolling mean across a 2D array · Stack Overflow