
Suppose that A is a one-dimensional NumPy array and that n, s, n_y, n_x, blk_y, and blk_x are positive integers, where s divides both blk_y and blk_x. The number of elements in A equals both n*s*s and n_y*blk_y*n_x*blk_x. Write C code that fuses the following transpositions into a single pass over the array:

A = A.reshape(n, s, s).transpose(0, 2, 1)
A = A.reshape(n_y, n_x, blk_y, blk_x).transpose(0, 2, 1, 3)
A = A.reshape(n_y * blk_y, n_x * blk_x)

It is straightforward to convert the code to C if we are allowed to make multiple passes over the array, but I don't think multiple passes are necessary. More generally, if we have a long chain of reshape/transpose calls,

A.reshape(...).transpose(...).reshape(...).transpose(...).reshape(...).transpose(...)

can we derive a formula for reordering the array in one pass? I think we can, since we are just permuting the array according to some pattern. The transposition does not have to be in place. What matters is that the source array is read only once.
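
One possible way to fuse the three steps is to treat each reshape/transpose pair as a map on flat indices and compose the maps, so that every source element is read once and written straight to its final position. The following is only a sketch under stated assumptions: the function name fused_transpose is hypothetical, the elements are assumed to be double, the input is assumed to be C-contiguous (NumPy's default order), and the source and destination buffers are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of a fused, out-of-place version of:
 *   A = A.reshape(n, s, s).transpose(0, 2, 1)
 *   A = A.reshape(n_y, n_x, blk_y, blk_x).transpose(0, 2, 1, 3)
 *   A = A.reshape(n_y * blk_y, n_x * blk_x)
 * Assumptions (not from the question): double elements, C-contiguous
 * input, src and dst do not overlap.
 */
void fused_transpose(const double *src, double *dst,
                     size_t n, size_t s,
                     size_t n_y, size_t n_x,
                     size_t blk_y, size_t blk_x)
{
    const size_t total = n * s * s;
    assert(s > 0 && total == n_y * blk_y * n_x * blk_x);

    for (size_t k = 0; k < total; ++k) {
        /* Step 1: flat index k is element (t, j, i) of the (n, s, s) view;
           transpose(0, 2, 1) moves it to position (t, i, j). */
        size_t t = k / (s * s);
        size_t r = k % (s * s);
        size_t j = r / s;
        size_t i = r % s;
        size_t mid = (t * s + i) * s + j;

        /* Step 2: read the intermediate flat index as (y, x, by, bx)
           in the (n_y, n_x, blk_y, blk_x) view. */
        size_t bx = mid % blk_x;
        size_t q  = mid / blk_x;
        size_t by = q % blk_y;
        q /= blk_y;
        size_t x = q % n_x;
        size_t y = q / n_x;

        /* Step 3: transpose(0, 2, 1, 3) reorders the axes to (y, by, x, bx);
           the final reshape keeps this flat order. */
        size_t out = ((y * blk_y + by) * n_x + x) * blk_x + bx;

        dst[out] = src[k];
    }
}
```

This version walks the source sequentially and scatters into the destination. Because the composed map is a permutation, the loop could equally run over destination indices with the inverse map. The sketch does not exploit the fact that s divides blk_y and blk_x. That property would probably allow a tiled loop nest that avoids the per-element divisions. A longer chain of reshape/transpose calls should work the same way, with one decompose-and-recombine step per stage.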

Tags: C code for fusing multiple array transpositions into one pass, Stack Overflow