Good morning. I am trying to parallelize a loop with OpenMP in a flood simulator I wrote, with the goal of getting a performance boost. The loop I want to parallelize is as follows:

#define accessMat( arr, exp1, exp2 ) arr[ (int)(exp1) * columns + (int)(exp2) ] 
#define PRECISION       1000000
#define FIXED(a)    ( (int)((a) * PRECISION) )

max_spillage_iter = 0.0;
        for ( row_pos=0; row_pos<rows; row_pos++ ) {
            for ( col_pos=0; col_pos<columns; col_pos++ ) {
                // If the cell has spillage
                if( accessMat( spillage_flag, row_pos, col_pos ) == 1 ) {

                    // Eliminate the spillage from the origin cell
                    accessMat( water_level, row_pos, col_pos ) -= FIXED( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR );

                    // Compute termination condition: Maximum cell spillage during the iteration
                    if ( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR > max_spillage_iter ) {
                        max_spillage_iter = accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR;
                    }
                    // Statistics: Record maximum cell spillage during the scenario and its time
                    if ( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR > max_spillage_scenario ) {
                        max_spillage_scenario = accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR;
                        max_spillage_minute = minute;
                    }
                }

                // Accumulate spillage from neighbors
                for ( cell_pos= 0; cell_pos < CONTIGUOUS_CELLS; cell_pos++ ) {
                    int depths = CONTIGUOUS_CELLS;
                    accessMat( water_level, row_pos, col_pos ) += FIXED( accessMat3D( spillage_from_neigh, row_pos, col_pos, cell_pos ) / SPILLAGE_FACTOR );
                }
            }
        }


I have tried this:

#pragma omp parallel for collapse(2),private(row_pos, col_pos, cell_pos),shared(water_level, minute),reduction(max:local_max_spillage_iter, local_max_spillage_scenario)
        for ( row_pos=0; row_pos<rows; row_pos++ ) {
            for ( col_pos=0; col_pos<columns; col_pos++ ) {
                // If the cell has spillage
                if( accessMat( spillage_flag, row_pos, col_pos ) == 1 ) {

                    // Eliminate the spillage from the origin cell
                    accessMat( water_level, row_pos, col_pos ) -= FIXED( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR );

                    // Compute termination condition: Maximum cell spillage during the iteration
                    if ( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR > max_spillage_iter ) {
                        max_spillage_iter = accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR;
                    }
                    // Statistics: Record maximum cell spillage during the scenario and its time
                    if ( accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR > max_spillage_scenario ) {
                        max_spillage_scenario = accessMat( spillage_level, row_pos, col_pos ) / SPILLAGE_FACTOR;
                        max_spillage_minute = minute;
                    }
                }

                // Accumulate spillage from neighbors
                for ( cell_pos= 0; cell_pos < CONTIGUOUS_CELLS; cell_pos++ ) {
                    int depths = CONTIGUOUS_CELLS;
                    #pragma omp atomic
                    accessMat( water_level, row_pos, col_pos ) += FIXED( accessMat3D( spillage_from_neigh, row_pos, col_pos, cell_pos ) / SPILLAGE_FACTOR );
                }
            }
        }

but the results differ from those I get without parallelization. I have tried every modification I can think of, but I cannot get it right. Any suggestions, or pointers to errors that could be causing the inconsistent results, would be appreciated.
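
One thing that stands out in the attempt: the `reduction` clause names `local_max_spillage_iter` and `local_max_spillage_scenario`, while the loop body still updates `max_spillage_iter` and `max_spillage_scenario`. If those two are declared outside the parallel region they are shared by default, so the compare-then-store on them is a data race and a smaller value can overwrite a larger one. The sketch below shows one way the reduction could be wired up. It is an illustration under assumptions, not the original program: the function `apply_spillage`, the struct `SpillStats`, the values of `SPILLAGE_FACTOR` and `CONTIGUOUS_CELLS`, the array element types, and the flat layout used in place of `accessMat3D` are all hypothetical.

```c
#include <assert.h>

#define PRECISION        1000000
#define FIXED(a)         ( (int)((a) * PRECISION) )
#define SPILLAGE_FACTOR  2   /* assumed value, not given in the question */
#define CONTIGUOUS_CELLS 4   /* assumed value, not given in the question */

/* Hypothetical container for the three statistics the loop maintains. */
typedef struct {
    float max_spillage_iter;
    float max_spillage_scenario;
    int   max_spillage_minute;
} SpillStats;

/* Hypothetical wrapper around the loop from the question. */
void apply_spillage(int rows, int columns, int minute,
                    int *water_level, const int *spillage_flag,
                    const float *spillage_level,
                    const float *spillage_from_neigh,
                    SpillStats *stats)
{
    assert(rows > 0 && columns > 0);

    /* The reduction variables are the SAME names the loop body updates. */
    float iter_max = 0.0f;
    float scen_max = stats->max_spillage_scenario;
    const float scen_before = scen_max;

    /* Loop indices declared in the for statements are private automatically. */
    #pragma omp parallel for collapse(2) reduction(max: iter_max, scen_max)
    for (int row = 0; row < rows; row++) {
        for (int col = 0; col < columns; col++) {
            int idx = row * columns + col;

            if (spillage_flag[idx] == 1) {
                float spill = spillage_level[idx] / SPILLAGE_FACTOR;
                water_level[idx] -= FIXED(spill);
                if (spill > iter_max) iter_max = spill;
                if (spill > scen_max) scen_max = spill;
            }

            /* Each (row, col) iteration writes only its own cell, so no
               atomic is needed here. The 3D layout is an assumption. */
            for (int k = 0; k < CONTIGUOUS_CELLS; k++) {
                water_level[idx] +=
                    FIXED(spillage_from_neigh[idx * CONTIGUOUS_CELLS + k]
                          / SPILLAGE_FACTOR);
            }
        }
    }

    stats->max_spillage_iter = iter_max;
    /* minute is constant during the loop, so it can be recorded once,
       after the reduction, instead of being written by every thread. */
    if (scen_max > scen_before) {
        stats->max_spillage_scenario = scen_max;
        stats->max_spillage_minute = minute;
    }
}
```

With GCC or Clang this would be compiled with `-fopenmp`; without that flag the pragma is ignored and the loop runs serially, which makes it easy to compare both results on the same input.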
