c++ - Parallelization of nested loops with OpenMP
I am trying to parallelize the following loop in my code with OpenMP:
double pottemp, pot2body;
pot2body = 0.0;
pottemp = 0.0;
#pragma omp parallel for reduction(+:pot2body) private(pottemp) schedule(dynamic)
for (int i = 0; i < nc2; i++)
{
    pottemp = ener2body[i]->calculatepot(ener2body[i]->m_mols);
    pot2body += pottemp;
}
As for the function 'calculatepot', the important loop inside it has also been parallelized with OpenMP:
cenergymulti::calculatepot(vector<cmolecule*> m_mols)
{
    ...
    #pragma omp parallel for reduction(+:dev) schedule(dynamic)
    for (int i = 0; i < i_max; i++)
    {
        ...
    }
}
So it seems that my parallelization involves nested loops. When I removed the parallelization of the outermost loop, the program seemed to run faster than the version with the outermost loop parallelized. The test was performed on 8 cores.
I think the low efficiency of the parallelization might be related to the nested loops. Someone suggested using 'collapse' when parallelizing the outermost loop. However, since there is still code between the outermost loop and the inner loop, I was told that 'collapse' cannot be used in this situation. Are there any other ways I could try to make this parallelization more efficient while still using OpenMP?
Thanks a lot.
If i_max is independent of i in the outer loop, you can try fusing the loops (which is essentially what collapse does). It's something I do often, and it usually gives me a small boost. I prefer fusing the loops "by hand" rather than with OpenMP because Visual Studio only supports OpenMP 2.0, which does not have collapse, and I want my code to work on both Windows and Linux.
#pragma omp parallel for reduction(+:pot2body) schedule(dynamic)
for (int n = 0; n < (nc2*i_max); n++)
{
    int i = n/i_max; // i for the outer loop
    int j = n%i_max; // j for the inner loop
    double pottmp_j = ...
    pot2body += pottmp_j;
}
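To make the index arithmetic concrete, here is a minimal sketch of the same fusion that compiles on its own. The names pair_term, sum_nested and sum_fused are hypothetical stand-ins (the real per-pair work lives inside calculatepot and is not shown in the question), and it assumes that nc2*i_max fits in an int and that i_max is the same for every i.

```cpp
// Hypothetical stand-in for the work done for one (i, j) pair.
// The real computation is inside calculatepot and is not shown here.
static double pair_term(int i, int j)
{
    return static_cast<double>(i + 1) * static_cast<double>(j + 1);
}

// Reference version: plain nested loops, no OpenMP.
double sum_nested(int nc2, int i_max)
{
    double total = 0.0;
    for (int i = 0; i < nc2; i++)
        for (int j = 0; j < i_max; j++)
            total += pair_term(i, j);
    return total;
}

// Fused version: a single flat loop over all nc2*i_max pairs.
// Assumes nc2*i_max does not overflow int and i_max does not depend on i.
double sum_fused(int nc2, int i_max)
{
    double total = 0.0;
    const int n_total = nc2 * i_max;
    #pragma omp parallel for reduction(+:total) schedule(dynamic)
    for (int n = 0; n < n_total; n++)
    {
        int i = n / i_max; // index of the original outer loop
        int j = n % i_max; // index of the original inner loop
        total += pair_term(i, j);
    }
    return total;
}
```

Without OpenMP enabled the pragma is simply ignored, so both functions can be compared serially first. With real floating-point terms the two sums may differ in the last bits, because the reduction adds the terms in a different order.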
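If i_max depends on i, then this won't work. In that case, follow Grizzly's advice. But there is one more thing you can try. OpenMP has an overhead. If i_max is too small, then using OpenMP could actually be slower. If you add an if clause at the end of the pragma, then OpenMP will only run the loop in parallel if the condition is true. Like this: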
const int threshold = ... // smallest value for which OpenMP gives a speedup
#pragma omp parallel for reduction(+:dev) schedule(dynamic) if(i_max > threshold)
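As a rough, self-contained illustration of the if clause, something like the following should work. The function sum_squares, its arguments, and the loop body are made up for the example; only the shape of the pragma comes from the code above, and the right threshold has to be found by timing on your own machine.

```cpp
#include <vector>

// Hypothetical example: sums the squares of the input values.
// The loop only runs in parallel when i_max exceeds the threshold;
// otherwise the same loop is executed by a single thread.
double sum_squares(const std::vector<double>& values, int threshold)
{
    double dev = 0.0;
    const int i_max = static_cast<int>(values.size());
    #pragma omp parallel for reduction(+:dev) schedule(dynamic) if(i_max > threshold)
    for (int i = 0; i < i_max; i++)
    {
        dev += values[i] * values[i];
    }
    return dev;
}
```

The result is the same on both paths; the clause only decides whether a team of threads is started, which is where the overhead for small i_max comes from.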