[From Eugene Loh:]
>> OpenMPI - 25 m 39 s.
>> MPICH2 - 15 m 53 s.
> With regards to your issue, do you have any indication when you get that
> 25m39s timing if there is a grotesque amount of time being spent in MPI
> calls? Or, is the slowdown due to non-MPI portions?
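To Eugene's question about time spent inside MPI calls: a lightweight profiler such as mpiP can report the fraction of wall time spent in each MPI routine without rebuilding the application. A hypothetical invocation is below -- the library path, process count, binary name, and input file are all assumptions for your setup, and mpiP must be built against the same MPI implementation you run with:

```shell
# Hypothetical sketch: preload mpiP so it intercepts and times MPI calls.
# The .so path depends on where mpiP was installed on your cluster, and
# mdrun_mpi / topol.tpr stand in for your actual binary and input.
LD_PRELOAD=/usr/local/mpiP/lib/libmpiP.so \
    mpirun -np 8 mdrun_mpi -s topol.tpr
# mpiP writes a *.mpiP report showing aggregate and per-call MPI time,
# which should tell you whether the 25m39s is MPI overhead or compute.
```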
Just to add my two cents: if this job *can* be run on less than 8
processors (ideally, even on just 1), then I'd recommend doing so. That is,
run it with OpenMPI and with MPICH2 on 1, 2 and 4 processors as well. If
the single-processor jobs still give vastly different timings, then perhaps
Eugene is on the right track, and it's differences in computational
optimizations rather than the message-passing itself that make the
difference. Timings from the 2- and 4-process runs would be interesting
as well, to see how the gap changes with process count.
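For what it's worth, the comparison runs above could be scripted roughly as follows. The binary name (mdrun_mpi), input file (topol.tpr), and the assumption that the right mpirun is on PATH for each library are all placeholders -- adjust for however Gromacs was built on your system. The loop only prints each command so it's safe to inspect; drop the echo to actually launch and time the jobs:

```shell
# Sketch: compare OpenMPI vs MPICH2 wall time at several process counts.
# Binary name (mdrun_mpi) and input (topol.tpr) are assumptions; each
# library's mpirun would need to be first on PATH for its set of runs.
for mpi in openmpi mpich2; do
    for np in 1 2 4 8; do
        # 'echo' prints the command instead of running it; remove the
        # echo (and prepend 'time') for the real measurements.
        echo "[$mpi] mpirun -np $np mdrun_mpi -s topol.tpr"
    done
done
```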
I've seen performance differences between MPI libraries before, but
nothing quite this severe. If I get the time, maybe I'll try to set up
Gromacs tonight -- I've got both MPICH2 and OpenMPI installed here and can
try to duplicate the runs. Sangamesh, is this a standard benchmark case
that anyone can download and run?
Yale Engineering HPC