A small update.
My colleague made a mistake and there is no arithmetic performance
issue. Sorry for bothering you.
Nevertheless, we still observe differences between MPICH and OpenMPI
of 25% to 100%, depending on the options we use in our software. The
tests are run on a single SGI node with 6 or 12 processes, so I am
focusing on the sm option.
So, I have two questions:
1/ Can the option --mca mpool_sm_max_size=XXXX change anything? (I am
wondering whether the value is too small and, as a consequence, a set
of small messages is sent instead of a single big one.)
2/ Is there a difference between --mca btl tcp,sm,self and --mca btl
self,sm,tcp (or not passing any explicit mca option at all)? See the
commands sketched below.
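For reference, this is roughly what I am comparing (a sketch only; the
exact parameter names and defaults depend on the OpenMPI version, and
ompi_info should show them for a given installation; ./solver is just
a placeholder for our executable):

  # list the sm shared-memory parameters and their current defaults
  ompi_info --param mpool sm
  ompi_info --param btl sm

  # the two BTL orderings from question 2, on 12 processes of one node
  mpirun -np 12 --mca btl tcp,sm,self ./solver
  mpirun -np 12 --mca btl self,sm,tcp ./solver

  # question 1: explicitly enlarging the shared-memory pool
  # (536870912 = 512 MB is only an arbitrary example value)
  mpirun -np 12 --mca btl self,sm,tcp --mca mpool_sm_max_size 536870912 ./solver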
On 12/05/2010 06:10 PM, Eugene Loh wrote:
> Mathieu Gontier wrote:
>> Dear OpenMPI users
>> I am dealing with an arithmetic problem. In fact, I have two variants
>> of my code: one in single precision, one in double precision. When I
>> compare the two executables built with MPICH, one observes an
>> expected performance difference: 115.7 sec in single precision
>> against 178.68 sec in double precision (+54%).
>> The thing is, when I use OpenMPI, the difference is much bigger:
>> 238.5 sec in single precision against 403.19 sec in double precision
>> (+69%).
>> Our experience has already shown that OpenMPI is less efficient than
>> MPICH over Ethernet with a small number of processes. This explains
>> the difference between the first set of results with MPICH and the
>> second set with OpenMPI. (But if someone has more information about
>> that, or even a solution, I am of course interested.)
>> But using OpenMPI widens the difference between the two arithmetics.
>> Is this an accentuation of the OpenMPI+Ethernet performance loss, is
>> it another issue in OpenMPI, or is there an option I can use?
> It is also unusual that the performance difference between MPICH and
> OMPI is so large. You say that OMPI is slower than MPICH even at
> small process counts. Can you confirm that this is because MPI calls
> are slower? Some of the biggest performance differences I've seen
> between MPI implementations had nothing to do with the performance of
> MPI calls at all. They had to do with process binding or other factors
> that impacted the computational (non-MPI) performance of the code.
> The performance of MPI calls was basically irrelevant.
> In this particular case, I'm not convinced that is the explanation,
> since neither OMPI nor MPICH binds processes by default.
> Still, can you do some basic performance profiling to confirm what
> aspect of your application is consuming so much time? Is it a
> particular MPI call? If your application is spending almost all of
> its time in MPI calls, do you have some way of judging whether the
> faster performance is acceptable? That is, is 238 secs acceptable and
> 403 secs slow? Or, are both timings unacceptable -- e.g., the code
> "should" be running in about 30 secs.