Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Performance question about OpenMPI and MVAPICH2 on IB
From: Craig Tierney (craig.tierney_at_[hidden])
Date: 2009-08-07 09:14:22


neeraj_at_[hidden] wrote:
> Hi Craig,
>
> How was the nodefile selected for execution? Was it provided by a
> scheduler such as LSF/SGE/PBS, or did you supply it manually?
> With WRF, we observed that sequential nodes (blades in the same
> order as in the enclosure) gave us some performance benefit.
>
> Regards
>

I figured this might be the case. Right now the batch system
is giving the nodes to the application. They are not sorted,
and I have considered doing that. I have also launched numerous
cases of one problem size, and I don't see enough variation in
run time to explain the differences between the MPI stacks.
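
If I do sort it, a minimal sketch might be (hypothetical: the
machinefile path and $JOB_ID are stand-ins for whatever the
scheduler provides, and plain "sort" may mis-order hostnames
with numeric suffixes):

  sort /tmp/$JOB_ID/machines > /tmp/$JOB_ID/machines.sorted
  mpirun -machinefile /tmp/$JOB_ID/machines.sorted -np $NSLOTS ./wrf.exe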

Craig

> Neeraj Chourasia (MTS)
> Computational Research Laboratories Ltd.
> (A wholly Owned Subsidiary of TATA SONS Ltd)
> B-101, ICC Trade Towers, Senapati Bapat Road
> Pune 411016 (Mah) INDIA
> (O) +91-20-6620 9863 (Fax) +91-20-6620 9862
> M: +91.9225520634
>
>
>
> *Craig Tierney <Craig.Tierney_at_[hidden]>*
> Sent by: users-bounces_at_[hidden]
> 08/07/2009 04:43 AM
> Please respond to: Open MPI Users <users_at_[hidden]>
> To: Open MPI Users <users_at_[hidden]>
> Subject: Re: [OMPI users] Performance question about OpenMPI and MVAPICH2 on IB
>
> Gus Correa wrote:
> > Hi Craig, list
> >
> > I suppose WRF uses MPI collective calls (MPI_Reduce,
> > MPI_Bcast, MPI_Alltoall, etc.),
> > just like the climate models we run here do.
> > A recursive grep on the source code will tell.
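> >
> > For example, a quick count might be (a sketch; the source directory
> > name is an assumption, and the match is case-insensitive since WRF
> > is Fortran):
> >
> >   grep -riE 'call +mpi_(reduce|allreduce|bcast|alltoall)' WRFV3/ | wc -l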
> >
>
> I will check this out. I am not the WRF expert, but
> I was under the impression that most weather models rely on
> nearest-neighbor communication, not collectives.
>
>
> > If that is the case, you may need to tune the collectives dynamically.
> > We are experimenting with tuned collectives here also.
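> >
> > In Open MPI that means the "tuned" coll component's MCA parameters,
> > along these lines (a sketch; the rules filename is an assumption):
> >
> >   mpirun --mca coll_tuned_use_dynamic_rules 1 \
> >          --mca coll_tuned_dynamic_rules_filename ./dyn_rules.txt ...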
> >
> > Specifically, we had a scaling problem with the MITgcm
> > (also running on an IB cluster)
> > that is probably due to collectives.
> > Similar problems were reported on this list before,
> > with computational chemistry software.
> > See these threads:
> > http://www.open-mpi.org/community/lists/users/2009/07/10045.php
> > http://www.open-mpi.org/community/lists/users/2009/05/9419.php
> >
> > If WRF outputs timing information, particularly the time spent on MPI
> > routines, you may also want to compare how the OpenMPI and
> > MVAPICH versions fare w.r.t. MPI collectives.
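> >
> > One low-effort way, assuming the mpiP profiler is installed and WRF
> > is dynamically linked (the library path below is a stand-in), is to
> > preload it and compare the aggregate MPI times from both stacks:
> >
> >   mpirun -np $NSLOTS -x LD_PRELOAD=/opt/mpiP/lib/libmpiP.so ./wrf.exe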
> >
> > I hope this helps.
> >
>
> I will look into this. Thanks for the ideas.
>
> Craig
>
>
>
> > Gus Correa
> > ---------------------------------------------------------------------
> > Gustavo Correa
> > Lamont-Doherty Earth Observatory - Columbia University
> > Palisades, NY, 10964-8000 - USA
> > ---------------------------------------------------------------------
> >
> >
> >
> > Craig Tierney wrote:
> >> I am running openmpi-1.3.3 on my cluster, which uses
> >> OFED-1.4.1 for InfiniBand support. I am comparing performance
> >> between this version of OpenMPI and MVAPICH2, and I am seeing a
> >> very large difference in performance.
> >>
> >> The code I am testing is WRF v3.0.1. I am running the
> >> 12km benchmark.
> >>
> >> The two builds use exactly the same code and configuration
> >> files. All I did differently was use modules to switch MPI
> >> versions and recompile the code.
> >>
> >> Performance (GFlop/s, so larger is better):
> >>
> >>   Cores   MVAPICH2   Open MPI
> >>   ----------------------------
> >>       8       17.3       13.9
> >>      16       31.7       25.9
> >>      32       62.9       51.6
> >>      64      110.8       92.8
> >>     128      219.2      189.4
> >>     256      384.5      317.8
> >>     512      687.2      516.7
> >>
> >> I am calling openmpi as:
> >>
> >> /opt/openmpi/1.3.3-intel/bin/mpirun --mca plm_rsh_disable_qrsh 1 \
> >>     --mca btl openib,sm,self \
> >>     -machinefile /tmp/6026489.1.qntest.q/machines -x LD_LIBRARY_PATH \
> >>     -np $NSLOTS /home/ctierney/bin/noaa_affinity ./wrf.exe
> >>
> >> So,
> >>
> >> Is this expected? Are there some common-sense optimizations to use?
> >> Is there a way to verify that I am really using the IB? When
> >> I try:
> >>
> >> -mca btl ^tcp,openib,sm,self
> >>
> >> I get the errors:
> >>
> --------------------------------------------------------------------------
> >>
> >> No available btl components were found!
> >>
> >> This means that there are no components of this type installed on your
> >> system or all the components reported that they could not be used.
> >>
> >> This is a fatal error; your MPI process is likely to abort. Check the
> >> output of the "ompi_info" command and ensure that components of this
> >> type are available on your system. You may also wish to check the
> >> value of the "component_path" MCA parameter and ensure that it has at
> >> least one directory that contains valid MCA components.
> >>
> --------------------------------------------------------------------------
> >>
> >>
> >> But ompi_info is telling me that I have openib support:
> >>
> >> MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.3)
> >>
> >> Note, I did rebuild OFED and put it in a different directory
> >> and did not rebuild OpenMPI. However, since ompi_info isn't
> >> complaining and the libraries are available, I am thinking that
> >> it isn't a problem. I could be wrong.
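> >>
> >> (One sanity check, if I understand the "^" syntax correctly: it
> >> negates the list, so "^tcp,openib,sm,self" disables every BTL,
> >> which would explain the error above. Excluding only TCP would be
> >> "--mca btl ^tcp", and restricting to
> >>
> >>   mpirun --mca btl openib,self ...
> >>
> >> should fail loudly across nodes if IB is not actually usable.)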
> >>
> >> Thanks,
> >> Craig
> >
>
>
> --
> Craig Tierney (craig.tierney_at_[hidden])