
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
From: Gus Correa (gus_at_[hidden])
Date: 2009-06-23 16:18:15


Hi Jim

Jim Kress wrote:
> Are you speaking of the configure for the application or for OpenMPI?
>

I am speaking of OpenMPI configure.
Scott Hamilton also mentioned this,
when he answered you in the Rocks mailing list.

> I have no control over the application since it is provided as an executable
> only.
>

I understand that ORCA is a black box (or a black killer whale),
but if your OpenMPI was not built with IB support,
there is no hope that ORCA will use IB.
Did you check the output of ompi_info -config?
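These are the kinds of checks I mean (the command names are real; the
example output line is just illustrative, your version strings will differ):

```shell
# Does this Open MPI build include the openib BTL component?
ompi_info | grep btl
# A build with IB support should list a line along the lines of:
#   MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2)

# What flags and libraries did configure bake into the build?
ompi_info -config
```

If openib does not show up in the BTL list, the problem is the Open MPI
build itself, not ORCA.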

Some of my builds missed libnuma, others missed libtorque;
eventually I got it right.
Then the OpenMPI team changed configure
(somewhere along the 1.3 series), so I had to change my script again.

If the libraries aren't in standard places (/usr/lib, /usr/lib64),
and likewise the includes (/usr/include), you need to tell configure
where they are. See the OpenMPI README file and FAQ.
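As a sketch, a configure line pointing at non-standard locations might
look like this (the prefix and paths here are made-up examples for my
kind of setup; check ./configure --help for the exact option names in
your version):

```shell
# Hypothetical paths -- substitute the locations on your own cluster.
./configure --prefix=/opt/openmpi-1.3.2 \
            --with-openib=/usr \
            --with-libnuma=/usr \
            --with-tm=/opt/torque
make all install
```

Then rebuild, rerun ompi_info, and confirm openib shows up before
blaming the application.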

My $0.02.
Gus Correa

PS - BTW, what is your advice for a fellow trying to run the
computational chemistry software from Schroedinger.com?
I know nothing of comput-chem, an area where you are the pro.
This question came up on the Beowulf list, and apparently the darn piece
of software requires MPICH-1, and only executables are provided.
I know (you told me!) that MPICH-1 fails miserably with those
p4 errors on later Linux kernels, which is what the poor guy
is getting.
If he at least had the object files he could try to relink against
MPICH2, but apparently he only has executables (statically linked
to MPICH-1, I suppose).
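One rough way he could check what the binary was actually linked
against (the executable name here is hypothetical):

```shell
file ./some_executable   # reports "statically linked" or "dynamically linked"
ldd ./some_executable    # lists shared libraries, or "not a dynamic executable"
ldd ./some_executable | grep -i mpi   # any MPI shared library it pulls in
```

If file says statically linked, relinking against MPICH2 is off the
table without the object files.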

---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

> Jim
>
>> -----Original Message-----
>> From: users-bounces_at_[hidden]
>> [mailto:users-bounces_at_[hidden]] On Behalf Of Gus Correa
>> Sent: Tuesday, June 23, 2009 2:01 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] 50% performance reduction due to
>> OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead
>> of using Infiniband
>>
>> Hi Jim, list
>>
>> Have you checked if configure caught your IB libraries properly?
>> IIRC there have been some changes since 1.2.8 in how configure
>> searches for libraries (e.g. finding libnuma was a problem,
>> now fixed).
>> Chances are that if you used some old script or command line
>> to run configure, it may not have worked as you expected.
>>
>> Check the output of ompi_info -config.
>> It should show -lrdmacm -libverbs; otherwise configure skipped IB.
>> In that case you can reconfigure, pointing to the IB library location.
>>
>> If you have a log of your configure step you can also search
>> it for openib, libverbs, etc, to see if it did what you expected.
>>
>> I hope this helps,
>> Gus Correa
>> ---------------------------------------------------------------------
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> ---------------------------------------------------------------------
>>
>>
>> Pavel Shamis (Pasha) wrote:
>>> Jim,
>>> Can you please share with us your mca conf file.
>>>
>>> Pasha.
>>> Jim Kress ORG wrote:
>>>> For the app I am using, ORCA (a Quantum Chemistry
>> program), when it
>>>> was compiled using openMPI 1.2.8 and run under 1.2.8 with the
>>>> following in the openmpi-mca-params.conf file:
>>>>
>>>> btl=self,openib
>>>>
>>>> the app ran fine with no traffic over my Ethernet network and all
>>>> traffic over my Infiniband network.
>>>>
>>>> However, now that ORCA has been recompiled with openMPI v1.3.2 and
>>>> run under 1.3.2 (using the same openmpi-mca-params.conf file), the
>>>> performance has been reduced by 50% and all the MPI
>> traffic is going
>>>> over the Ethernet network.
>>>>
>>>> As a matter of fact, the openMPI v1.3.2 performance now
>> looks exactly
>>>> like the performance I get if I use MPICH 1.2.7.
>>>>
>>>> Anyone have any ideas:
>>>>
>>>> 1) How could this have happened?
>>>>
>>>> 2) How can I fix it?
>>>>
>>>> a 50% reduction in performance is just not acceptable. Ideas/
>>>> suggestions would be appreciated.
>>>>
>>>> Jim
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>