Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with openmpi and infiniband
From: Tim Mattox (timattox_at_[hidden])
Date: 2008-12-24 22:07:44


For your runs with Open MPI over InfiniBand, try using openib,sm,self
for the BTL setting, so that shared memory communication is used
within a node. That would give us another data point to help diagnose
the problem. For the other information we would need, please follow
the advice in this FAQ entry and on the help page:
http://www.open-mpi.org/faq/?category=openfabrics#ofa-troubleshoot
http://www.open-mpi.org/community/help/
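
To make the comparison easy to repeat, here is a minimal sketch that only builds the mpirun command lines for the BTL settings mentioned in this thread. The helper name mpirun_command, the ./IMB-MPI1 path, and the Barrier argument are assumptions for illustration; adjust them (and add your hostfile or SGE options) to match your setup.

```python
import shlex


def mpirun_command(btl, nprocs, program, args=()):
    """Build an mpirun argument list that selects the given BTL components.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["mpirun", "--mca", "btl", ",".join(btl),
            "-np", str(nprocs), program, *args]


# BTL combinations discussed in this thread. The program path and the
# benchmark argument are placeholders, not taken from the original posts.
for btl in (["openib", "sm", "self"], ["openib", "self"], ["tcp", "self"]):
    print(shlex.join(mpirun_command(btl, 6, "./IMB-MPI1", ["Barrier"])))
```

Running the same test under each printed command line gives results that can be compared side by side.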

On Wed, Dec 24, 2008 at 5:55 AM, Biagio Lucini <B.Lucini_at_[hidden]> wrote:
> Pavel Shamis (Pasha) wrote:
>>
>> Biagio Lucini wrote:
>>>
>>> Hello,
>>>
>>> I am new to this list, where I hope to find a solution for a problem
>>> that I have been having for quite a long time.
>>>
>>> I run various versions of Open MPI (from 1.1.2 to 1.2.8) on a cluster
>>> with InfiniBand interconnects that I both use and administer. The
>>> OpenFabrics stack is OFED-1.2.5, and the compilers are gcc 4.2 and
>>> Intel. The queue manager is SGE 6.0u8.
>>
>> Do you use the Open MPI version that is included in OFED? Were you able
>> to run basic OFED/OMPI tests/benchmarks between two nodes?
>>
>
> Hi,
>
> yes to both questions: the OMPI version is the one that comes with OFED
> (1.1.2-1) and the basic tests run fine. For instance, IMB-MPI1 (which is
> more than basic, as far as I can see) reports for the last test:
>
> #---------------------------------------------------
> # Benchmarking Barrier
> # #processes = 6
> #---------------------------------------------------
> #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>         1000        22.93        22.95        22.94
>
>
> for the openib,self btl (6 processes, all processes on different nodes)
> and
>
> #---------------------------------------------------
> # Benchmarking Barrier
> # #processes = 6
> #---------------------------------------------------
> #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>         1000       191.30       191.42       191.34
>
> for the tcp,self btl (same test)
>
> No anomalies for other tests (ping-pong, all-to-all etc.)
>
> Thanks,
> Biagio
>
>
> --
> =========================================================
>
> Dr. Biagio Lucini
> Department of Physics, Swansea University
> Singleton Park, SA2 8PP Swansea (UK)
> Tel. +44 (0)1792 602284
>
> =========================================================

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmattox_at_[hidden] || timattox_at_[hidden]
    I'm a bright... http://www.the-brights.net/