
Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI scaling > 512 cores
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-06-04 16:34:28


One other parameter that I neglected to mention (and Scott pointed out
to me is *not* documented in the FAQ) is the mpi_preconnect_oob MCA
param.

This parameter causes all the OOB connections to be created during
MPI_INIT, and *may* help with this kind of issue. You *do* need to have
enough file descriptors available per process to allow this to happen at
scale, of course. I'll try to add this information to the FAQ by the end
of this week.
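
As a rough pre-flight check for the file descriptor point above, here is a
minimal sketch in Python. It assumes each process may need about one socket
per peer plus some headroom. That estimate, the RESERVED_FDS value, and the
helper names fds_needed and fd_limit_ok are illustrative assumptions, not
anything Open MPI defines.

```python
import resource

# Assumption (not from Open MPI): descriptors set aside for stdio,
# shared libraries, interconnect devices, and so on.
RESERVED_FDS = 64


def fds_needed(num_procs, reserved=RESERVED_FDS):
    """Rough per-process estimate: one socket per peer plus headroom."""
    if num_procs < 1:
        raise ValueError("num_procs must be >= 1")
    return (num_procs - 1) + reserved


def fd_limit_ok(num_procs, soft_limit=None):
    """Return True if the soft RLIMIT_NOFILE covers the rough estimate."""
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= fds_needed(num_procs)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    for n in (512, 1024, 4096):
        print(n, "procs -> need about", fds_needed(n),
              "fds; ok:", fd_limit_ok(n))
```

If the limit looks sufficient, the parameter can be set with the usual MCA
syntax, e.g. "mpirun --mca mpi_preconnect_oob 1 ...". The check only reflects
the node it runs on, so it would need to be run on the compute nodes
themselves.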

This kind of thing is much better in the v1.3 series -- the linear TCP
wireup is no longer necessary (e.g., each MPI process only opens 1 TCP
socket: to the daemon on its host, etc.).

On Jun 4, 2008, at 4:14 PM, Åke Sandgren wrote:

> On Wed, 2008-06-04 at 11:43 -0700, Scott Shaw wrote:
>> Hi, I was wondering if anyone had any comments regarding the questions
>> I posted. Am I off base with my questions, or is this the
>> wrong forum for these types of questions?
>>
>>>
>>> Hi, I hope this is the right forum for my questions. I am running into a
>>> problem when scaling >512 cores on an InfiniBand cluster which has 14,336
>>> cores. I am new to openmpi and trying to figure out the right -mca options
>
> I don't have any real answer to your question, except that I have had
> no problems running HPL on our 672-node dual quad-core cluster
> (5376 cores) with InfiniBand.
> We use verbs.
> I wouldn't touch the oob parameters, since oob uses TCP over Ethernet to
> set up the environment.
>
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: ake_at_[hidden] Phone: +46 90 7866134 Fax: +46 90 7866126
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems