Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Asynchronous behaviour of MPI Collectives
From: Igor Kozin (i.n.kozin_at_[hidden])
Date: 2009-01-23 09:27:22


Hi Gabriele,
Your message size might be too large for the memory available per node.
I ran into this with IMB: I could not run Alltoall to completion with
N=128, ppn=8 on our cluster, which has 16 GB per node. You'd think 16 GB
is quite a lot, but do the maths:
2 * 4 MB * 128 procs * 8 procs/node = 8 GB/node, and you need to double
that because of buffering. Mellanox told me (our cards are ConnectX) that
they introduced XRC in OFED 1.3, in addition to the Shared Receive Queue,
which should reduce the memory footprint, but I have not tested this yet.
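
The estimate above can be reproduced with a few lines of Python. This is
only a sketch of the arithmetic, not a measurement: the function name and
the buffering_factor parameter are made up for illustration, the leading 2
is read here as one send plus one receive buffer per process, and the
factor of 2 for buffering is the rule of thumb quoted above.

```python
def alltoall_mem_per_node_gb(msg_mb, nprocs, ppn, buffering_factor=2):
    """Rough memory needed per node for an Alltoall, in GB (1 GB = 1024 MB).

    Hypothetical helper: it only restates the back-of-the-envelope
    formula from this message.
    """
    # Assumption: each process holds a send and a receive buffer (the 2),
    # each with one msg_mb-sized block per process in the job.
    per_proc_mb = 2 * msg_mb * nprocs
    # ppn processes share the memory of one node.
    buffers_gb = per_proc_mb * ppn / 1024
    # Assumption: internal buffering doubles the footprint.
    return buffers_gb, buffers_gb * buffering_factor


buffers_gb, total_gb = alltoall_mem_per_node_gb(msg_mb=4, nprocs=128, ppn=8)
print(f"user buffers: {buffers_gb:.0f} GB/node, "
      f"with buffering: {total_gb:.0f} GB/node")
```

With the numbers from this message the doubled figure comes to 16 GB,
which is all the memory the node has.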
HTH,
Igor
2009/1/23 Gabriele Fatigati <g.fatigati_at_[hidden]>

> Hi Igor,
> My message size is 4096 KB and I have 4 procs per core.
> There isn't any difference when using different algorithms.
>
> 2009/1/23 Igor Kozin <i.n.kozin_at_[hidden]>:
> > What is your message size and the number of cores per node?
> > Is there any difference using different algorithms?
> >
> > 2009/1/23 Gabriele Fatigati <g.fatigati_at_[hidden]>
> >>
> >> Hi Jeff,
> >> I would like to understand why, when I run on 512 procs or more, my
> >> code hangs in an MPI collective, even with a small send buffer. All
> >> processes are stuck in the call, doing nothing. But if I add an
> >> MPI_Barrier after the MPI collective, it works! I run over an
> >> InfiniBand network.
> >>
> >> I know many people with this strange problem; I think there is a
> >> strange interaction between InfiniBand and Open MPI that causes it.
> >>
> >>
> >>
> >> 2009/1/23 Jeff Squyres <jsquyres_at_[hidden]>:
> >> > On Jan 23, 2009, at 6:32 AM, Gabriele Fatigati wrote:
> >> >
> >> >> I've noticed that Open MPI behaves asynchronously in collective
> >> >> calls: a process does not wait for the other procs to arrive in
> >> >> the call.
> >> >
> >> > That is correct.
> >> >
> >> >> This behaviour can sometimes cause problems in jobs with a lot of
> >> >> processors.
> >> >
> >> > Can you describe what exactly you mean? The MPI spec specifically
> >> > allows this behavior; OMPI made specific design choices and
> >> > optimizations to support this behavior. FWIW, I'd be pretty
> >> > surprised if any optimized MPI implementation defaults to fully
> >> > synchronous collective operations.
> >> >
> >> >> Is there an Open MPI parameter to hold all processes in the
> >> >> collective call until it has finished? Otherwise I have to insert
> >> >> many MPI_Barrier calls in my code, which is very tedious and
> >> >> strange.
> >> >
> >> > As you have noted, MPI_Barrier is the *only* collective operation
> >> > that MPI guarantees to have any synchronization properties (and
> >> > it's a fairly weak guarantee at that; no process will exit the
> >> > barrier until every process has entered the barrier -- but there's
> >> > no guarantee that all processes leave the barrier at the same
> >> > time).
> >> >
> >> > Why do you need your processes to exit collective operations at
> >> > the same time?
> >> >
> >> > --
> >> > Jeff Squyres
> >> > Cisco Systems
> >> >
> >> > _______________________________________________
> >> > users mailing list
> >> > users_at_[hidden]
> >> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Ing. Gabriele Fatigati
> >>
> >> Parallel programmer
> >>
> >> CINECA Systems & Tecnologies Department
> >>
> >> Supercomputing Group
> >>
> >> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
> >>
> >> www.cineca.it Tel: +39 051 6171722
> >>
> >> g.fatigati [AT] cineca.it