Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?
From: Richard Treumann (treumann_at_[hidden])
Date: 2010-08-23 22:43:43

Network saturation could produce arbitrarily long delays, but the total data
load we are talking about is really small. It is the responsibility of an MPI
library to do one of the following:

1) Use a reliable message protocol for each message (e.g. InfiniBand RC), or
2) Detect lost packets and retransmit them if the protocol is unreliable
(e.g. InfiniBand UD or UDP/IP)
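
As a rough illustration of option 2, here is a minimal sketch of
detect-and-retransmit over a lossy channel. It is a toy simulation, not how
any particular MPI library does it: the names transfer and
UndeliverableError, the loss_rate model, and the bounded retry budget are all
assumptions made for the example.

```python
import random


class UndeliverableError(Exception):
    """Hypothetical error: a packet exhausted its retry budget."""


def transfer(packets, loss_rate, max_attempts, rng):
    """Deliver packets in order over a simulated unreliable channel.

    Each send is dropped with probability loss_rate. A drop stands in for
    an acknowledgement timeout, after which the same packet is resent.
    Returns (received_packets, total_sends).
    """
    received = []
    total_sends = 0
    for seq, data in enumerate(packets):
        for _ in range(max_attempts):
            total_sends += 1
            if rng.random() >= loss_rate:  # packet arrived and was acked
                received.append(data)
                break
        else:
            # Every attempt was lost: report failure instead of hanging.
            raise UndeliverableError(
                f"packet {seq} lost {max_attempts} times in a row"
            )
    return received, total_sends


if __name__ == "__main__":
    chunks = [b"chunk-%d" % i for i in range(8)]
    got, sends = transfer(chunks, loss_rate=0.3, max_attempts=20,
                          rng=random.Random(42))
    print(len(got), "packets delivered using", sends, "sends")
```

The point of the bounded retry budget is that a lossy link costs extra sends
but never silently drops data, and a dead link produces an error rather than
a stall.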

It is the responsibility of an MPI library to manage any MPI or system
buffers to prevent overrun. That is why I mention that 1/2 MB messages
would use rendezvous protocol. The send side would push a descriptor
(called an envelope) to the receive side. The receive side would push back
an OK_to_send once a matching receive was posted. The 1/2 MB message data
would not begin to flow across the network until the receive buffer was ready.
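
The handshake described above can be sketched as a small single-process
simulation. This is only an illustration under stated assumptions: the
Sender and Receiver classes, their method names, and the use of a tag as the
only matching key are all hypothetical, and real MPI matching also involves
source and communicator.

```python
class Receiver:
    """Toy receive side: tracks posted receives and early envelopes."""

    def __init__(self):
        self.posted = {}     # tag -> buffer posted by the application
        self.unmatched = {}  # tag -> size, envelopes with no receive yet

    def on_envelope(self, tag, size):
        """Return True if OK_to_send can be issued immediately."""
        if tag in self.posted:
            return True
        self.unmatched[tag] = size
        return False

    def post_receive(self, tag, size):
        """Post a receive; return True if a waiting envelope now matches."""
        self.posted[tag] = bytearray(size)
        return self.unmatched.pop(tag, None) is not None

    def on_data(self, tag, data):
        buf = self.posted[tag]  # KeyError if no receive was posted
        if len(data) > len(buf):
            raise ValueError("message larger than posted receive buffer")
        buf[:len(data)] = data


class Sender:
    """Toy send side: holds the payload until OK_to_send arrives."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.held = {}          # tag -> payload waiting for OK_to_send
        self.bytes_on_wire = 0  # payload bytes actually transferred

    def send(self, tag, data):
        self.held[tag] = data
        # Only the small envelope crosses the network at this point.
        if self.receiver.on_envelope(tag, len(data)):
            self.on_ok_to_send(tag)

    def on_ok_to_send(self, tag):
        data = self.held.pop(tag)
        self.bytes_on_wire += len(data)
        self.receiver.on_data(tag, data)


if __name__ == "__main__":
    rx = Receiver()
    tx = Sender(rx)
    payload = b"x" * (512 * 1024)  # the 1/2 MB message from the thread

    tx.send(7, payload)
    print("before receive is posted:", tx.bytes_on_wire, "bytes sent")

    if rx.post_receive(7, len(payload)):
        tx.on_ok_to_send(7)
    print("after receive is posted:", tx.bytes_on_wire, "bytes sent")
```

Because the payload stays on the send side until the receive is posted, a
large message cannot overrun buffers on the receive side no matter how late
the matching receive shows up.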

It is also the responsibility of an MPI library to detect when MPI level
messages have become undeliverable and fail the job.

Bugs are always a possibility but unless there is something very unusual
about the cluster and interconnect or this is an unstable version of MPI,
it seems very unlikely this use of MPI_Bcast with so few tasks and only a
1/2 MB message would trip on one. 80 tasks is a very small number in
modern parallel computing. MPI collectives involving thousands of tasks
have become pretty standard.
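
To put numbers on how small this load is, here is a back-of-the-envelope
sketch. The 80 tasks and 1/2 MB message come from this thread; the binomial
tree is only an assumed broadcast algorithm for illustration, and an MPI
library may well pick a different one. The function name bcast_load is
hypothetical.

```python
import math


def bcast_load(tasks, message_bytes):
    """Estimate the data moved by one broadcast.

    Assumes every non-root task receives the message exactly once and
    that the broadcast follows a binomial tree (an assumption, not a
    statement about any particular MPI implementation).
    Returns (total_bytes_received, tree_rounds).
    """
    total_bytes = (tasks - 1) * message_bytes
    rounds = math.ceil(math.log2(tasks))
    return total_bytes, rounds


if __name__ == "__main__":
    total, rounds = bcast_load(tasks=80, message_bytes=512 * 1024)
    print(f"{total / 2**20:.1f} MiB received in total, {rounds} tree rounds")
```

Under these assumptions the whole broadcast moves roughly 40 MB across the
cluster in a handful of rounds, which is far from what it takes to saturate
a modern interconnect.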

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363

users-bounces_at_[hidden] wrote on 08/23/2010 09:39:29 PM:

> I have had a similar load related problem with Bcast. I don't know
> what caused it though. With this one, what about the possibility of
> a buffer overrun or network saturation?