Open MPI Development Mailing List Archives

From: Li-Ta Lo (ollie_at_[hidden])
Date: 2007-08-30 11:07:44


On Thu, 2007-08-30 at 10:26 -0400, Rolf.Vandevaart_at_[hidden] wrote:
> Li-Ta Lo wrote:
>
> >On Tue, 2007-08-28 at 10:12 -0600, Brian Barrett wrote:
> >
> >
> >>On Aug 28, 2007, at 9:05 AM, Li-Ta Lo wrote:
> >>
> >>
> >>
> >>>On Mon, 2007-08-27 at 15:10 -0400, Rolf vandeVaart wrote:
> >>>
> >>>
> >>>>We are running into a problem when running on one of our larger SMPs
> >>>>using the latest Open MPI v1.2 branch. We are trying to run a job
> >>>>with np=128 within a single node. We are seeing the following error:
> >>>>
> >>>>"SM failed to send message due to shortage of shared memory."
> >>>>
> >>>>We then increased the allowable maximum size of the shared segment to
> >>>>2 Gigabytes - 1, which is the maximum allowed for a 32-bit
> >>>>application. We used the mca parameter to increase it as shown here:
> >>>>
> >>>>-mca mpool_sm_max_size 2147483647
> >>>>
> >>>>This allowed the program to run to completion. Therefore, we would
> >>>>like to increase the default maximum from 512 Mbytes to
> >>>>2 Gbytes - 1.
> >>>>Does anyone have an objection to this change? Soon we are going to
> >>>>have larger CPU counts and would like to increase the odds that
> >>>>things work "out of the box" on these large SMPs.
> >>>>
> >>>>
> >>>>
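As a quick check on the quoted value, 2 Gbytes - 1 is 2^31 - 1 bytes, the largest signed 32-bit integer. A minimal shell sketch of the arithmetic and a hypothetical invocation (the mpirun command line and the ./a.out application name are illustrative assumptions, not taken from the original message):

```shell
# 2 GB - 1 = 2^31 - 1, the largest value a signed 32-bit integer can hold
echo $((2 ** 31 - 1))

# Hypothetical invocation raising the shared-memory pool limit for a
# 128-process, single-node run (./a.out is a placeholder application):
# mpirun -np 128 -mca mpool_sm_max_size 2147483647 ./a.out
```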
> >>>There is a serious problem with the 1.2 branch: it does not allocate
> >>>any SM area for each process at the beginning. SM areas are allocated
> >>>on demand, and if some processes are more aggressive than
> >>>others, that can cause starvation. This problem is fixed in the trunk
> >>>by assigning at least one SM area to each process. I think this is
> >>>what you saw (starvation), and an increase of the max size may not be
> >>>necessary.
> >>>
> >>>
> >>Although I'm pretty sure this is fixed in the v1.2 branch already.
> >>
> >>
> >>
> >
> >It should never happen with the new code. The only way we can get the
> >message is when MCA_BTL_SM_FIFO_WRITE returns rc != OMPI_SUCCESS, but
> >the new MCA_BTL_SM_FIFO_WRITE always returns rc = OMPI_SUCCESS:
> >
> >#define MCA_BTL_SM_FIFO_WRITE(endpoint_peer, my_smp_rank,            \
> >                              peer_smp_rank, hdr, rc)                 \
> >do {                                                                  \
> >    ompi_fifo_t* fifo;                                                \
> >    fifo = &(mca_btl_sm_component.fifo[peer_smp_rank][my_smp_rank]);  \
> >                                                                      \
> >    /* thread lock */                                                 \
> >    if(opal_using_threads())                                          \
> >        opal_atomic_lock(fifo->head_lock);                            \
> >    /* post fragment */                                               \
> >    while(ompi_fifo_write_to_head(hdr, fifo,                          \
> >            mca_btl_sm_component.sm_mpool) != OMPI_SUCCESS)           \
> >        opal_progress();                                              \
> >    MCA_BTL_SM_SIGNAL_PEER(endpoint_peer);                            \
> >    rc = OMPI_SUCCESS;                                                \
> >    if(opal_using_threads())                                          \
> >        opal_atomic_unlock(fifo->head_lock);                          \
> >} while(0)
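The macro's key property is its retry loop: instead of returning a failure when the FIFO is full, it calls opal_progress() and tries again, so rc can only ever be set to OMPI_SUCCESS. A minimal stand-alone sketch of that spin-until-success pattern (the fifo_t type and the drain-on-full "progress" here are simplified stand-ins, not the actual Open MPI structures):

```c
#include <assert.h>

#define CAP 4  /* tiny fixed capacity so the full case is easy to hit */

typedef struct {
    int buf[CAP];
    int head, tail, count;
} fifo_t;

/* Try once; fail with -1 when the FIFO is full (the old failure path). */
static int fifo_try_write(fifo_t *f, int v) {
    if (f->count == CAP)
        return -1;
    f->buf[f->head] = v;
    f->head = (f->head + 1) % CAP;
    f->count++;
    return 0;
}

static int fifo_read(fifo_t *f, int *v) {
    if (f->count == 0)
        return -1;
    *v = f->buf[f->tail];
    f->tail = (f->tail + 1) % CAP;
    f->count--;
    return 0;
}

/* Spin until the write succeeds, mirroring the macro's
 * while (... != OMPI_SUCCESS) opal_progress(); loop.  "Progress" is
 * simulated by draining one element; in Open MPI it would let the peer
 * consume fragments.  The return value can only be 0 (success). */
static int fifo_write_spin(fifo_t *f, int v) {
    while (fifo_try_write(f, v) != 0) {
        int tmp;
        fifo_read(f, &tmp);  /* stand-in for opal_progress() */
    }
    return 0;  /* like rc = OMPI_SUCCESS */
}
```

Because the only exit from the loop is a successful write, a caller can never see the "shortage of shared memory" return path; the trade-off is that a peer which never makes progress turns the error into a spin, which is consistent with the hang reported later in the thread.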
> >
> >Rolf, are you using the very latest 1.2 branch?
> >
> >Ollie
> >
> >
> >
> Thanks for all the input. It turns out I was originally *not* using
> the latest 1.2 branch. So, we redid the tests with the latest 1.2.
> And, I am happy to report that we no longer get the "SM failed to
> send message due to shortage of shared memory" error. However,
> now the program hangs. So, it looks like we traded one problem for
> another.
>

Can I see your test code?

Ollie