Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-05-06 15:10:34


Gus Correa wrote:

> 2) However, running with "sm" still breaks, unfortunately:
>
> I get the same errors that I reported in my very first email, if I
> increase the number of processes to 16, to explore the hyperthreading
> range.
>
> This is using "sm" (i.e. not excluded in the mca config file), and
> btl_sm_num_fifos (mpiexec command line)
>
> The machine hangs, requires a hard reboot, etc, etc, as reported earlier.

Okay. I think this is different from trac 2043, then, since that
involved a race condition that can be worked around by giving each
sender its own FIFO.
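
For readers following along, the per-sender FIFO workaround is applied through the btl_sm_num_fifos MCA parameter on the mpirun/mpiexec command line, as Gus describes. A minimal sketch, assuming 16 processes on a single node so that each receiver has at most 15 senders. The program name ./my_mpi_app is a placeholder, and the value 15 is an illustrative assumption, not a figure taken from this thread:

```shell
# Hypothetical invocation: ./my_mpi_app stands in for the real MPI program.
# btl_sm_num_fifos sets how many receive FIFOs each process gets in the
# shared-memory (sm) BTL. With np=16, a value of 15 leaves room for one
# FIFO per possible sender (assumed value, adjust to np-1 for other runs).
mpirun --mca btl_sm_num_fifos 15 -np 16 ./my_mpi_app
```

As noted above, this only addresses the race condition from trac 2043. It is not expected to help with the hang reported here.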

> So, I guess the conclusion is that I can use sm, but I have to remain
> within the range of physical cores (8), not oversubscribe, not try to
> explore the HT range. Should I expect it to work also for np>number
> of physical cores?

Yes, I believe that would be a reasonable expectation (under
circumstances other than the ones you're facing, in any case). I just
ran the examples/connectivity_c.c test with GCC on an 8-core Nehalem
system with HT turned on and tested up to np=64.