Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-05-06 22:17:15


On May 6, 2010, at 8:36 PM, Gus Correa wrote:

> > BUT: a user can always override the "btl" MCA param and see them again. For example, you could also have done this:
> >
> > echo "btl =" > ~/.openmpi/mca-params.conf
> > ompi_info --all | grep btl_sm_num_fifos
> > # ...will show the sm params...
>
> Aha!
> Can they override my settings?!
> Can't anymore.
> I'm gonna write a BOFH cron script to run every 10 minutes,
> check for and delete any ~/.openmpi directory,
> shutdown the recalcitrant account, make a tarball of its ~ ,
> and send it to the mass store. Quarantined. :)

That's the spirit! I'll add this recipe to the FAQ.

;-)

> I don't think the problem is with Open MPI.
> So, it may not be easy to find a logical link between the kernel
> messages and the MPI hello_c that was running.

FWIW, we have found that the xlinpack_xeon64 benchmark binary that comes with the Intel compiler is very good at tracking down bad RAM and similar hardware problems. Do really large runs (as big as your memory can handle) for hours at a time.

> So, chances are that hyperthreading may give us a little edge,
> harnessing the code imperfections.
> Not a big one, maybe 10-20%, I would guess.
> I experienced that type of speedup with SMT/HT on an IBM machine
> with one of these big codes.

Indeed. I have seen some people enable HT in the BIOS just so that they have the software option of turning it off from Linux -- then you can run with HT and without it and see what it does to your specific codes.
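
For what it's worth, here is a rough sketch of one way to do the "turn it off from Linux" part. It assumes the mechanism is CPU hotplug through sysfs (thread_siblings_list and the per-CPU "online" file); that is an assumption on my part, not something spelled out in this thread. The helper name secondary_siblings is made up for illustration. The script only prints the commands it would run; actually writing to "online" needs root.

```shell
#!/bin/sh
# Sketch (assumed mechanism: Linux CPU hotplug via sysfs).
# secondary_siblings is a hypothetical helper: given a thread_siblings_list
# value such as "0,4" or "0-1", print every logical CPU after the first.
secondary_siblings() {
    first=1
    for part in $(echo "$1" | tr ',' ' '); do
        case "$part" in
            *-*) lo=${part%-*}; hi=${part#*-} ;;   # range such as 0-1
            *)   lo=$part; hi=$part ;;             # single CPU id
        esac
        n=$lo
        while [ "$n" -le "$hi" ]; do
            if [ "$first" -eq 1 ]; then
                first=0                            # keep the first sibling online
            else
                echo "$n"
            fi
            n=$((n + 1))
        done
    done
}

# Dry run: print the commands that would take the extra hardware threads offline.
for f in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
    [ -r "$f" ] || continue
    secondary_siblings "$(cat "$f")"
done | sort -un | while read -r cpu; do
    echo "echo 0 > /sys/devices/system/cpu/cpu$cpu/online"
done
```

Writing 1 back to the same "online" files should bring the siblings back, so you can compare both configurations without a trip to the BIOS.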

> I've got hello_c to run correctly with heavy oversubscription on
> our cluster nodes (up to 1024 processes on an 8-core node, IIRC).

Good.

> Heavier programs don't go this far, but still run with light
> oversubscription.

They probably are still running (even with heavy oversubscription), but at an absolutely glacial rate -- because all of the processes are aggressively competing for cycles. When running with any level of oversubscription, try using --mca mpi_yield_when_idle 1, which makes OMPI call sched_yield() inside of its progression loops. If you have MPI-heavy apps, it can make oversubscribed situations run faster than glacial.
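
A minimal sketch of what such a launch could look like. NP and APP are placeholders I picked for illustration (they are not values from this thread), and the script only launches if mpirun and the binary are actually present.

```shell
#!/bin/sh
# Sketch: oversubscribed launch with yield-when-idle enabled.
# NP and APP are hypothetical placeholders; adjust for your own job.
NP=${NP:-16}
APP=${APP:-./hello_c}

# Build the command line; the MCA param is the one discussed above.
CMD="mpirun --mca mpi_yield_when_idle 1 -np $NP $APP"
echo "$CMD"

# Only launch if mpirun is on the PATH and the binary exists.
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    $CMD
fi
```

The same param can also be set through the environment (OMPI_MCA_mpi_yield_when_idle=1) or with a line "mpi_yield_when_idle = 1" in ~/.openmpi/mca-params.conf, if you don't want to type it on every mpirun.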

> But on that Nehalem + Fedora 12 machine it doesn't work.
> So, the evidence is clear.
> The problem is not with Open MPI.

I'm glad it's not us, but I don't envy you trying to track down what the problem is. :-(

> >> Message from syslogd_at_spinoza at May 6 13:38:13 ...
> >> kernel:Code: 48 89 45 a0 4c 89 ff e8 e0 dd 2b 00 41 8b b6 58 03 00 00 4c 89 e7 ff c6 e8 b5 bc ff ff 41 8b 96 5c 03 00 00 48 98 48 39 d0 73 04 <0f> 0b eb fe 48 29 d0 48 89 45 a8 66 41 ff 07 49 8b 94 24 00 01
>
> I think the last one is hexa for Dante Alighieri's Inferno:
>
> "Lasciate ogni speranza voi ch'entrate"

Well done! I love it. :-)

-- 
Jeff Squyres
jsquyres_at_[hidden]