Jeff,
OK ... I rebuilt without --with-tm= and as predicted my test case
runs (I left the IB flags in). I then ran a job with just:
pbsdsh hostname
on 16 nodes and that also worked. I know that 1.4.1 works, although
it was built pointing into the old PBS Pro version tree explicitly. I
have checked and rechecked the environment variables and everything
else that could lead to some mixed-up version cross-referencing.
I am tempted to build 1.4.2 with the explicit --with-tm= version path
instead of using the symlink to default, but I cannot think of a logical
reason why that should do anything.
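One quick sanity check before rebuilding is to confirm that the "default" symlink and the explicit version path really resolve to the same PBS Pro tree. The following is only a minimal sketch: the function names resolve_tm_root and same_tm_root are hypothetical helpers, not part of Open MPI or PBS Pro, and the paths in the usage note are assumed placeholders.

```shell
# Hypothetical helpers (not part of Open MPI or PBS Pro).

# Print the physical directory that a path resolves to, following
# any symlinks along the way (e.g. a "default" link to a version tree).
resolve_tm_root() {
    # cd -P changes directory physically; pwd -P prints the resolved path.
    # The subshell keeps the caller's working directory unchanged.
    ( cd -P -- "$1" 2>/dev/null && pwd -P )
}

# Return 0 if both candidate --with-tm= paths name the same tree,
# 1 if they differ, and 2 if either path cannot be resolved.
same_tm_root() {
    a=$(resolve_tm_root "$1") || return 2
    b=$(resolve_tm_root "$2") || return 2
    [ "$a" = "$b" ]
}
```

Usage would look something like `same_tm_root /opt/pbs/default /opt/pbs/10.2.0.93147 && echo "same tree"`, where both paths are assumed examples to be replaced with the real install locations. If the two resolve to the same directory, building with the explicit path should give configure exactly the same tree as the symlink does.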
I have also reported this to the PBS Pro support folks.
Thanks for the suggestions,
rbw
Richard Walsh
Parallel Applications and Systems Manager
CUNY HPC Center, Staten Island, NY
718-982-3319
612-382-4620
Mighty the Wizard
Who found me at sunrise
Sleeping, and woke me
And learn'd me Magic!
________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Jeff Squyres [jsquyres_at_[hidden]]
Sent: Thursday, June 10, 2010 6:34 PM
To: Open MPI Users
Subject: Re: [OMPI users] Address not mapped segmentation fault with 1.4.2 ...
On Jun 10, 2010, at 5:49 PM, Richard Walsh wrote:
> OK ... so if I follow your lead and build a version without PBS --tm= integration
> and it works, I should be able to report this as an incompatibility bug between
> the latest version of PBS Pro (10.2.0.93147) and the latest version of OpenMPI
> (1.4.2), right? Do I report that to my friends at OpenMPI or my friends at
> PBS Pro (Altair), or both?
I'd say both.
But it would be quite surprising if tm_init() is wholly broken -- it's the very first function that has to be invoked.
I'm not a PBS user, so I don't know/remember the PBS commands offhand, but I have a dim recollection of a few PBS-provided TM-using tools (pbsdsh or somesuch?). You might want to try those, too, and see if they work/fail.
If it really is a problem, I'm guessing it'll be a compiler/linker issue somehow (e.g., how we're compiling/linking does not match the compilation/linking style of the TM library). That's a SWAG. :-)
--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/