It looks like the issue is solved. Our sysadmin had been working on it too, and he noticed a lot of "junk" in my /etc/ld.so.conf.d/ directory. After he "cleaned" it out (I think he ended up wiping everything, rebooting the machine, then re-configuring specific items as needed), my OpenMPI installation is working fine.
I can now run "mpirun -np # hello_c" where # is any integer. The same holds true for our specialized applications (Gemini, Salinas, etc.).
Apologies - I don't know why "cleaning" this directory fixed things. I'm also not sure why OpenMPI stopped working in the first place. The timing seems to coincide with two updates to my machine: the kernel, and subsequently the Nvidia driver, were both updated right before "mpirun" stopped working correctly.
The sysadmin mentioned it could be related to ldconfig. Again, I don't know why this would cause "mpirun" to misbehave. However, everything appears to work correctly now.
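For anyone who lands here with a similar problem, below is a rough sketch of how the entries in that directory could be listed before cleaning anything out. The function name list_ld_conf_dirs is made up for illustration, and this is not what our sysadmin actually ran; it only assumes the usual layout of one library directory per line in each *.conf file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original thread): print every library
# directory named by the *.conf files in an ld.so.conf.d-style directory,
# prefixed with the file that names it.
list_ld_conf_dirs() {
    # Default to the system directory; a different one can be passed as $1.
    conf_dir="${1:-/etc/ld.so.conf.d}"
    for f in "$conf_dir"/*.conf; do
        # If the glob matched nothing, "$f" is the literal pattern: skip it.
        [ -e "$f" ] || continue
        # Drop comment lines and blank lines, then label each remaining entry.
        grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$f" |
        while IFS= read -r dir; do
            printf '%s: %s\n' "$(basename "$f")" "$dir"
        done
    done
}
```

Calling list_ld_conf_dirs with no argument reads /etc/ld.so.conf.d. Afterwards, "ldconfig -p | grep -i mpi" and "ldd ./hello_c" should show which MPI libraries the dynamic linker actually resolves, which may help spot a stale or unexpected entry.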
Thank you for your help, and hopefully this thread proves useful to someone in the future.
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph Castain
Sent: Tuesday, April 12, 2011 11:38
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.4.2 Hangs When Using More Than 1 Proc
Okay, that says that mpirun is working correctly - the problem appears to be in MPI_Init.
How was OMPI configured?
On Apr 12, 2011, at 9:24 AM, Stergiou, Jonathan C CIV NSWCCD West Bethesda, 6640 wrote:
> Thanks for the reply and guidance.
> I ran the following:
> $> mpirun -np 1 hostname
> $> mpirun -np 2 hostname
> $> mpirun -np 1 ./hello_c
> Hello, world, I am 0 of 1.
> $> mpirun -np 2 ./hello_c
> (no result, terminal does not respond until ctrl+c)
>> Let's simplify the issue as we have no idea what your codes are doing.
>> Can you run two copies of hostname, for example?
>> What about multiple copies of an MPI version of "hello" - see the examples directory in the OMPI tarball.