On 10/20/2010 8:30 PM, Scott Atchley wrote:
> On Oct 20, 2010, at 9:22 PM, Raymond Muno wrote:
>> On 10/20/2010 7:59 PM, Ralph Castain wrote:
>>> The error message seems to imply that mpirun itself didn't segfault, but that something else did. Is that segfault pid from mpirun?
>>> This kind of problem usually is caused by mismatched builds - i.e., you compile against your new build, but you pick up the Myrinet build when you try to run because of PATH and LD_LIBRARY_PATH issues. You might check to ensure you are running against what you built with.
>> The PATH and LD_LIBRARY_PATH are set explicitly (through modules) on the frontend and each node. The PGI compiler and the OpenMPI I am trying to run are set for each.
> Are you building OMPI with support for both MX and IB? If not and you only want MX support, try configuring OMPI using --disable-memory-manager (check configure for the exact option).
> We have fixed this bug in the most recent 1.4.x and 1.5.x releases.
I just downloaded 1.4.3 and compiled it with PGI 10.4. I get the same error.
I did confirm that the process ID shown is that of mpirun.
This cluster only has Myrinet. The install is separate from the IB
cluster and a fresh build. I will try the configure option.