Open MPI User's Mailing List Archives


From: Noam Meltzer (noam_at_[hidden])
Date: 2007-08-22 08:31:20


I am running Open MPI 1.2.3, compiled for 64-bit, on RHEL4u4 with a
Voltaire InfiniBand interconnect.
When I run jobs manually with the following command:

/opt/local/openmpi-1.2.3-gcc4/bin/orterun -np 8 -hostfile ~/myHostList \
    -mca btl self,openib /tcc/eandm/performance/igor/main.exe.openmpi123

the job runs fine.

However, when I run the same job through SGE, I get the following error
on all hosts in my list:
The OpenIB BTL failed to initialize while trying to create an internal
queue. This typically indicates a failed OpenFabrics installation or
faulty hardware. The failure occured here:

    OMPI source: btl_openib.c:828
    Function: ibv_create_cq()
    Error: Invalid argument (errno=22)
    Device: mthca0

You may need to consult with your system administrator to get this
problem fixed.

To submit a job to the grid I use the following command:
qrsh -cwd -q noam.q -pe orte 8 ./myScript

where "myScript" contains:

/opt/local/openmpi-1.2.3-gcc4/bin/orterun -np $NSLOTS \
    -mca btl self,openib /tcc/eandm/performance/igor/main.exe.openmpi123

If I change "openib" to "tcp" in myScript, everything works fine.
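
Since the same command works by hand but fails under SGE, one way to narrow this down might be to compare the environment the job sees in each case. The script below is only a sketch: report_env is a hypothetical helper, and the idea that a resource limit (for example locked memory) or a library path differs under SGE is an assumption, not something confirmed above.

```shell
#!/bin/bash
# Hypothetical diagnostic helper (not part of the original setup):
# print settings that may differ between an interactive shell and a
# job started through SGE.
report_env() {
    echo "host: $(uname -n)"
    # Assumption: the locked-memory limit is worth checking for InfiniBand.
    echo "max locked memory (ulimit -l): $(ulimit -l)"
    echo "NSLOTS: ${NSLOTS:-unset}"
    echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH:-unset}"
}

report_env
```

Saved under a hypothetical name such as "checkEnv", it could be run once directly on a compute node and once with "qrsh -cwd -q noam.q -pe orte 8 ./checkEnv", and the two outputs compared.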

Any ideas?

Best regards,
Noam Meltzer
Software Support Engineer & RHCE
E&M Computing