Open MPI Development Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

From: Orion Poplawski (orion_at_[hidden])
Date: 2006-10-20 11:45:25


Reuti wrote:
> Hi,
>
> On 20.10.2006 at 01:08, Orion Poplawski wrote:
>
>> I'm starting to test out OpenMPI 1.2 tight integration with SGE and
>> have run into the following issue. Currently, my startmpi script
>> massages the hostnames in the machines file created from the SGE
>> pe_hostfile to add an "x" suffix on machines that are connected with a
>> separate GigE network dedicated for MPI traffic.
>>
>> With tight integration, openmpi uses the SGE pe_hostfile directly, e.g.:
>>
>> coop00.cora.nwra.com 2 coop.q_at_[hidden] <NULL>
>> coop01.cora.nwra.com 2 coop.q_at_[hidden] <NULL>
>>
>> Now, how (if at all) can I modify this so that MPI traffic goes to coop00x and
>> coop01x? One immediate problem that I'm running into is that the
>> startmpi script from the SGE PE runs as the user of the job so it
>> can't modify pe_hostfile.
>
> is the name of the pe_hostfile hardcoded, to point to the one in the
> nodes spool directory, or is OpenMPI using the $PE_HOSTFILE, which you
> could reset to a new name to point to a modified one? Another issue
> might be the back-channel of the communication, where sometimes simply
> the `hostname` of the sender is taken to answer.

(Sending this to the openmpi-devel list as well, to see what insight they
may have. This seems like a common use case.)

It uses $PE_HOSTFILE, so I made a startup script that created a new
pe_hostfile. This requires something like the following in my job script:

setenv PE_HOSTFILE $TMPDIR/pe_hostfile
orterun -np $NSLOTS $*

It is unfortunate that this can't be handled automatically somehow.
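
A minimal sketch of the hostname rewrite such a startup script has to perform, written in C to match the parsing excerpt quoted elsewhere in this message. The helper name add_host_suffix is hypothetical, and the sketch assumes the "x" suffix belongs on the short host name, before the first dot, as in the coop00x.cora.nwra.com entries in this message.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of SGE or Open MPI): copy one pe_hostfile
 * line into `out`, inserting `suffix` after the short host name of the
 * first field, e.g.
 *   "coop00.cora.nwra.com 2 ..." -> "coop00x.cora.nwra.com 2 ..."
 * Returns 0 on success, -1 if the line has no host name or `out` is too small. */
int add_host_suffix(const char *line, const char *suffix,
                    char *out, size_t outlen)
{
    /* Length of the short name: everything up to the first dot or the
     * end of the first whitespace-separated field. */
    size_t short_len = strcspn(line, ". \t\n");
    int n;

    if (short_len == 0)
        return -1;

    /* Short name, then the suffix, then the untouched rest of the line. */
    n = snprintf(out, outlen, "%.*s%s%s",
                 (int)short_len, line, suffix, line + short_len);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Applied to each line of the original $PE_HOSTFILE, this would produce the modified file that the job script then points PE_HOSTFILE at. Only the first field is changed; the slot count, queue, and processor fields pass through untouched.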

First tried:

coop01x.cora.nwra.com 2 coop.q_at_[hidden] <NULL>
coop00x.cora.nwra.com 2 coop.q_at_[hidden] <NULL>

Which yielded:

error: commlib error: access denied (client IP resolved to host name
"coop01x.cora.nwra.com". This is not identical to clients host name
"coop01.cora.nwra.com")
error: executing task of job 41354 failed: failed sending task to
execd_at_[hidden]: can't find connection
[coop01:27468] ERROR: A daemon on node coop00x.cora.nwra.com failed to
start as expected.
[coop01:27468] ERROR: There may be more information available from
[coop01:27468] ERROR: the 'qstat -t' command on the Grid Engine tasks.
[coop01:27468] ERROR: If the problem persists, please restart the
[coop01:27468] ERROR: Grid Engine PE job
[coop01:27468] ERROR: The daemon exited unexpectedly with status 1.
error: commlib error: access denied (client IP resolved to host name
"coop01x.cora.nwra.com". This is not identical to clients host name
"coop01.cora.nwra.com")
error: executing task of job 41354 failed: failed sending task to
execd_at_[hidden]: can't find connection

Then:

coop01x.cora.nwra.com 2 coop.q_at_[hidden] <NULL>
coop00x.cora.nwra.com 2 coop.q_at_[hidden] <NULL>

which yields:

error: commlib error: access denied (client IP resolved to host name
"coop01x.cora.nwra.com". This is not identical to clients host name
"coop01.cora.nwra.com")
error: executing task of job 41356 failed: failed sending task to
execd_at_[hidden]: can't find connection
error: commlib error: access denied (client IP resolved to host name
"coop01x.cora.nwra.com". This is not identical to clients host name
"coop01.cora.nwra.com")
[coop01:27945] ERROR: A daemon on node coop01x.cora.nwra.com failed to
start as expected.
[coop01:27945] ERROR: There may be more information available from
[coop01:27945] ERROR: the 'qstat -t' command on the Grid Engine tasks.
[coop01:27945] ERROR: If the problem persists, please restart the
[coop01:27945] ERROR: Grid Engine PE job
[coop01:27945] ERROR: The daemon exited unexpectedly with status 1.
error: executing task of job 41356 failed: failed sending task to
execd_at_[hidden]: can't find connection

Now, looking at the OpenMPI gridengine code, it looks like it gets the
node name from the first field of each pe_hostfile line, and never really
uses the queue name for anything.

         ptr = strtok_r(buf, " \n", &tok);
         num = strtok_r(NULL, " \n", &tok);
         queue = strtok_r(NULL, " \n", &tok);
         arch = strtok_r(NULL, " \n", &tok);
...
         node->node_name = strdup(ptr);
         node->node_arch = strdup(arch);

Perhaps it can be modified so that it uses the hostname from the queue
name when doing SGE/qrsh calls, but the first hostname when doing MPI
communication. I'm not really sure what the intent of the two fields in
SGE's pe_hostfile is, or whether OpenMPI can handle the idea of two
hostnames for different purposes.
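
A rough sketch of what keeping both names could look like. This is not the Open MPI code: the struct pe_entry, its field names, and parse_pe_line are all hypothetical. It also assumes that the queue field has the form <queue>@<host>, which the archive has obscured in the listings in this message.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record; these field names are not from the Open MPI source. */
struct pe_entry {
    char mpi_host[256];  /* first field: name to use for MPI traffic */
    char sge_host[256];  /* host part of the queue field: name for qrsh calls */
    int  slots;          /* second field: slot count */
};

/* Parse one pe_hostfile line, assumed to look like
 *   "<host> <slots> <queue>@<host> <processors>"
 * Returns 0 on success, -1 if fewer than three fields could be read. */
int parse_pe_line(const char *line, struct pe_entry *e)
{
    char queue[256];
    const char *at;

    if (sscanf(line, "%255s %d %255s", e->mpi_host, &e->slots, queue) != 3)
        return -1;

    /* Use the host after '@' if present; otherwise fall back to the
     * first field so that both names are always set. */
    at = strchr(queue, '@');
    snprintf(e->sge_host, sizeof e->sge_host, "%s",
             (at != NULL && at[1] != '\0') ? at + 1 : e->mpi_host);
    return 0;
}
```

Unlike the strtok_r excerpt, this checks that the expected fields are present before using them. Whether the launcher and the MPI transport can actually be given different names for the same node is the open question here.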

-- 
Orion Poplawski
System Administrator                  303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion_at_[hidden]
Boulder, CO 80301              http://www.cora.nwra.com