Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1
From: Reuti (reuti_at_[hidden])
Date: 2013-06-19 14:42:50


On 19.06.2013 at 19:43, Riccardo Murri <riccardo.murri_at_[hidden]> wrote:

> On 19 June 2013 16:01, Ralph Castain <rhc_at_[hidden]> wrote:
>> How is OMPI picking up this hostfile? It isn't being specified on the cmd line - are you running under some resource manager?
>
> Via the environment variable `OMPI_MCA_orte_default_hostfile`.
>
> We're running under SGE, but disable the OMPI/SGE integration (rather

It's disabled by default; you would have to activate it when running `configure` for Open MPI.
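
One way to check whether a given build has the SGE support compiled in is to look for the `gridengine` components in the output of `ompi_info`. The following is only a sketch: it assumes the `ompi_info` belonging to the installation in question is the first one in `PATH`, and `sge_support` is a made-up variable name used just for this example.

```shell
# Sketch: report whether this Open MPI build includes the SGE (gridengine) components.
# `sge_support` is a hypothetical variable name used only in this example.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | grep -q gridengine; then
        sge_support=yes
    else
        sge_support=no      # rebuilding with `./configure --with-sge ...` would enable it
    fi
else
    sge_support=unknown     # ompi_info was not found in PATH
fi
echo "SGE support compiled in: $sge_support"
```

If this prints `no`, the tight integration is not in the build at all, and unsetting `PE_HOSTFILE` should make no difference.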

> old version of SGE, does not coordinate well with OpenMPI);

In what sense? What do you observe when you use it? The `qrsh` startup has been working fine for a long time now.

-- Reuti

> here's the
> relevant snippet from our startup script:
>
>     # the OMPI/SGE integration does not seem to work with
>     # our SGE version; so use the `mpi` PE and direct OMPI
>     # to look for a "plain old" machine file
>     unset PE_HOSTFILE
>     if [ -r "${TMPDIR}/machines" ]; then
>         OMPI_MCA_orte_default_hostfile="${TMPDIR}/machines"
>         export OMPI_MCA_orte_default_hostfile
>     fi
>     GMSCOMMAND="$openmpi_root/bin/mpiexec -n $NCPUS --nooversubscribe $gamess $INPUT -scr $(pwd)"
>
> The `$TMPDIR/machines` hostfile is created from SGE's $PE_HOSTFILE by
> extracting the host names, and repeating each one for the given number
> of slots (unmodified code that comes with SGE):
>
>     PeHostfile2MachineFile()
>     {
>         cat $1 | while read line; do
>             # echo $line
>             host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
>             nslots=`echo $line|cut -f2 -d" "`
>             i=1
>             while [ $i -le $nslots ]; do
>                 echo $host
>                 i=`expr $i + 1`
>             done
>         done
>     }
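
To make the effect of that function concrete, here is a minimal sketch of the same expansion run on a made-up two-line PE hostfile. The function name `pe_hostfile_to_machines`, the host names, and the queue names are all invented for illustration. It is a rewrite under the assumption that each line has the form `<host> <nslots> <queue> <processor-range>`, not the code shipped with SGE.

```shell
# Sketch: expand an SGE-style PE hostfile into one short host name per slot.
# Function name and sample data are hypothetical, for illustration only.
pe_hostfile_to_machines() {
    # Assumed line format: <host> <nslots> <queue> <processor-range>
    while read -r host nslots rest; do
        host=${host%%.*}            # drop the domain part, keep the short name
        i=1
        while [ "$i" -le "$nslots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

sample=$(mktemp)
printf '%s\n' \
    'node01.example.org 2 all.q@node01.example.org UNDEFINED' \
    'node02.example.org 1 all.q@node02.example.org UNDEFINED' > "$sample"
machines=$(pe_hostfile_to_machines "$sample")
rm -f "$sample"
echo "$machines"
```

Note that the domain is cut off, so the machine file contains only short host names (`node01` twice, then `node02`), and these are the names Open MPI then has to match against the local host.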
>
> Thanks,
> Riccardo