Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI job initializing problem
From: Beichuan Yan (beichuan.yan_at_[hidden])
Date: 2014-03-03 16:48:58


Hi,

1. After the sysadmin installed the libibverbs-devel package, I built Open MPI 1.7.4 successfully with the command:
./configure --prefix=/work4/projects/openmpi/openmpi-1.7.4-gcc-compilers-4.7.3 --with-tm=/opt/pbs/default --with-verbs=/hafs_x86_64/devel/usr --with-verbs-libdir=/hafs_x86_64/devel/usr/lib64

2. Then I rebuilt and ran my job in hybrid MPI/OpenMP mode, where each compute node runs only 1 process (and that process runs 16 OpenMP threads). It gets initialized and runs well every time with the following $TCP setting, which is great:
TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt

3. Then I tested pure-MPI mode: OpenMP is turned off, and each compute node runs 16 processes (so MPI shared-memory communication is clearly in use). I tested four combinations of "TMPDIR" and "TCP":
case 1:
#export TMPDIR=/home/yanb/tmp
TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 64 -npernode 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
output:
Start Prologue v2.5 Mon Mar 3 15:47:16 EST 2014
End Prologue v2.5 Mon Mar 3 15:47:16 EST 2014
-bash: line 1: 448597 Terminated /var/spool/PBS/mom_priv/jobs/602244.service12.SC
Start Epilogue v2.5 Mon Mar 3 15:50:51 EST 2014
Statistics cpupercent=0,cput=00:00:00,mem=7028kb,ncpus=128,vmem=495768kb,walltime=00:03:24
End Epilogue v2.5 Mon Mar 3 15:50:52 EST 2014

case 2:
#export TMPDIR=/home/yanb/tmp
#TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 64 -npernode 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
output:
WARNING: Open MPI will create a shared memory backing file in a
directory that appears to be mounted on a network filesystem.
Creating the shared memory backup file on a network file system, such
as NFS or Lustre is not recommended -- it may cause excessive network
traffic to your file servers and/or cause shared memory traffic in
Open MPI to be much slower than expected.
You may want to check what the typical temporary directory is on your
node. Possible sources of the location of this temporary directory
include the $TEMPDIR, $TEMP, and $TMP environment variables.
Note, too, that system administrators can set a list of filesystems
where Open MPI is disallowed from creating temporary files by setting
the MCA parameter "orte_no_session_dir".
  Local host: r18i3n3
  Fileame: /work3/yanb/602252.SPIRIT/openmpi-sessions-yanb_at_r18i3n3_0/17232/1/shared_mem_pool.r18i3n3

case 3:
export TMPDIR=/home/yanb/tmp
#TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 64 -npernode 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
output:
Start Prologue v2.5 Mon Mar 3 16:00:12 EST 2014
End Prologue v2.5 Mon Mar 3 16:00:13 EST 2014
-bash: line 1: 479958 Terminated /var/spool/PBS/mom_priv/jobs/602260.service12.SC
Start Epilogue v2.5 Mon Mar 3 16:03:59 EST 2014
Statistics cpupercent=0,cput=00:00:00,mem=4872kb,ncpus=128,vmem=496216kb,walltime=00:03:38
End Epilogue v2.5 Mon Mar 3 16:04:00 EST 2014

case 4:
export TMPDIR=/home/yanb/tmp
TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 64 -npernode 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
output:
Start Prologue v2.5 Mon Mar 3 16:04:55 EST 2014
End Prologue v2.5 Mon Mar 3 16:04:56 EST 2014
-bash: line 1: 480449 Terminated /var/spool/PBS/mom_priv/jobs/602265.service12.SC
Start Epilogue v2.5 Mon Mar 3 16:13:31 EST 2014
Statistics cpupercent=0,cput=00:00:00,mem=7008kb,ncpus=128,vmem=496216kb,walltime=00:08:21
End Epilogue v2.5 Mon Mar 3 16:13:32 EST 2014

The problem is that with Open MPI my job cannot get initialized in any of the 4 pure-MPI cases.
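
For reference, the `--mca btl_tcp_if_include 10.148.0.0/16` setting used in the cases above is meant to restrict Open MPI's TCP traffic to interfaces whose addresses fall inside that subnet. The following is only a minimal sketch of the subnet-membership test itself, not Open MPI's implementation; the helper names `ip_to_int` and `ip_in_cidr` are made up for illustration, and the two addresses are the eth0 and ib0 addresses reported later in this thread.

```shell
#!/bin/bash
# Hypothetical helper: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    set -- $1                      # split the address on dots into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: return 0 if address $1 lies inside CIDR block $2.
ip_in_cidr() {
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")     # network part, before the slash
    bits=${2#*/}                   # prefix length, after the slash
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Compare the eth0 and ib0 addresses against the subnet used in the job script.
for addr in 192.168.159.205 10.148.0.114; do
    if ip_in_cidr "$addr" 10.148.0.0/16; then
        echo "$addr: inside 10.148.0.0/16"
    else
        echo "$addr: outside 10.148.0.0/16"
    fi
done
```

With this filter only the ib0 address qualifies, which is consistent with the setting being described as specifying the IB subnet.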

However, with Intel MPI the pure-MPI mode works very well, and I don't need to worry about anything like shared memory or the IB subnet:
mpirun -np 64 -perhost 16 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt

So I am wondering whether Open MPI has a known problem with shared-memory transfer and, if so, whether there is a solution?
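
The warning in case 2 says the shared-memory backing file landed on a network filesystem. One way to check where a candidate temporary directory lives is sketched below. It assumes GNU coreutils `stat` (the `-f -c %T` form) is available on the compute nodes; the helper names and the list of network filesystem types are illustrative and not exhaustive.

```shell
#!/bin/bash
# Hypothetical helper: classify a filesystem type name as network-backed
# (return 0) or not (return 1). The list of names is an assumption.
is_network_fs() {
    case "$1" in
        nfs|nfs3|nfs4|lustre|gpfs|panfs|cifs|afs) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the filesystem type of a directory (GNU stat syntax).
fs_type_of() {
    stat -f -c %T "$1"
}

# Check the directory Open MPI would most likely use for its session files.
dir="${TMPDIR:-/tmp}"
if [ -d "$dir" ] && fstype=$(fs_type_of "$dir" 2>/dev/null); then
    if is_network_fs "$fstype"; then
        echo "$dir is on $fstype (network filesystem); a node-local directory may be a better choice"
    else
        echo "$dir is on $fstype"
    fi
fi
```

Running this on a compute node inside the job, once with TMPDIR unset and once with TMPDIR=/home/yanb/tmp, would show whether either location is node-local.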

Thanks,
Beichuan

-----Original Message-----
From: Beichuan Yan
Sent: Sunday, March 02, 2014 00:56
To: 'Open MPI Users'
Subject: RE: [OMPI users] OpenMPI job initializing problem

Ralph and Gus,

1. Thank you for your suggestion. I built Open MPI 1.6.5 with the following command:
./configure --prefix=/work4/projects/openmpi/openmpi-1.6.5-gcc-compilers-4.7.3 --with-tm=/opt/pbs/default --with-openib= --with-openib-libdir=/usr/lib64

In my job script, I need to specify the IB subnet like this:
TCP="--mca btl_tcp_if_include 10.148.0.0/16"
mpirun $TCP -np 64 -hostfile $PBS_NODEFILE ./paraEllip3d input.txt

Then my job can get initialized and run correctly each time!

2. However, when I try to build Open MPI 1.7.4 with a similar command (in order to test/compare the shared-memory performance of Open MPI):
./configure --prefix=/work4/projects/openmpi/openmpi-1.7.4-gcc-compilers-4.7.3 --with-tm=/opt/pbs/default --with-verbs= --with-verbs-libdir=/usr/lib64

configure fails with the following error:
============================================================================
== Modular Component Architecture (MCA) setup
============================================================================
checking for subdir args... '--prefix=/work4/projects/openmpi/openmpi-1.7.4-gcc-compilers-4.7.3' '--with-tm=/opt/pbs/default' '--with-verbs=' '--with-verbs-libdir=/usr/lib64' 'CC=gcc' 'CXX=g++'
checking --with-verbs value... simple ok (unspecified)
checking --with-verbs-libdir value... sanity check ok (/usr/lib64)
configure: WARNING: Could not find verbs.h in the usual locations under
configure: error: Cannot continue

Our system is Red Hat 6.4. Do we need to install more InfiniBand packages? Can you please advise?
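
A quick way to see whether the header that configure is complaining about exists anywhere is sketched below. It assumes the usual libibverbs layout, in which the header is installed as include/infiniband/verbs.h under some prefix; the function name `find_verbs_prefix` and the candidate prefixes are illustrative guesses, not a list of the locations configure actually searches.

```shell
#!/bin/bash
# Hypothetical helper: print the first prefix that contains
# include/infiniband/verbs.h; return 1 if none of them does.
find_verbs_prefix() {
    local prefix
    for prefix in "$@"; do
        if [ -f "$prefix/include/infiniband/verbs.h" ]; then
            echo "$prefix"
            return 0
        fi
    done
    return 1
}

# Candidate prefixes are assumptions; adjust them for the local system.
if prefix=$(find_verbs_prefix /usr /usr/local /opt/ofed); then
    echo "verbs.h found; a possible configure option is --with-verbs=$prefix"
else
    echo "verbs.h not found; the libibverbs development package may be missing"
fi
```

If no prefix is found, the runtime library may be present in /usr/lib64 without its development headers, which would be consistent with the configure output above.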

Thanks,
Beichuan Yan

-----Original Message-----
From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Gus Correa
Sent: Friday, February 28, 2014 15:59
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI job initializing problem

Hi Beichuan

To add to what Ralph said,
the RHEL OpenMPI package probably wasn't built with PBS Pro support either.
Besides, OMPI 1.5.4 (RHEL version) is old.

You will save yourself time and grief if you read the installation FAQs, before you install from the source tarball:

http://www.open-mpi.org/faq/?category=building

However, as Ralph said, building from source is your best bet, and it is quite easy to get right.

See this FAQ on how to build with PBS Pro support:

http://www.open-mpi.org/faq/?category=building#build-rte-tm

And this one on how to build with Infiniband support:

http://www.open-mpi.org/faq/?category=building#build-p2p

Here is how to select the installation directory (--prefix):

http://www.open-mpi.org/faq/?category=building#easy-build

Here is how to select the compilers (gcc, g++, and gfortran are fine):

http://www.open-mpi.org/faq/?category=building#build-compilers
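
Once the build is installed, `ompi_info` lists the MCA components that were compiled in, which makes it possible to confirm that PBS (tm) and InfiniBand (openib) support made it into the build. The sketch below assumes the usual `MCA <framework>: <component> (...)` line format of `ompi_info` output; the helper name `has_component` is made up, and the component names checked may differ between Open MPI versions.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if the ompi_info-style text on stdin lists
# the given framework/component pair, e.g. "btl openib".
has_component() {
    grep -Eq "MCA[[:space:]]+$1:[[:space:]]+$2([[:space:]]|\$)"
}

# Only query ompi_info when it is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    for pair in "ras tm" "plm tm" "btl openib"; do
        set -- $pair               # split "framework component" into $1 and $2
        if ompi_info | has_component "$1" "$2"; then
            echo "$1/$2: present"
        else
            echo "$1/$2: missing"
        fi
    done
fi
```

Make sure the `ompi_info` on the PATH belongs to the installation under the chosen --prefix and not to the RHEL package.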

I hope this helps,
Gus Correa

On 02/28/2014 12:36 PM, Ralph Castain wrote:
> Almost certainly, the redhat package wasn't built with matching
> infiniband support and so we aren't picking it up. I'd suggest
> downloading the latest 1.7.4 or 1.7.5 nightly tarball, or even the
> latest 1.6 tarball if you want the stable release, and build it
> yourself so you *know* it was built for your system.
>
>
> On Feb 28, 2014, at 9:20 AM, Beichuan Yan <beichuan.yan_at_[hidden]> wrote:
>
>> Hi there,
>> I am running jobs on clusters with an InfiniBand interconnect. They
>> installed Open MPI v1.5.4 (via the Red Hat 6 yum package). My problem is
>> that although my jobs get queued and started by PBS Pro quickly,
>> most of the time they don't really run (only occasionally do they run)
>> and give error info like this (even though plenty of CPU/IB
>> resources are available):
>>
>> [r2i6n7][[25564,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>> connect() to 192.168.159.156 failed: Connection refused (111)
>>
>> And even when a job does get started and runs well, it prints this
>> error:
>>
>> --------------------------------------------------------------------------
>> WARNING: There was an error initializing an OpenFabrics device.
>> Local host: r1i2n6
>> Local device: mlx4_0
>> --------------------------------------------------------------------------
>>
>> 1. Here is the info from one of the compute nodes:
>>
>> -bash-4.1$ /sbin/ifconfig
>> eth0      Link encap:Ethernet  HWaddr 8C:89:A5:E3:D2:96
>>           inet addr:192.168.159.205  Bcast:192.168.159.255  Mask:255.255.255.0
>>           inet6 addr: fe80::8e89:a5ff:fee3:d296/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:48879864 errors:0 dropped:0 overruns:17 frame:0
>>           TX packets:39286060 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:54771093645 (51.0 GiB)  TX bytes:37512462596 (34.9 GiB)
>>           Memory:dfc00000-dfc20000
>>
>> Ifconfig uses the ioctl access method to get the full address
>> information, which limits hardware addresses to 8 bytes.
>> Because Infiniband address has 20 bytes, only the first 8 bytes are
>> displayed correctly.
>> Ifconfig is obsolete! For replacement check ip.
>> ib0       Link encap:InfiniBand  HWaddr 80:00:00:48:FE:C0:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>>           inet addr:10.148.0.114  Bcast:10.148.255.255  Mask:255.255.0.0
>>           inet6 addr: fe80::202:c903:fb:3489/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>>           RX packets:43807414 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:10534050 errors:0 dropped:24 overruns:0 carrier:0
>>           collisions:0 txqueuelen:256
>>           RX bytes:47824448125 (44.5 GiB)  TX bytes:44764010514 (41.6 GiB)
>>
>> lo        Link encap:Local Loopback
>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>           inet6 addr: ::1/128 Scope:Host
>>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>           RX packets:17292 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:17292 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:1492453 (1.4 MiB)  TX bytes:1492453 (1.4 MiB)
>>
>> -bash-4.1$ chkconfig --list iptables
>> iptables  0:off 1:off 2:on 3:on 4:on 5:on 6:off
>>
>> 2. I tried the various parameters below, but none of them ensures that
>> my jobs get initialized and run:
>> #TCP="--mca btl ^tcp"
>> #TCP="--mca btl self,openib"
>> #TCP="--mca btl_tcp_if_exclude lo"
>> #TCP="--mca btl_tcp_if_include eth0"
>> #TCP="--mca btl_tcp_if_include eth0, ib0"
>> #TCP="--mca btl_tcp_if_exclude 192.168.0.0/24,127.0.0.1/8 --mca oob_tcp_if_exclude 192.168.0.0/24,127.0.0.1/8"
>> #TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>> mpirun $TCP -hostfile $PBS_NODEFILE -np 8 ./paraEllip3d input.txt
>>
>> 3. Then I turned to Intel MPI, which surprisingly starts and runs my job
>> correctly each time (it is a little slower than OpenMPI, maybe
>> 15% slower, but it works each time).
>> Can you please advise? Many thanks.
>> Sincerely,
>> Beichuan Yan
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users