Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Hybrid MPI/Pthreads program behaves differently on two different machines with same hardware
From: 吕慧伟 (lvhuiwei_at_[hidden])
Date: 2011-10-25 04:45:25


Thanks, Ralph. Yes, I have taken that into account. The issue is not the
comparison of two procs against one proc, but the "multi-threading effect":
multi-threading helps on the first machine with both one and two procs, but on
the second machine the benefit disappears with two procs.
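
The comparison at issue can be sketched as a small sweep over process and
thread counts, run unchanged on both installs. This is only an illustration:
the thread counts, the use of wall-clock time as the performance measure, and
the helper names (build_command, time_run, sweep) are assumptions, not
anything taken from my_hybrid_app itself. Every run goes through mpirun,
including the one-proc case, so both process counts are launched the same way.

```python
import subprocess
import time

APP = "./my_hybrid_app"  # program name as given in the thread


def build_command(nprocs, nthreads, app=APP):
    """Command line for one run: mpirun -np <nprocs> <app> <nthreads>."""
    return ["mpirun", "-np", str(nprocs), app, str(nthreads)]


def time_run(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def sweep(proc_counts=(1, 2), thread_counts=(1, 2, 4, 8), runner=time_run):
    """Return {(nprocs, nthreads): seconds} for every combination.

    The default thread counts are hypothetical; use whatever the
    application actually supports.
    """
    return {
        (p, t): runner(build_command(p, t))
        for p in proc_counts
        for t in thread_counts
    }


if __name__ == "__main__":
    for (p, t), secs in sorted(sweep().items()):
        print(f"np={p} threads={t}: {secs:.2f} s")
```

If the np=2 times stop dropping as threads are added on one install but keep
dropping on the other, that reproduces the effect described above.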

To narrow down the problem, I reinstalled the operating system on the second
machine, replacing SUSE 11 (kernel 2.6.32.12, gcc 4.3.4) with Red Hat 5.4
(kernel 2.6.18, gcc 4.1.2), which is similar to the first machine (CentOS
5.3, kernel 2.6.18, gcc 4.1.2). The problem then disappeared, so it
must lie somewhere in the OS kernel or the GCC version. Any suggestions? Thanks.
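
One check that might help separate scheduler-related effects from
compiler-related ones is to print which CPUs each rank is allowed to run on
when started under mpirun, on both installs. The sketch below is a hedged
illustration, not a diagnosis: it assumes Linux and Python 3.3 or later for
os.sched_getaffinity, labels the output with the OMPI_COMM_WORLD_RANK
environment variable that Open MPI sets for launched processes, and uses a
hypothetical helper name (describe_affinity). A restricted CPU set would not
by itself explain the slowdown; it is only one thing worth ruling out.

```python
import os


def describe_affinity():
    """Return (rank_label, sorted CPU ids this process may run on)."""
    # Open MPI exports OMPI_COMM_WORLD_RANK to processes it launches;
    # outside mpirun the variable is absent, so fall back to a label.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "singleton")
    if hasattr(os, "sched_getaffinity"):
        # Linux: the set of CPUs the calling process is restricted to.
        cpus = sorted(os.sched_getaffinity(0))
    else:
        # Assumption for non-Linux systems: report all CPUs, since the
        # affinity mask cannot be queried this way there.
        cpus = list(range(os.cpu_count() or 1))
    return rank, cpus


if __name__ == "__main__":
    rank, cpus = describe_affinity()
    print(f"rank {rank}: allowed on {len(cpus)} CPU(s): {cpus}")
```

Saved under a name such as check_affinity.py (hypothetical), it could be run
as "mpirun -np 2 python3 check_affinity.py" and the per-rank CPU lists
compared between the two operating systems.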

--
Huiwei Lv

On Tue, Oct 25, 2011 at 3:11 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> Okay - thanks for testing it.
>
> Of course, one obvious difference is that there isn't any communication
> when you run only one proc, but there is when you run two or more, assuming
> your application calls MPI send/recv (or collectives and other functions
> that communicate). Communication to yourself is very fast as no
> bits actually move - sending messages to another proc is considerably
> slower.
>
> Are you taking that into account?
>
>
> On Oct 24, 2011, at 8:47 PM, 吕慧伟 wrote:
>
> No. There's a difference between "mpirun -np 1 ./my_hybrid_app..."
> and "mpirun -np 2 ./...".
>
> Running "mpirun -np 1 ./my_hybrid_app..." improves performance as the
> number of threads grows, but with "mpirun -np 2 ./..." performance
> degrades as threads are added.
>
> --
> Huiwei Lv
>
> On Tue, Oct 25, 2011 at 12:00 AM, <users-request_at_[hidden]> wrote:
>
>>
>> Date: Mon, 24 Oct 2011 07:14:21 -0600
>> From: Ralph Castain <rhc_at_[hidden]>
>> Subject: Re: [OMPI users] Hybrid MPI/Pthreads program behaves
>>        differently on  two different machines with same hardware
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <42C53D0B-1586-4001-B9D2-D77AF0033961_at_[hidden]>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Does the difference persist if you run the single process using mpirun? In
>> other words, does "mpirun -np 1 ./my_hybrid_app..." behave the same as
>> "mpirun -np 2 ./..."?
>>
>> There is a slight difference in the way procs start when run as
>> singletons. It shouldn't make a difference here, but worth testing.
>>
>> On Oct 24, 2011, at 12:37 AM, 吕慧伟 wrote:
>>
>> > Dear List,
>> >
>> > I have a hybrid MPI/Pthreads program named "my_hybrid_app". The program
>> is memory-intensive and takes advantage of multi-threading to improve memory
>> throughput. I run "my_hybrid_app" on two machines that have the same hardware
>> configuration but different OSes and GCC versions. The problem is this: when
>> I run "my_hybrid_app" with one process, the two machines behave the same: the
>> more threads, the better the performance. However, when I run
>> "my_hybrid_app" with two or more processes, the first machine still gains
>> performance with more threads, while the second machine loses performance
>> with more threads.
>> >
>> > Since "my_hybrid_app" behaves correctly with one process, I
>> suspect a problem with how I link against the MPI library. Could somebody
>> point me in the right direction? Thanks in advance.
>> >
>> > Attached are the command lines used, my machine information, and link
>> information.
>> > p.s. 1: Command lines
>> > single process: ./my_hybrid_app <number of threads>
>> > multiple processes: mpirun -np 2 ./my_hybrid_app <number of threads>
>> >
>> > p.s. 2: Machine Information
>> > The first machine is CentOS 5.3 with GCC 4.1.2:
>> > Target: x86_64-redhat-linux
>> > Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
>> --infodir=/usr/share/info --enable-shared --enable-threads=posix
>> --enable-checking=release --with-system-zlib --enable-__cxa_atexit
>> --disable-libunwind-exceptions --enable-libgcj-multifile
>> --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
>> --disable-dssi --enable-plugin
>> --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic
>> --host=x86_64-redhat-linux
>> > Thread model: posix
>> > gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)
>> > The second machine is SUSE Enterprise Server 11 with GCC 4.3.4:
>> > Target: x86_64-suse-linux
>> > Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
>> --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
>> --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
>> --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3
>> --enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/
>> --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap
>> --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit
>> --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
>> --enable-version-specific-runtime-libs --program-suffix=-4.3
>> --enable-linux-futex --without-system-libunwind --with-cpu=generic
>> --build=x86_64-suse-linux
>> > Thread model: posix
>> > gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux)
>> >
>> > p.s. 3: ldd Information
>> > The first machine:
>> > $ ldd my_hybrid_app
>> >         libm.so.6 => /lib64/libm.so.6 (0x000000358d400000)
>> >         libmpi.so.0 => /usr/local/openmpi/lib/libmpi.so.0
>> (0x00002af0d53a7000)
>> >         libopen-rte.so.0 => /usr/local/openmpi/lib/libopen-rte.so.0
>> (0x00002af0d564a000)
>> >         libopen-pal.so.0 => /usr/local/openmpi/lib/libopen-pal.so.0
>> (0x00002af0d5895000)
>> >         libdl.so.2 => /lib64/libdl.so.2 (0x000000358d000000)
>> >         libnsl.so.1 => /lib64/libnsl.so.1 (0x000000358f000000)
>> >         libutil.so.1 => /lib64/libutil.so.1 (0x000000359a600000)
>> >         libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00002af0d5b07000)
>> >         libpthread.so.0 => /lib64/libpthread.so.0 (0x000000358d800000)
>> >         libc.so.6 => /lib64/libc.so.6 (0x000000358cc00000)
>> >         /lib64/ld-linux-x86-64.so.2 (0x000000358c800000)
>> >         librt.so.1 => /lib64/librt.so.1 (0x000000358dc00000)
>> > The second machine:
>> > $ ldd my_hybrid_app
>> >         linux-vdso.so.1 =>  (0x00007fff3eb5f000)
>> >         libmpi.so.0 => /root/opt/openmpi/lib/libmpi.so.0
>> (0x00007f68627a1000)
>> >         libm.so.6 => /lib64/libm.so.6 (0x00007f686254b000)
>> >         libopen-rte.so.0 => /root/opt/openmpi/lib/libopen-rte.so.0
>> (0x00007f68622fc000)
>> >         libopen-pal.so.0 => /root/opt/openmpi/lib/libopen-pal.so.0
>> (0x00007f68620a5000)
>> >         libdl.so.2 => /lib64/libdl.so.2 (0x00007f6861ea1000)
>> >         libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f6861c89000)
>> >         libutil.so.1 => /lib64/libutil.so.1 (0x00007f6861a86000)
>> >         libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00007f686187d000)
>> >         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6861660000)
>> >         libc.so.6 => /lib64/libc.so.6 (0x00007f6861302000)
>> >         /lib64/ld-linux-x86-64.so.2 (0x00007f6862a58000)
>> >         librt.so.1 => /lib64/librt.so.1 (0x00007f68610f9000)
>> > I installed openmpi-1.4.2 into a user directory, /root/opt/openmpi, and
>> used "-L/root/opt/openmpi -Wl,-rpath,/root/opt/openmpi" when linking.
>> > --
>> > Huiwei Lv
>> > PhD. student at Institute of Computing Technology,
>> > Beijing, China
>> > http://asg.ict.ac.cn/lhw
>
>