After trying several kernel versions, I narrowed the problem down to the
change from kernel 2.6.22 to 2.6.23. The big change in the process scheduler
in 2.6.23 turned out to be the Completely Fair Scheduler (CFS).
Applications that depend *heavily* on sched_yield()'s behaviour (e.g., many
benchmarks) can see huge performance gains or losses due to the very subtle
semantics of what sched_yield() should do and how CFS changes them. There is
a sysctl at /proc/sys/kernel/sched_compat_yield that you can set to "1" to
restore the pre-CFS sched_yield() behaviour; it is worth trying in those
cases.
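To enable it (as root; the setting does not survive a reboot unless it is
persisted, e.g. via /etc/sysctl.conf):

$ echo 1 > /proc/sys/kernel/sched_compat_yield

or, equivalently:

$ sysctl -w kernel.sched_compat_yield=1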
After setting /proc/sys/kernel/sched_compat_yield to "1", my hybrid
application performs well again.
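For context, here is a minimal, hypothetical sketch of the kind of
sched_yield()-based wait loop whose timing CFS changes (the names are made
up; my application's actual wait loop is more involved):

#include <sched.h>

volatile int workers_done = 0;  /* set to 1 by the last worker thread */

/* Spin-wait that relies on sched_yield() handing the CPU to other
 * runnable threads.  Before 2.6.23, each yield gave up the CPU for a
 * comparatively long time; under CFS's default yield semantics the
 * caller can be rescheduled almost immediately, so a loop like this
 * burns CPU time that the worker threads need. */
static void wait_for_workers(void)
{
    while (!workers_done)
        sched_yield();
}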
On Tue, Oct 25, 2011 at 10:26 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> My best guess is that you are seeing differences in scheduling behavior
> with respect to memory locality. I notice that you are not binding your
> processes, and so they are free to move around the various processors on the
> node. I would guess that your thread is winding up on a processor that is
> non-local to your memory in one case, but local to your memory in the other.
> This is an OS-related scheduler decision.
> You might try binding your processes to see if it helps. With threads, you
> don't really want to bind to a core, but binding to a socket should help.
> Try adding --bind-to-socket to your mpirun cmd line (you can't do this if
> you run it as a singleton - have to use mpirun).
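> For example, using the command line from earlier in this thread:
> mpirun -np 2 --bind-to-socket ./my_hybrid_app <number of threads>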
> On Oct 25, 2011, at 2:45 AM, Huiwei Lv wrote:
> Thanks, Ralph. Yes, I have taken that into account. The problem is not
> comparing two procs with one proc, but the "multi-threading effect":
> multi-threading helps on the first machine with both one and two procs, but
> on the second machine the benefit disappears with two procs.
> To narrow down the problem, I reinstalled the operating system on the
> second machine, going from SUSE 11 (kernel 2.6.27, gcc 4.3.4) to Red Hat 5.4
> (kernel 2.6.18, gcc 4.1.2), which is similar to the first machine (CentOS
> 5.3, kernel 2.6.18, gcc 4.1.2). Then the problem disappeared. So the problem
> must lie somewhere in the OS kernel or the GCC version. Any suggestions? Thanks.
> Huiwei Lv
> On Tue, Oct 25, 2011 at 3:11 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> Okay - thanks for testing it.
>> Of course, one obvious difference is that there isn't any communication
>> when you run only one proc, but there is when you run two or more, assuming
>> your application makes MPI send/recv calls (or calls collectives and other
>> functions that communicate). Communication to yourself is very fast, as no
>> bits actually move; sending messages to another proc is considerably slower.
>> Are you taking that into account?
>> On Oct 24, 2011, at 8:47 PM, Huiwei Lv wrote:
>> No. There's a difference between "mpirun -np 1 ./my_hybrid_app..."
>> and "mpirun -np 2 ./...".
>> Run "mpirun -np 1 ./my_hybrid_app..." will increase the performance with
>> more number of threads, but run "mpirun -np 2 ./..." decrease the
>> Huiwei Lv
>> On Tue, Oct 25, 2011 at 12:00 AM, <users-request_at_[hidden]> wrote:
>>> Date: Mon, 24 Oct 2011 07:14:21 -0600
>>> From: Ralph Castain <rhc_at_[hidden]>
>>> Subject: Re: [OMPI users] Hybrid MPI/Pthreads program behaves
>>> differently on two different machines with same hardware
>>> To: Open MPI Users <users_at_[hidden]>
>>> Does the difference persist if you run the single process using mpirun?
>>> In other words, does "mpirun -np 1 ./my_hybrid_app..." behave the same as
>>> "mpirun -np 2 ./..."?
>>> There is a slight difference in the way procs start when run as
>>> singletons. It shouldn't make a difference here, but worth testing.
>>> On Oct 24, 2011, at 12:37 AM, Huiwei Lv wrote:
>>> > Dear List,
>>> > I have a hybrid MPI/Pthreads program named "my_hybrid_app". The
>>> program is memory-intensive and takes advantage of multi-threading to
>>> improve memory throughput. I run "my_hybrid_app" on two machines that have
>>> the same hardware configuration but different OS and GCC versions. The
>>> problem is: when I run "my_hybrid_app" with one process, the two machines
>>> behave the same: the more threads, the better the performance. However,
>>> when I run "my_hybrid_app" with two or more processes, the first machine
>>> still improves performance with more threads, while the second machine
>>> degrades in performance with more threads.
>>> > Since running "my_hybrid_app" with one process behaves correctly, I
>>> suspect my linking to the MPI library has some problem. Would somebody point
>>> me in the right direction? Thanks in advance.
>>> > Attached below are the command line used, my machine information, and
>>> link information.
>>> > p.s. 1: Commandline
>>> > single process: ./my_hybrid_app <number of threads>
>>> > multiple process: mpirun -np 2 ./my_hybrid_app <number of threads>
>>> > p.s. 2: Machine Information
>>> > The first machine is CentOS 5.3 with GCC 4.1.2:
>>> > Target: x86_64-redhat-linux
>>> > Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
>>> --infodir=/usr/share/info --enable-shared --enable-threads=posix
>>> --enable-checking=release --with-system-zlib --enable-__cxa_atexit
>>> --disable-libunwind-exceptions --enable-libgcj-multifile
>>> --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
>>> --disable-dssi --enable-plugin
>>> --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic
>>> > Thread model: posix
>>> > gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)
>>> > The second machine is SUSE Enterprise Server 11 with GCC 4.3.4:
>>> > Target: x86_64-suse-linux
>>> > Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
>>> --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
>>> --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3
>>> --enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/
>>> --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap
>>> --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit
>>> --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
>>> --enable-version-specific-runtime-libs --program-suffix=-4.3
>>> --enable-linux-futex --without-system-libunwind --with-cpu=generic
>>> > Thread model: posix
>>> > gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux)
>>> > p.s. 3: ldd Information
>>> > The first machine:
>>> > $ ldd my_hybrid_app
>>> > libm.so.6 => /lib64/libm.so.6 (0x000000358d400000)
>>> > libmpi.so.0 => /usr/local/openmpi/lib/libmpi.so.0
>>> > libopen-rte.so.0 => /usr/local/openmpi/lib/libopen-rte.so.0
>>> > libopen-pal.so.0 => /usr/local/openmpi/lib/libopen-pal.so.0
>>> > libdl.so.2 => /lib64/libdl.so.2 (0x000000358d000000)
>>> > libnsl.so.1 => /lib64/libnsl.so.1 (0x000000358f000000)
>>> > libutil.so.1 => /lib64/libutil.so.1 (0x000000359a600000)
>>> > libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00002af0d5b07000)
>>> > libpthread.so.0 => /lib64/libpthread.so.0 (0x000000358d800000)
>>> > libc.so.6 => /lib64/libc.so.6 (0x000000358cc00000)
>>> > /lib64/ld-linux-x86-64.so.2 (0x000000358c800000)
>>> > librt.so.1 => /lib64/librt.so.1 (0x000000358dc00000)
>>> > The second machine:
>>> > $ ldd my_hybrid_app
>>> > linux-vdso.so.1 => (0x00007fff3eb5f000)
>>> > libmpi.so.0 => /root/opt/openmpi/lib/libmpi.so.0
>>> > libm.so.6 => /lib64/libm.so.6 (0x00007f686254b000)
>>> > libopen-rte.so.0 => /root/opt/openmpi/lib/libopen-rte.so.0
>>> > libopen-pal.so.0 => /root/opt/openmpi/lib/libopen-pal.so.0
>>> > libdl.so.2 => /lib64/libdl.so.2 (0x00007f6861ea1000)
>>> > libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f6861c89000)
>>> > libutil.so.1 => /lib64/libutil.so.1 (0x00007f6861a86000)
>>> > libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00007f686187d000)
>>> > libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6861660000)
>>> > libc.so.6 => /lib64/libc.so.6 (0x00007f6861302000)
>>> > /lib64/ld-linux-x86-64.so.2 (0x00007f6862a58000)
>>> > librt.so.1 => /lib64/librt.so.1 (0x00007f68610f9000)
>>> > I installed openmpi-1.4.2 into a user directory, /root/opt/openmpi, and
>>> used "-L/root/opt/openmpi -Wl,-rpath,/root/opt/openmpi" when linking.
>>> > --
>>> > Huiwei Lv
>>> > Ph.D. student at the Institute of Computing Technology,
>>> > Beijing, China
>>> > http://asg.ict.ac.cn/lhw