Open MPI User's Mailing List Archives

Subject: [OMPI users] Program hangs
From: Jiaye Li (jameslipd_at_[hidden])
Date: 2009-11-20 18:09:04


Hi

I killed the job and resubmitted it. This time it ran to completion, but
today I found an even more serious problem with Open MPI. I compared the
results from mpich2 and ompi and found that the ompi results are wrong: the
run finished before its real end. In other words, the structure optimized by
VASP had not converged, yet the run was reported as successful. Amazing! For
the same initial structure, the run with mpich2 requires 80 ionic steps,
while the run with ompi needs only 40!
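
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the ionic step counts of two runs could be counted. It assumes each run wrote a VASP OSZICAR-style log in which every ionic step ends with a summary line such as "  12 F= -.85E+02 E0= ..."; the regular expression, the function names, and the file paths in the usage note are illustrative assumptions, not taken from the actual runs.

```python
import re

# Assumption: an ionic-step summary line starts with the step index
# followed by "F=", e.g. "  12 F= -.85097948E+02 E0= ...".
IONIC_STEP = re.compile(r"^\s*(\d+)\s+F=")


def count_ionic_steps(lines):
    """Count the lines that match the assumed ionic-step summary pattern."""
    return sum(1 for line in lines if IONIC_STEP.match(line))


def compare_runs(path_a, path_b):
    """Return the ionic step counts of two log files as a tuple."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return count_ionic_steps(file_a), count_ionic_steps(file_b)
```

Calling compare_runs("mpich2/OSZICAR", "ompi/OSZICAR") (hypothetical paths) would return the two step counts side by side. A differing count only shows that the runs stopped at different points; it does not say which MPI library is at fault.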

On Fri, Nov 20, 2009 at 4:20 PM, vasilis gkanis <gkanis_at_[hidden]> wrote:

> Hello,
>
> I am also experiencing a similar problem with the MUMPS solver when I run
> it on a cluster. After several hours of running, the code does not produce
> any results, although the command top shows that the program occupies 100%
> of the CPU.
>
> The difference here, however, is that the same program runs fine on my PC.
> The differences between my PC and the cluster are:
> 1) 32-bit vs 64-bit (cluster)
> 2) Intel compiler vs Portland compiler (cluster)
>
> Any thoughts on what might cause this?
>
> Thank you,
> Vasilis
>
>
> On Friday 20 November 2009 03:50:17 am Jiaye Li wrote:
> > Hello
> >
> > I installed openmpi-1.3.3 on my single-node (single-CPU) Intel 64-bit
> > quad-core machine. The compiler info is:
> >
> >
> >
> > ************************************************************************
> > intel-icc101018-10.1.018-1.i386
> > libgcc-4.4.0-4.i586
> > gcc-4.4.0-4.i586
> > gcc-gfortran-4.4.0-4.i586
> > gcc-c++-4.4.0-4.i586
> > intel-ifort101018-10.1.018-1.i386
> >
> > and the architecture is:
> >
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 23
> > model name : Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
> > stepping : 10
> > cpu MHz : 2825.937
> > cache size : 6144 KB
> > physical id : 0
> > siblings : 4
> > core id : 0
> > cpu cores : 4
> > apicid : 0
> > initial apicid : 0
> > fdiv_bug : no
> > hlt_bug : no
> > f00f_bug : no
> > coma_bug : no
> > fpu : yes
> > fpu_exception : yes
> > cpuid level : 13
> > wp : yes
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> > cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm
> > constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl vmx smx est
> > tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority
> > bogomips : 5651.87
> > clflush size : 64
> > power management:
> >
> >
> > ************************************************************************
> >
> > I compiled the PWscf program with openmpi and tested it. At the
> > beginning, the execution of PW went well, but after about 10 h, when
> > the program was about to finish, it hung while still using CPU time
> > (100% taken up by the program). There seems to be something wrong
> > somewhere. Any ideas? Thank you in advance.
> >
> > This is the config.log of Ompi:
> >
> >
> > ************************************************************************
> > This file contains any messages produced by compilers while
> > running configure, to aid debugging if configure makes a mistake.
> >
> > It was created by Open MPI configure 1.3.3, which was
> > generated by GNU Autoconf 2.63. Invocation command line was
> >
> > $ ./configure --prefix=/opt/openmpi-1.3.3 --disable-static CC=gcc FC=ifort F77=ifort --enable-shared
> >
> > ## --------- ##
> > ## Platform. ##
> > ## --------- ##
> >
> > hostname = localhost
> > uname -m = i686
> > uname -r = 2.6.29.4-167.fc11.i686.PAE
> > uname -s = Linux
> > uname -v = #1 SMP Wed May 27 17:28:22 EDT 2009
> >
> > /usr/bin/uname -p = unknown
> > /bin/uname -X = unknown
> >
> > /bin/arch = i686
> > /usr/bin/arch -k = unknown
> > /usr/convex/getsysinfo = unknown
> > /usr/bin/hostinfo = unknown
> > /bin/machine = unknown
> > /usr/bin/oslevel = unknown
> > /bin/universe = unknown
> >
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all
> > PATH: /home/jy/.wine/drive_c/windows
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src
> > PATH: /home/jy/bin/vtstscripts
> > PATH: /opt/mpich2-1.2/bin
> > PATH: /opt/intel/fc/10.1.018/bin
> > PATH: /opt/intel/cc/10.1.018/bin
> > PATH: /usr/lib/qt-3.3/bin
> > PATH: /usr/kerberos/bin
> > PATH: /usr/lib/ccache
> > PATH: /usr/local/bin
> > PATH: /usr/bin
> > PATH: /bin
> > PATH: /usr/local/sbin
> > PATH: /usr/sbin
> > PATH: /sbin
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src/scripts
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src/util
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/scripts
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/util
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/scripts
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/util
> > PATH: /home/jy/bin
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/scripts
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/util
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/scripts
> > PATH: /home/jy/Download/XCrySDen-1.5.21-src-all/util
> >
> >
> > ## ----------- ##
> > ## Core tests. ##
> > ## ----------- ##
> >
> > configure:3424: checking for a BSD-compatible install
> > configure:3492: result: /usr/bin/install -c
> > configure:3503: checking whether build environment is sane
> > configure:3546: result: yes
> > configure:3571: checking for a thread-safe mkdir -p
> > configure:3610: result: /bin/mkdir -p
> > configure:3623: checking for gawk
> > configure:3639: found /usr/bin/gawk
> > configure:3650: result: gawk
> > configure:3661: checking whether make sets $(MAKE)
> > configure:3683: result: yes
> > configure:3853: checking how to create a ustar tar archive
> > configure:3866: tar --version
> > tar (GNU tar) 1.22
> > Copyright (C) 2009 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>.
> >
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.
> >
> > .........
> >
> >
> > configure: exit 0
> >
> >
> >
> > ************************************************************************
> >
>

-- 
Sincerely yours
Jiaye Li