
Open MPI User's Mailing List Archives


Subject: [OMPI users] FW: mpirun hangs when used on more than 2 CPUs ( mpirun compiled without thread support )
From: Theiner, Andre (andre.theiner_at_[hidden])
Date: 2012-01-19 05:51:32


Hi all,
I have to stop my investigation and repair work at my customer's request.
I will unsubscribe from this list soon.

I found out that OpenFoam does not make threaded MPI calls.
My next step would have been to compile openmpi-1.4.4 and have the user try it.
If that had not worked either, I would have compiled all of OpenFoam from source.
So far the user has been running an RPM binary version of OF 2.0.1.

Thanks for all your support.

Andre

-----Original Message-----
From: Theiner, Andre
Sent: Wednesday, 18 January 2012 10:15
To: 'Open MPI Users'
Subject: RE: [OMPI users] mpirun hangs when used on more than 2 CPUs ( mpirun compiled without thread support )
Importance: High

Thanks, Jeff and Ralph, for your help.
I do not know yet whether OpenFoam uses threads with OpenMPI, but I will find out.

I ran "ompi_info"; its output is listed below.
The important line is "Thread support: posix (mpi: no, progress: no)".
At first sight this line made me think I had found the cause of the problem,
but I compared it with the output of the same command on another machine
where OpenFoam runs fine. The OpenMPI version on that machine is 1.3.2-1.1, and it
also has no thread support.
The difference, though, is that that machine runs OpenFoam 1.7.1 rather than 2.0.1, and its
OS is SUSE Linux Enterprise Desktop 11 SP1 rather than openSUSE 11.3.
So I am still at the beginning of the search for the cause of the problem.

                 Package: Open MPI abuild_at_build30 Distribution
                Open MPI: 1.3.2
   Open MPI SVN revision: r21054
   Open MPI release date: Apr 21, 2009
                Open RTE: 1.3.2
   Open RTE SVN revision: r21054
   Open RTE release date: Apr 21, 2009
                    OPAL: 1.3.2
       OPAL SVN revision: r21054
       OPAL release date: Apr 21, 2009
            Ident string: 1.3.2
                  Prefix: /usr/lib64/mpi/gcc/openmpi
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: build30
           Configured by: abuild
           Configured on: Fri Sep 23 05:58:54 UTC 2011
          Configure host: build30
                Built by: abuild
                Built on: Fri Sep 23 06:11:31 UTC 2011
              Built host: build30
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no (checkpoint thread: no)
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.2)
              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.2)
           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.2)
               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.2)
               MCA carto: file (MCA v2.0, API v2.0, Component v1.3.2)
           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.2)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.2)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.2)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.2)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.2)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.2)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.2)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.2)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.3.2)
               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.2)
               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.2)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.2)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.2)
                MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.2)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.2)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.2)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.2)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.2)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.2)
              MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.2)
              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ess: env (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.2)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.2)
             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.2)
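
When several machines have to be compared, the "Thread support" line can also be checked by a small program instead of by eye. What follows is only a minimal sketch in C++, assuming the line has exactly the format shown above for Open MPI 1.3.2; other releases may print this line differently. The names ThreadSupport, value_after and parse_thread_support are made up for this illustration and are not part of Open MPI.

```cpp
#include <sstream>
#include <string>

// Flags taken from a line such as
//   "Thread support: posix (mpi: no, progress: no)"
// (format assumed from the ompi_info output of Open MPI 1.3.2 above).
struct ThreadSupport {
    std::string library;            // threading library, e.g. "posix"
    bool mpi_threads = false;       // value after "mpi:"
    bool progress_threads = false;  // value after "progress:"
};

// Hypothetical helper: return the word that follows `key` in `line`,
// or an empty string if `key` does not occur.
static std::string value_after(const std::string& line, const std::string& key) {
    std::string::size_type pos = line.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    while (pos < line.size() && line[pos] == ' ') ++pos;
    std::string::size_type end = line.find_first_of(",) ", pos);
    return line.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
}

// Scan captured ompi_info output line by line. Returns true and fills
// `result` if a "Thread support:" line was found, false otherwise.
bool parse_thread_support(const std::string& ompi_info_output, ThreadSupport& result) {
    const std::string label = "Thread support:";
    std::istringstream lines(ompi_info_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::string::size_type pos = line.find(label);
        if (pos == std::string::npos) continue;
        const std::string rest = line.substr(pos + label.size());
        std::istringstream words(rest);
        words >> result.library;  // first word after the label
        result.mpi_threads = value_after(rest, "mpi:") == "yes";
        result.progress_threads = value_after(rest, "progress:") == "yes";
        return true;
    }
    return false;
}
```

To use it, capture the text printed by ompi_info on each machine (for example into a file), pass it to parse_thread_support, and compare the resulting flags. For the output above the result would be library "posix" with both flags false.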

I also asked the user to run the following variation of his original
command "mpirun -np 9 interFoam -parallel". I hoped to get some debug output
that would point me in the right direction. The new command did not work, and I am a bit lost.
Is the syntax wrong somehow, or is there a problem in the user's PATH?
I do not understand which debugger is wanted. Does mpirun not have an internal debugger?

testuser_at_caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> mpirun -v --debug --debug-daemons -np 9 interfoam -parallel
--------------------------------------------------------------------------
A suitable debugger could not be found in your PATH.
Check the values specified in the orte_base_user_debugger MCA parameter for the list of debuggers that was searched.

Regards

Andre
Tel. 05362-936222

-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: Tuesday, 17 January 2012 22:53
To: Open MPI Users
Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

You should probably also run the ompi_info command; it tells you details about your installation, and how it was configured.

Is it known that OpenFoam uses threads with MPI?

On Jan 17, 2012, at 9:08 AM, Ralph Castain wrote:

> You might first just try running a simple MPI "hello" to verify the installation. I don't know if OF is threaded or not.
>
> Sent from my iPad
>
> On Jan 17, 2012, at 5:22 AM, John Hearns <hearnsj_at_[hidden]> wrote:
>
>> Andre,
>> you should not need the OpenMPI sources.
>>
>> Install the openmpi-devel package from the same source
>> (zypper install openmpi-devel if you have that science repository enabled)
>> This will give you the mpi.h file and other include files, libraries
>> and manual pages.
>>
>> That is a convention in Suse-style distros - the devel package
>> contains the stuff you need to 'develop'
>>
>> On 17/01/2012, Theiner, Andre <andre.theiner_at_[hidden]> wrote:
>>> Hi Devendra,
>>> thanks for your interesting answer, up to now I expected to get a fully
>>> operational openmpi installation package
>>> by installing openmpi from the "science" repository (
>>> http://download.opensuse.org/repositories/science/openSUSE_11.3 ).
>>> To compile your script I need to have the openmpi sources which I do not
>>> have at present, I will try to get them.
>>> How do I compile and build using multiple processors?
>>> Is there a special flag which tells the compiler to care for multiple CPUs?
>>>
>>> Andre
>>>
>>>
>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
>>> Behalf Of devendra rai
>>> Sent: Monday, 16 January 2012 13:25
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs
>>>
>>> Hello Andre,
>>>
>>> It may be possible that your openmpi does not support threaded MPI-calls (if
>>> these are happening). I had a similar problem, and it was traced to this
>>> cause. If you installed your openmpi from available repositories, chances
>>> are that you do not have thread-support.
>>>
>>> Here's a small script that you can use to determine whether or not you have
>>> thread support:
>>>
>>> #include <mpi.h>
>>> #include <cstdlib>
>>> #include <iostream>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     /* request the highest thread level; MPI reports what it provides */
>>>     int desired_thread_support = MPI_THREAD_MULTIPLE;
>>>     int provided_thread_support;
>>>
>>>     MPI_Init_thread(&argc, &argv, desired_thread_support,
>>>                     &provided_thread_support);
>>>
>>>     /* check if the thread support has been provided */
>>>     if (provided_thread_support != desired_thread_support)
>>>     {
>>>         std::cout << "MPI thread support not available! Aborted."
>>>                   << std::endl;
>>>         MPI_Finalize();   /* shut MPI down cleanly before exiting */
>>>         return EXIT_FAILURE;
>>>     }
>>>     MPI_Finalize();
>>>     return 0;
>>> }
>>>
>>> Compile and build as usual, using multiple processors.
>>>
>>> Maybe this helps. If you do discover that you do not have support available,
>>> you will need to rebuild MPI with --enable-mpi-threads=yes flag.
>>>
>>> HTH.
>>>
>>>
>>> Devendra
>>>
>>> ________________________________
>>> From: "Theiner, Andre" <andre.theiner_at_[hidden]>
>>> To: "users_at_[hidden]" <users_at_[hidden]>
>>> Sent: Monday, 16 January 2012, 11:55
>>> Subject: [OMPI users] mpirun hangs when used on more than 2 CPUs
>>>
>>>
>>> Hi everyone,
>>> may I have your help on a strange problem?
>>> High performance computing is new to me and I do not know much about
>>> OpenMPI and OpenFoam (OF), which uses the "mpirun" command.
>>> I have to support the OF application in my company and have been trying to
>>> find the problem for about a week.
>>> The versions are openmpi-1.3.2 and OF 2.0.1 which are running on openSUSE
>>> 11.3 x86_64.
>>> The computer is brand new, has 96 GB RAM, 12 CPUs and was installed with
>>> Linux some weeks ago.
>>> I installed OF 2.0.1 according to the vendor's instructions at
>>> http://www.openfoam.org/archive/2.0.1/download/suse.php.
>>>
>>> Here is the problem:
>>> The experienced user tested OF with a test case from one of the
>>> vendor's tutorials.
>>> He only used the computing power of his local machine "caelde04"; no other
>>> computers were accessed by mpirun.
>>>
>>> He found no problem when testing in "single processor mode", but in
>>> "multiprocessor mode" his calculation hangs when he distributes
>>> it to more than 2 CPUs. The OF vendor thinks this is somehow an
>>> OpenMPI problem, and that is why I am trying to get
>>> help from this forum.
>>> I attached 2 files, one is the "decomposeParDict" which resides in the
>>> "system" subdirectory of his test case and the other is the log file
>>> from the "decomposePar" command and the mpirun command "mpirun -np 9
>>> interFoam -parallel".
>>> Do you have an idea where the problem is or how I can narrow it down?
>>> Thanks much for any help.
>>>
>>> Andre
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>

--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/