
Open MPI Development Mailing List Archives


From: Matt Leininger (mlleinin_at_[hidden])
Date: 2006-11-27 23:42:37


On Mon, 2006-11-27 at 21:11 -0500, George Bosilca wrote:
> Which version of Open MPI are you using? We can figure out what's
> wrong if we have the output of "ompi_info" and "ompi_info --param all
> all".

zeus287_at_mlleinin:./ompi_info
                Open MPI: 1.2b1svn11252006
   Open MPI SVN revision: svn11252006
                Open RTE: 1.2b1svn11252006
   Open RTE SVN revision: svn11252006
                    OPAL: 1.2b1svn11252006
       OPAL SVN revision: svn11252006
                  Prefix: /g/g12/mlleinin/src/ompi-v1.2b-112506-gcc
 Configured architecture: x86_64-unknown-linux-gnu
           Configured by: mlleinin
           Configured on: Sat Nov 25 19:17:16 PST 2006
          Configure host: rhea32
                Built by: mlleinin
                Built on: Sat Nov 25 19:27:46 PST 2006
              Built host: rhea32
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/local/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/local/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
  Internal debug support: yes
     MPI parameter check: runtime
Memory profiling support: yes
Memory debugging support: yes
         libltdl support: yes
 mpirun default --prefix: no
           MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2)
              MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2)
           MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2)
           MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2)
               MCA timer: linux (MCA v1.0, API v1.0, Component v1.2)
           MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
           MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                MCA coll: basic (MCA v1.0, API v1.0, Component v1.2)
                MCA coll: self (MCA v1.0, API v1.0, Component v1.2)
                MCA coll: sm (MCA v1.0, API v1.0, Component v1.2)
                MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2)
                  MCA io: romio (MCA v1.0, API v1.0, Component v1.2)
               MCA mpool: openib (MCA v1.0, API v1.0, Component v1.2)
               MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2)
               MCA mpool: udapl (MCA v1.0, API v1.0, Component v1.2)
                 MCA pml: cm (MCA v1.0, API v1.0, Component v1.2)
                 MCA pml: dr (MCA v1.0, API v1.0, Component v1.2)
                 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2)
                 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2)
              MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2)
              MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2)
                 MCA btl: openib (MCA v1.0, API v1.0.1, Component v1.2)
                 MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2)
                 MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2)
                 MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
                 MCA btl: udapl (MCA v1.0, API v1.0, Component v1.2)
                MCA topo: unity (MCA v1.0, API v1.0, Component v1.2)
                 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2)
                 MCA osc: rdma (MCA v1.0, API v1.0, Component v1.2)
              MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2)
              MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2)
              MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2)
                 MCA gpr: null (MCA v1.0, API v1.0, Component v1.2)
                 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2)
                 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2)
                 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2)
                 MCA iof: svc (MCA v1.0, API v1.0, Component v1.2)
                  MCA ns: proxy (MCA v1.0, API v1.0, Component v1.2)
                  MCA ns: replica (MCA v1.0, API v1.0, Component v1.2)
                 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                 MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2)
                  MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2)
                 MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2)
                 MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2)
                 MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2)
                 MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2)
                 MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2)
               MCA rmaps: proxy (MCA v1.0, API v1.3, Component v1.2)
                MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2)
                MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2)
                MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2)
                 MCA rml: oob (MCA v1.0, API v1.0, Component v1.2)
                  MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2)
                 MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2)
                 MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2)
                 MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2)
                 MCA sds: env (MCA v1.0, API v1.0, Component v1.2)
                 MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2)
                 MCA sds: seed (MCA v1.0, API v1.0, Component v1.2)
                 MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2)
                 MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2)

>
> I wonder if some of this memory usage is related to the size of the
> shared memory file. The size of the shared memory file is computed
> from the MCA parameter mpool_sm_per_peer_size, which defaults to
> 128 MB for each local peer. Therefore running 2048 procs on 256 nodes
> means 8 procs per node, i.e. at least 1 GB just for the SM file. The
> problem right now with the SM file is that we're not reusing the
> buffers; instead we use a new fragment each time we send a message,
> which eventually forces the OS to map the entire file.
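
  A quick sanity check on those numbers: 128 MB per local peer x 8 procs per
node works out to ~1 GB of SM file per node, and if each process eventually
maps the whole file, that would account for a good chunk of the 1+ GB per task
of non-application memory I'm seeing.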

  I'll try playing with the mpool_sm_per_peer_size parameter.
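
  Something along these lines is what I have in mind -- just a rough sketch,
and I'm assuming the value is given in bytes (I'll confirm the units and the
current default with ompi_info first; the exact mpiBench invocation below is
schematic):

    # check the current sm mpool defaults and parameter descriptions
    ompi_info --param mpool sm

    # rerun with a smaller per-peer size, e.g. 32 MB instead of 128 MB
    mpirun -np 2048 --mca mpool_sm_per_peer_size 33554432 ./mpiBench ...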
 
 Thanks,

        - Matt

>
> george.
>
> On Nov 27, 2006, at 8:21 PM, Matt Leininger wrote:
>
> > On Mon, 2006-11-27 at 16:45 -0800, Matt Leininger wrote:
> >> Has anyone tested OMPI's alltoall at > 2000 MPI tasks? I'm seeing
> >> each MPI task eat up > 1 GB of memory (just for OMPI - not the app).
> >
> > I gathered some more data using the alltoall benchmark in mpiBench.
> > mpiBench is pretty smart about how large its buffers are. I set it to
> > use <= 100MB.
> >
> > num nodes   num MPI tasks   system mem   mpiBench buffer mem
> >    128          1024          1 GB            65 MB
> >    160          1280          1.2 GB          82 MB
> >    192          1536          1.4 GB          98 MB
> >    224          1792          1.6 GB          57 MB
> >    256          2048          1.6-1.8 GB    < 100 MB
> >
> > The 256-node run was killed by the OOM killer for using too much
> > memory. For all of these tests the OMPI alltoall is using 1 GB or
> > more of system memory. I know LANL is looking into an optimized
> > alltoall, but is anyone looking into the scalability of the memory
> > footprint?
> >
> > Thanks,
> >
> > - Matt
> >
> >>
> >> Thanks,
> >>
> >> - Matt
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>