Open MPI User's Mailing List Archives

From: Frank Gruellich (frank.gruellich_at_[hidden])
Date: 2006-07-20 02:39:12


Hi,

Graham E Fagg wrote:
> I am not sure which alltoall you're using in 1.1, so can you please run
> the ompi_info utility which is normally built and put into the same
> directory as mpirun?
>
> i.e. host% ompi_info
>
> This provides lots of really useful info on everything before we dig
> deeper into your issue.
>
>
> and then more specifically run
> host% ompi_info --param coll all

Please find attached ~/notes, generated with

 $ ( ompi_info; echo '====================='; ompi_info --param coll all ) >~/notes

Thanks in advance and kind regards,

-- 
Frank Gruellich
HPC-Techniker
Tel.:   +49 3722 528 42
Fax:    +49 3722 528 15
E-Mail: frank.gruellich_at_[hidden]
MEGWARE Computer GmbH
Vertrieb und Service
Nordstrasse 19
09247 Chemnitz/Roehrsdorf
Germany
http://www.megware.com/

                Open MPI: 1.1b1
   Open MPI SVN revision: r10217
                Open RTE: 1.1b1
   Open RTE SVN revision: r10217
                    OPAL: 1.1b1
       OPAL SVN revision: r10217
                  Prefix: /usr/ofed/mpi/intel/openmpi-1.1b1-1
 Configured architecture: x86_64-suse-linux-gnu
           Configured by: root
           Configured on: Wed Jul 19 20:51:46 CEST 2006
          Configure host: frontend
                Built by: root
                Built on: Wed Jul 19 21:04:47 CEST 2006
              Built host: frontend
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: icc
     C compiler absolute: /software/intel/cce/9.1.038/bin/icc
            C++ compiler: icpc
   C++ compiler absolute: /software/intel/cce/9.1.038/bin/icpc
      Fortran77 compiler: ifort
  Fortran77 compiler abs: /software/intel/fce/9.1.032/bin/ifort
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
              MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
           MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
           MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
           MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1)
               MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
           MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
           MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
                  MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
               MCA mpool: openib (MCA v1.0, API v1.0, Component v1.1)
               MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
                 MCA pml: dr (MCA v1.0, API v1.0, Component v1.1)
                 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
                 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
              MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: openib (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
                MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
                 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
                 MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
                 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
                 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
                 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
                 MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
                  MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
                  MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
                 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                 MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
                 MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
                 MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
               MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
                MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
                MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
                 MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: slurm (MCA v1.0, API v1.0, Component v1.1)
=====================
                MCA coll: parameter "coll" (current value: <none>)
                          Default selection set of components for the coll framework (<none> means "use all components that can be found")
                MCA coll: parameter "coll_base_verbose" (current value: "0")
                          Verbosity level for the coll framework (0 = no verbosity)
                MCA coll: parameter "coll_basic_priority" (current value: "10")
                          Priority of the basic coll component
                MCA coll: parameter "coll_basic_crossover" (current value: "4")
                          Minimum number of processes in a communicator before using the logarithmic algorithms
                MCA coll: parameter "coll_hierarch_priority" (current value: "0")
                          Priority of the hierarchical coll component
                MCA coll: parameter "coll_hierarch_verbose" (current value: "0")
                          Turn verbose message of the hierarchical coll component on/off
                MCA coll: parameter "coll_hierarch_use_rdma" (current value: "0")
                          Switch from the send btl list used to detect hierarchies to the rdma btl list
                MCA coll: parameter "coll_hierarch_ignore_sm" (current value: "0")
                          Ignore sm protocol when detecting hierarchies. Required to enable the usage of protocol specific collective operations
                MCA coll: parameter "coll_hierarch_symmetric" (current value: "0")
                          Assume symmetric configuration
                MCA coll: parameter "coll_self_priority" (current value: "75")
                MCA coll: parameter "coll_sm_priority" (current value: "0")
                          Priority of the sm coll component
                MCA coll: parameter "coll_sm_control_size" (current value: "4096")
                          Length of the control data -- should usually be either the length of a cache line on most SMPs, or the size of a page on machines that support direct memory affinity page placement (in bytes)
                MCA coll: parameter "coll_sm_bootstrap_filename" (current value: "coll-sm-bootstrap")
                          Filename (in the Open MPI session directory) of the coll sm component bootstrap rendezvous mmap file
                MCA coll: parameter "coll_sm_bootstrap_num_segments" (current value: "8")
                          Number of segments in the bootstrap file
                MCA coll: parameter "coll_sm_fragment_size" (current value: "8192")
                          Fragment size (in bytes) used for passing data through shared memory (will be rounded up to the nearest control_size size)
                MCA coll: parameter "coll_sm_mpool" (current value: "sm")
                          Name of the mpool component to use
                MCA coll: parameter "coll_sm_comm_in_use_flags" (current value: "2")
                          Number of "in use" flags, used to mark a message passing area segment as currently being used or not (must be >= 2 and <= comm_num_segments)
                MCA coll: parameter "coll_sm_comm_num_segments" (current value: "8")
                          Number of segments in each communicator's shared memory message passing area (must be >= 2, and must be a multiple of comm_in_use_flags)
                MCA coll: parameter "coll_sm_tree_degree" (current value: "4")
                          Degree of the tree for tree-based operations (must be => 1 and <= min(control_size, 255))
                MCA coll: information "coll_sm_shared_mem_used_bootstrap" (value: "216")
                          Amount of shared memory used in the shared memory bootstrap area (in bytes)
                MCA coll: parameter "coll_sm_info_num_procs" (current value: "4")
                          Number of processes to use for the calculation of the shared_mem_size MCA information parameter (must be => 2)
                MCA coll: information "coll_sm_shared_mem_used_data" (value: "548864")
                          Amount of shared memory used in the shared memory data area for info_num_procs processes (in bytes)
                MCA coll: parameter "coll_tuned_priority" (current value: "30")
                          Priority of the tuned coll component
                MCA coll: parameter "coll_tuned_pre_allocate_memory_comm_size_limit" (current value: "32768")
                          Size of communicator were we stop pre-allocating memory for the fixed internal buffer used for message requests etc that is hung off the communicator data segment. I.e. if you have a 100'000 nodes you might not want to pre-allocate 200'000 request handle slots per communicator instance!
                MCA coll: parameter "coll_tuned_use_dynamic_rules" (current value: "0")
                          Switch used to decide if we use static (if statements) or dynamic (built at runtime) decision function rules
                MCA coll: parameter "coll_tuned_init_tree_fanout" (current value: "4")
                          Inital fanout used in the tree topologies for each communicator. This is only an initial guess, if a tuned collective needs a different fanout for an operation, it build it dynamically. This parameter is only for the first guess and might save a little time
                MCA coll: parameter "coll_tuned_init_chain_fanout" (current value: "4")
                          Inital fanout used in the chain (fanout followed by pipeline) topologies for each communicator. This is only an initial guess, if a tuned collective needs a different fanout for an operation, it build it dynamically. This parameter is only for the first guess and might save a little time
                MCA coll: parameter "coll_tuned_allreduce_algorithm" (current value: "0")
                          Which allreduce algorithm is used. Can be locked down to choice of: 0 ignore, 1 basic linear, 2 nonoverlapping (tuned reduce + tuned bcast)
                MCA coll: parameter "coll_tuned_allreduce_algorithm_segmentsize" (current value: "0")
                          Segment size in bytes used by default for allreduce algorithms. Only has meaning if algorithm is forced and supports segmenting. 0 bytes means no segmentation.
                MCA coll: parameter "coll_tuned_allreduce_algorithm_tree_fanout" (current value: "4")
                          Fanout for n-tree used for allreduce algorithms. Only has meaning if algorithm is forced and supports n-tree topo based operation.
                MCA coll: parameter "coll_tuned_allreduce_algorithm_chain_fanout" (current value: "4")
                          Fanout for chains used for allreduce algorithms. Only has meaning if algorithm is forced and supports chain topo based operation.
                MCA coll: parameter "coll_tuned_alltoall_algorithm" (current value: "0")
                          Which alltoall algorithm is used. Can be locked down to choice of: 0 ignore, 1 basic linear, 2 pairwise, 3: modified bruck, 4: two proc only.
                MCA coll: parameter "coll_tuned_alltoall_algorithm_segmentsize" (current value: "0")
                          Segment size in bytes used by default for alltoall algorithms. Only has meaning if algorithm is forced and supports segmenting. 0 bytes means no segmentation.
                MCA coll: parameter "coll_tuned_alltoall_algorithm_tree_fanout" (current value: "4")
                          Fanout for n-tree used for alltoall algorithms. Only has meaning if algorithm is forced and supports n-tree topo based operation.
                MCA coll: parameter "coll_tuned_alltoall_algorithm_chain_fanout" (current value: "4")
                          Fanout for chains used for alltoall algorithms. Only has meaning if algorithm is forced and supports chain topo based operation.
                MCA coll: parameter "coll_tuned_barrier_algorithm" (current value: "0")
                          Which barrier algorithm is used. Can be locked down to choice of: 0 ignore, 1 linear, 2 double ring, 3: recursive doubling 4: bruck, 5: two proc only, 6: step based bmtree
                MCA coll: parameter "coll_tuned_bcast_algorithm" (current value: "0")
                          Which bcast algorithm is used. Can be locked down to choice of: 0 ignore, 1 basic linear, 2 chain, 3: pipeline, 4: split binary tree, 5: binary tree, 6: BM tree.
                MCA coll: parameter "coll_tuned_bcast_algorithm_segmentsize" (current value: "0")
                          Segment size in bytes used by default for bcast algorithms. Only has meaning if algorithm is forced and supports segmenting. 0 bytes means no segmentation.
                MCA coll: parameter "coll_tuned_bcast_algorithm_tree_fanout" (current value: "4")
                          Fanout for n-tree used for bcast algorithms. Only has meaning if algorithm is forced and supports n-tree topo based operation.
                MCA coll: parameter "coll_tuned_bcast_algorithm_chain_fanout" (current value: "4")
                          Fanout for chains used for bcast algorithms. Only has meaning if algorithm is forced and supports chain topo based operation.
                MCA coll: parameter "coll_tuned_reduce_algorithm" (current value: "0")
                          Which reduce algorithm is used. Can be locked down to choice of: 0 ignore, 1 linear, 2 chain, 3 pipeline
                MCA coll: parameter "coll_tuned_reduce_algorithm_segmentsize" (current value: "0")
                          Segment size in bytes used by default for reduce algorithms. Only has meaning if algorithm is forced and supports segmenting. 0 bytes means no segmentation.
                MCA coll: parameter "coll_tuned_reduce_algorithm_tree_fanout" (current value: "4")
                          Fanout for n-tree used for reduce algorithms. Only has meaning if algorithm is forced and supports n-tree topo based operation.
                MCA coll: parameter "coll_tuned_reduce_algorithm_chain_fanout" (current value: "4")
                          Fanout for chains used for reduce algorithms. Only has meaning if algorithm is forced and supports chain topo based operation.
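
To pull individual values out of a dump like the one above (for instance the alltoall settings asked about in this thread), the following is a minimal Python sketch. It assumes the line layout shown in this output for Open MPI 1.1b1; other versions may format parameters differently. The names `PARAM_RE`, `parse_params` and `SAMPLE` are illustrative only and are not part of Open MPI.

```python
import re

# Matches the parameter/information lines of `ompi_info --param coll all`, e.g.
#   MCA coll: parameter "coll_tuned_priority" (current value: "30")
#   MCA coll: information "coll_sm_shared_mem_used_bootstrap" (value: "216")
#   MCA coll: parameter "coll" (current value: <none>)
# Assumption: this layout is taken from the 1.1b1 output above.
PARAM_RE = re.compile(
    r'^\s*MCA\s+(?P<framework>\w+):\s+(?:parameter|information)\s+'
    r'"(?P<name>[^"]+)"\s+'
    r'\((?:current )?value:\s+(?P<value>"[^"]*"|<none>)\)\s*$'
)


def parse_params(text):
    """Return {name: value} for each parameter/information line in text.

    Values are kept as strings; <none> becomes None. Description lines and
    component listings do not match the pattern and are skipped.
    """
    params = {}
    for line in text.splitlines():
        match = PARAM_RE.match(line)
        if match:
            value = match.group("value")
            params[match.group("name")] = (
                None if value == "<none>" else value.strip('"')
            )
    return params


# A few lines copied from the output above, used as sample input.
SAMPLE = '''\
                MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: parameter "coll" (current value: <none>)
                MCA coll: information "coll_sm_shared_mem_used_bootstrap" (value: "216")
                          Amount of shared memory used in the shared memory bootstrap area (in bytes)
                MCA coll: parameter "coll_tuned_priority" (current value: "30")
                          Priority of the tuned coll component
                MCA coll: parameter "coll_tuned_alltoall_algorithm" (current value: "0")
                          Which alltoall algorithm is used. Can be locked down to choice of: 0 ignore, 1 basic linear, 2 pairwise, 3: modified bruck, 4: two proc only.
'''

if __name__ == "__main__":
    params = parse_params(SAMPLE)
    # Show only the alltoall-related settings.
    for name, value in sorted(params.items()):
        if "alltoall" in name:
            print(name, "=", value)
```

To run this against the real file, read `~/notes` into a string and pass it to `parse_params` in place of `SAMPLE`.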