
Subject: [OMPI users] Fw: users Digest, Vol 1217, Issue 2, Message3
From: jan (jan_at_[hidden])
Date: 2009-05-04 04:34:26


Hi Jeff,

I have updated the firmware of the InfiniBand module on the Dell M600,
but the problem still occurs.
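
(To confirm that the new firmware is actually the version loaded, it can
be read back on each node; ibv_devinfo ships with OFED, though the exact
output layout varies by release:)

$ ibv_devinfo | grep fw_ver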

===========================================================================

$ mpirun -hostfile clusternode -np 16 --byslot --mca btl openib,sm,self $HOME/test/cpi
Process 1 on node1
Process 11 on node2
Process 8 on node2
Process 6 on node1
Process 4 on node1
Process 14 on node2
Process 3 on node1
Process 2 on node1
Process 9 on node2
Process 5 on node1
Process 0 on node1
Process 7 on node1
Process 10 on node2
Process 15 on node2
Process 13 on node2
Process 12 on node2
[node1][[3175,1],0][btl_openib_component.c:3029:poll_device] error polling HP CQ with -2 errno says Success
=============================================================================
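
(As a quick isolation test -- not a fix -- the same job can be forced off
the openib BTL. If the run below always succeeds, the failure is specific
to the InfiniBand path; this is a sketch reusing the hostfile from above:)

$ mpirun -hostfile clusternode -np 16 --byslot --mca btl tcp,sm,self $HOME/test/cpi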

Is this problem unsolvable?

Best Regards,

 Gloria Jan
Wavelink Technology Inc

>>> I can confirm that I have exactly the same problem, also on a Dell
>>> system, even with the latest Open MPI.
>>>
>>> Our system is:
>>>
>>> Dell M905
>>> OpenSUSE 11.1
>>> kernel: 2.6.27.21-0.1-default
>>> ofed-1.4-21.12 from SUSE repositories.
>>> OpenMPI-1.3.2
>>>
>>>
>>> I can also add that this does not affect only Open MPI. If these
>>> messages are triggered after mpirun:
>>>
>>> [node032][[9340,1],11][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>>>
>>> then the IB stack hangs. You cannot even reload it; the node has to
>>> be rebooted.
>>>
>>
>>
>> Something that severe should not be caused by Open MPI.
>> Specifically: Open MPI should not be able to hang the OFED stack.
>> Have you run layer 0 diagnostics to verify that your fabric is clean?
>> You might want to contact your IB vendor to find out how to do that.
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>
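
(For the layer 0 diagnostics mentioned above, OFED ships standard tools;
a typical first pass, assuming a default OFED install -- available flags
vary between OFED releases:)

$ ibstat        # local HCA and port state, including firmware version
$ ibv_devinfo   # device attributes as seen by libibverbs
$ ibdiagnet     # fabric-wide sweep that flags bad links and ports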
> On Apr 24, 2009, at 5:21 AM, jan wrote:
>
>> Dear Sir,
>>
>> I'm running a cluster with Open MPI.
>>
>> $ mpirun --mca mpi_show_mpi_alloc_mem_leaks 8 --mca mpi_show_handle_leaks 1 $HOME/test/cpi
>>
>> I got the following error message when the job failed:
>>
>> Process 15 on node2
>> Process 6 on node1
>> Process 14 on node2
>> ...
>> Process 0 on node1
>> Process 10 on node2
>> [node2][[9340,1],13][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],9][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],10][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],11][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],8][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],15][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],12][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> [node2][[9340,1],14][btl_openib_component.c:3002:poll_device] error polling HP CQ with -2 errno says Success
>> mpirun: killing job...
>>
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 28438 on node node1 exited on signal 0 (Unknown signal 0).
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>>
>> and got this message when the job succeeded:
>>
>> Process 1 on node1
>> Process 2 on node1
>> ...
>> Process 13 on node2
>> Process 14 on node2
>> --------------------------------------------------------------------------
>> The following memory locations were allocated via MPI_ALLOC_MEM but
>> not freed via MPI_FREE_MEM before invoking MPI_FINALIZE:
>>
>> Process ID: [[13692,1],12]
>> Hostname: node2
>> PID: 30183
>>
>> (null)
>> --------------------------------------------------------------------------
>> [node1:32276] 15 more processes have sent help message help-mpool-base.txt / all mem leaks
>> [node1:32276] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>
>>
>> It occurs periodically, i.e. twice success, then twice failure,
>> twice success, then twice failure, and so on. I downloaded
>> OFED-1.4.1-rc3 from the OpenFabrics Alliance and installed it on
>> Dell PowerEdge M600 blade servers. The InfiniBand mezzanine cards
>> are Mellanox ConnectX QDR & DDR, and the InfiniBand switch module
>> is a Mellanox M2401G. The OS is CentOS 5.3, kernel
>> 2.6.18-128.1.6.el5, with the PGI V7.2-5 compiler. OpenSM is
>> running as the subnet manager.
>>
>> Best Regards,
>>
>> Gloria Jan
>>
>> Wavelink Technology Inc.
>>
>> The output of the "ompi_info --all" command is:
>>
>> Package: Open MPI root_at_vortex Distribution
>> Open MPI: 1.3.1
>> Open MPI SVN revision: r20826
>> Open MPI release date: Mar 18, 2009
>> Open RTE: 1.3.1
>> Open RTE SVN revision: r20826
>> Open RTE release date: Mar 18, 2009
>> OPAL: 1.3.1
>> OPAL SVN revision: r20826
>> OPAL release date: Mar 18, 2009
>> Ident string: 1.3.1
>> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA carto: file (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: self (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA io: romio (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA pml: v (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA btl: self (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA odls: default (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ras: tm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA plm: tm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ess: env (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.1)
>> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.1)
>> Prefix: /usr/mpi/pgi/openmpi-1.3.1
>> Exec_prefix: /usr/mpi/pgi/openmpi-1.3.1
>> Bindir: /usr/mpi/pgi/openmpi-1.3.1/bin
>> Sbindir: /usr/mpi/pgi/openmpi-1.3.1/sbin
>> Libdir: /usr/mpi/pgi/openmpi-1.3.1/lib64
>> Incdir: /usr/mpi/pgi/openmpi-1.3.1/include
>> Mandir: /usr/mpi/pgi/openmpi-1.3.1/share/man
>> Pkglibdir: /usr/mpi/pgi/openmpi-1.3.1/lib64/openmpi
>> Libexecdir: /usr/mpi/pgi/openmpi-1.3.1/libexec
>> Datarootdir: /usr/mpi/pgi/openmpi-1.3.1/share
>> Datadir: /usr/mpi/pgi/openmpi-1.3.1/share
>> Sysconfdir: /usr/mpi/pgi/openmpi-1.3.1/etc
>> Sharedstatedir: /usr/mpi/pgi/openmpi-1.3.1/com
>> Localstatedir: /var
>> Infodir: /usr/share/info
>> Pkgdatadir: /usr/mpi/pgi/openmpi-1.3.1/share/openmpi
>> Pkglibdir: /usr/mpi/pgi/openmpi-1.3.1/lib64/openmpi
>> Pkgincludedir: /usr/mpi/pgi/openmpi-1.3.1/include/openmpi
>> Configured architecture: x86_64-redhat-linux-gnu
>> Configure host: vortex
>> Configured by: root
>> Configured on: Sun Apr 12 23:23:14 CST 2009
>> Configure host: vortex
>> Built by: root
>> Built on: Sun Apr 12 23:28:52 CST 2009
>> Built host: vortex
>> C bindings: yes
>> C++ bindings: yes
>> Fortran77 bindings: yes (all)
>> Fortran90 bindings: yes
>> Fortran90 bindings size: small
>> C compiler: pgcc
>> C compiler absolute: /opt/pgi/linux86-64/7.2-5/bin/pgcc
>> C char size: 1
>> C bool size: 1
>> C short size: 2
>> C int size: 4
>> C long size: 8
>> C float size: 4
>> C double size: 8
>> C pointer size: 8
>> C char align: 1
>> C bool align: 1
>> C int align: 4
>> C float align: 4
>> C double align: 8
>> C++ compiler: pgCC
>> C++ compiler absolute: /opt/pgi/linux86-64/7.2-5/bin/pgCC
>> Fortran77 compiler: pgf77
>> Fortran77 compiler abs: /opt/pgi/linux86-64/7.2-5/bin/pgf77
>> Fortran90 compiler: pgf90
>> Fortran90 compiler abs: /opt/pgi/linux86-64/7.2-5/bin/pgf90
>> Fort integer size: 4
>> Fort logical size: 4
>> Fort logical value true: -1
>> Fort have integer1: yes
>> Fort have integer2: yes
>> Fort have integer4: yes
>> Fort have integer8: yes
>> Fort have integer16: no
>> Fort have real4: yes
>> Fort have real8: yes
>> Fort have real16: no
>> Fort have complex8: yes
>> Fort have complex16: yes
>> Fort have complex32: no
>> Fort integer1 size: 1
>> Fort integer2 size: 2
>> Fort integer4 size: 4
>> Fort integer8 size: 8
>> Fort integer16 size: -1
>> Fort real size: 4
>> Fort real4 size: 4
>> Fort real8 size: 8
>> Fort real16 size: -1
>> Fort dbl prec size: 4
>> Fort cplx size: 4
>> Fort dbl cplx size: 4
>> Fort cplx8 size: 8
>> Fort cplx16 size: 16
>> Fort cplx32 size: -1
>> Fort integer align: 4
>> Fort integer1 align: 1
>> Fort integer2 align: 2
>> Fort integer4 align: 4
>> Fort integer8 align: 8
>> Fort integer16 align: -1
>> Fort real align: 4
>> Fort real4 align: 4
>> Fort real8 align: 8
>> Fort real16 align: -1
>> Fort dbl prec align: 4
>> Fort cplx align: 4
>> Fort dbl cplx align: 4
>> Fort cplx8 align: 4
>> Fort cplx16 align: 8
>> Fort cplx32 align: -1
>> C profiling: yes
>> C++ profiling: yes
>> Fortran77 profiling: yes
>> Fortran90 profiling: yes
>> C++ exceptions: no
>> Thread support: posix (mpi: no, progress: no)
>> Sparse Groups: no
>> Build CFLAGS: -O -DNDEBUG
>> Build CXXFLAGS: -O -DNDEBUG
>> Build FFLAGS:
>> Build FCFLAGS: -O2
>> Build LDFLAGS: -export-dynamic
>> Build LIBS: -lnsl -lutil -lpthread
>> Wrapper extra CFLAGS:
>> Wrapper extra CXXFLAGS: -fpic
>> Wrapper extra FFLAGS:
>> Wrapper extra FCFLAGS:
>> Wrapper extra LDFLAGS:
>> Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil
>> -lpthread -ldl
>> Internal debug support: no
>> MPI parameter check: runtime
>> Memory profiling support: no
>> Memory debugging support: no
>> libltdl support: yes
>> Heterogeneous support: no
>> mpirun default --prefix: yes
>> MPI I/O support: yes
>> MPI_WTIME support: gettimeofday
>> Symbol visibility support: no
>> FT Checkpoint support: no (checkpoint thread: no)
>> MCA mca: parameter "mca_param_files" (current
>> value: "/home/alpha/.openmpi/mca-params.conf:/usr/mpi/pgi/openmpi-1.3.1/etc/openmpi-mca-params.conf",
>> data source: default value)
>> Path for MCA configuration files
>> containing default parameter values
>> MCA mca: parameter
>> "mca_base_param_file_prefix" (current value: <none>, data source:
>> default value)
>> Aggregate MCA parameter file sets
>> MCA mca: parameter
>> "mca_base_param_file_path" (current value:
>> "/usr/mpi/pgi/openmpi-1.3.1/share/openmpi/amca-param-sets:/home/alpha",
>> data source: default value)
>> Aggregate MCA parameter Search path
>> MCA mca: parameter
>> "mca_base_param_file_path_force" (current value: <none>, data
>> source: default value)
>> Forced Aggregate MCA parameter Search path
>> MCA mca: parameter "mca_component_path" (current
>> value: "/usr/mpi/pgi/openmpi-1.3.1/lib64/openmpi:/home/alpha/.openmpi/components",
>> data source: default value)
>> Path where to look for Open MPI and ORTE
>> components
>> MCA mca: parameter "mca_verbose" (current value:
>> <none>, data source: default value)
>> Top-level verbosity parameter
>> MCA mca: parameter
>> "mca_component_show_load_errors" (current value: "1", data source:
>> default value)
>> Whether to show errors for components that
>> failed to load or not
>> MCA mca: parameter
>> "mca_component_disable_dlopen" (current value: "0", data source:
>> default value)
>> Whether to attempt to disable opening
>> dynamic components or not
>> MCA mpi: parameter "mpi_param_check" (current
>> value: "1", data source: default value)
>> Whether you want MPI API parameters
>> checked at run-time or not. Possible values are 0 (no
>> checking) and 1 (perform checking at run-time)
>> MCA mpi: parameter "mpi_yield_when_idle" (current
>> value: "-1", data source: default value)
>> Yield the processor when waiting for MPI
>> communication (for MPI processes, will default to 1 when
>> oversubscribing nodes)
>> MCA mpi: parameter "mpi_event_tick_rate" (current
>> value: "-1", data source: default value)
>> How often to progress TCP communications
>> (0 = never, otherwise specified in microseconds)
>> MCA mpi: parameter "mpi_show_handle_leaks" (current
>> value: "1", data source: environment)
>> Whether MPI_FINALIZE shows all MPI handles
>> that were not freed or not
>> MCA mpi: parameter "mpi_no_free_handles" (current
>> value: "0", data source: environment)
>> Whether to actually free MPI objects when
>> their handles are freed
>> MCA mpi: parameter
>> "mpi_show_mpi_alloc_mem_leaks" (current value: "8", data source:
>> environment)
>> If >0, MPI_FINALIZE will show up to this
>> many instances of memory allocated by MPI_ALLOC_MEM that was
>> not freed by MPI_FREE_MEM
>> MCA mpi: parameter "mpi_show_mca_params" (current
>> value: <none>, data source: default value)
>> Whether to show all MCA parameter values
>> during MPI_INIT or not (good for reproducibility of MPI jobs
>> for debug purposes). Accepted values are all, default, file,
>> api, and enviro - or a comma delimited combination of them
>> MCA mpi: parameter
>> "mpi_show_mca_params_file" (current value: <none>, data source:
>> default value)
>> If mpi_show_mca_params is true, setting
>> this string to a valid filename tells Open MPI to dump all
>> the MCA parameter values into a file suitable for reading via
>> the mca_param_files parameter (good for reproducibility of
>> MPI jobs)
>> MCA mpi: parameter
>> "mpi_keep_peer_hostnames" (current value: "1", data source: default
>> value)
>> If nonzero, save the string hostnames of
>> all MPI peer processes (mostly for error / debugging output
>> messages). This can add quite a bit of memory usage to each
>> MPI process.
>> MCA mpi: parameter "mpi_abort_delay" (current
>> value: "0", data source: default value)
>> If nonzero, print out an identifying
>> message when MPI_ABORT is invoked (hostname, PID of the
>> process that called MPI_ABORT) and delay for that many
>> seconds before exiting (a negative delay value means to never
>> abort). This allows attaching of a debugger before quitting
>> the job.
>> MCA mpi: parameter "mpi_abort_print_stack" (current
>> value: "0", data source: default value)
>> If nonzero, print out a stack trace when
>> MPI_ABORT is invoked
>> MCA mpi: parameter "mpi_preconnect_mpi" (current
>> value: "0", data source: default value, synonyms:
>> mpi_preconnect_all)
>> Whether to force MPI processes to fully
>> wire-up the MPI connections between MPI processes during
>> MPI_INIT (vs. making connections lazily -- upon the first
>> MPI traffic between each process peer pair)
>> MCA mpi: parameter "mpi_preconnect_all" (current
>> value: "0", data source: default value, deprecated,
>> synonym of: mpi_preconnect_mpi)
>> Whether to force MPI processes to fully
>> wire-up the MPI connections between MPI processes during
>> MPI_INIT (vs. making connections lazily -- upon the first
>> MPI traffic between each process peer pair)
>> MCA mpi: parameter "mpi_leave_pinned" (current
>> value: "0", data source: environment)
>> Whether to use the "leave pinned" protocol
>> or not. Enabling this setting can help bandwidth
>> performance when repeatedly sending and receiving large
>> messages with the same buffers over RDMA-based networks
>> (0 = do not use "leave pinned" protocol, 1 = use "leave
>> pinned" protocol, -1 = allow network to choose at runtime).
>> MCA mpi: parameter
>> "mpi_leave_pinned_pipeline" (current value: "0", data source:
>> default value)
>> Whether to use the "leave pinned pipeline"
>> protocol or not.
>> MCA mpi: parameter "mpi_paffinity_alone" (current
>> value: "0", data source: default value)
>> If nonzero, assume that this job is the
>> only (set of) process(es) running on each node and bind
>> processes to processors, starting with processor ID 0
>> MCA mpi: parameter "mpi_warn_on_fork" (current
>> value: "1", data source: default value)
>> If nonzero, issue a warning if program
>> forks under conditions that could cause system errors
>> MCA mpi: information
>> "mpi_have_sparse_group_storage" (value: "0", data source: default
>> value)
>> Whether this Open MPI installation
>> supports storing of data in MPI groups in "sparse" formats (good
>> for extremely large process count MPI jobs that create many
>> communicators/groups)
>> MCA mpi: parameter
>> "mpi_use_sparse_group_storage" (current value: "0", data source:
>> default value)
>> Whether to use "sparse" storage formats
>> for MPI groups (only relevant if mpi_have_sparse_group_storage is 1)
>> MCA orte: parameter
>> "orte_base_help_aggregate" (current value: "1", data source: default
>> value)
>> If orte_base_help_aggregate is true,
>> duplicate help messages will be aggregated rather than
>> displayed individually. This can be helpful for parallel
>> jobs that experience multiple identical failures; rather than
>> print out the same help/failure message N times, display it
>> once with a count of how many processes sent the same message.
>> MCA orte: parameter "orte_tmpdir_base" (current
>> value: <none>, data source: default value)
>> Base of the session directory tree
>> MCA orte: parameter "orte_no_session_dirs" (current
>> value: <none>, data source: default value)
>> Prohibited locations for session
>> directories (multiple locations separated by ',', default=NULL)
>> MCA orte: parameter "orte_debug" (current value:
>> "0", data source: default value)
>> Top-level ORTE debug switch (default
>> verbosity: 1)
>> MCA orte: parameter "orte_debug_verbose" (current
>> value: "-1", data source: default value)
>> Verbosity level for ORTE debug messages
>> (default: 1)
>> MCA orte: parameter "orte_debug_daemons" (current
>> value: "0", data source: default value)
>> Whether to debug the ORTE daemons or not
>> MCA orte: parameter
>> "orte_debug_daemons_file" (current value: "0", data source: default
>> value)
>> Whether want stdout/stderr of daemons to
>> go to a file or not
>> MCA orte: parameter
>> "orte_leave_session_attached" (current value: "0", data source:
>> default value)
>> Whether applications and/or daemons should
>> leave their sessions attached so that any output can be
>> received - this allows X forwarding without all the attendant
>> debugging output
>> MCA orte: parameter "orte_do_not_launch" (current
>> value: "0", data source: default value)
>> Perform all necessary operations to
>> prepare to launch the application, but do not actually launch it
>> MCA orte: parameter "orte_daemon_spin" (current
>> value: "0", data source: default value)
>> Have any orteds spin until we can connect
>> a debugger to them
>> MCA orte: parameter "orte_daemon_fail" (current
>> value: "-1", data source: default value)
>> Have the specified orted fail after init
>> for debugging purposes
>> MCA orte: parameter
>> "orte_daemon_fail_delay" (current value: "0", data source: default
>> value)
>> Have the specified orted fail after
>> specified number of seconds (default: 0 => no delay)
>> MCA orte: parameter "orte_heartbeat_rate" (current
>> value: "0", data source: default value)
>> Seconds between checks for daemon
>> state-of-health (default: 0 => do not check)
>> MCA orte: parameter "orte_startup_timeout" (current
>> value: "0", data source: default value)
>> Milliseconds/daemon to wait for startup
>> before declaring failed_to_start (default: 0 => do not check)
>> MCA orte: parameter "orte_timing" (current value:
>> "0", data source: default value)
>> Request that critical timing loops be
>> measured
>> MCA orte: parameter
>> "orte_base_user_debugger" (current value: "totalview @mpirun@ -a
>> @mpirun_args@ : ddt -n @np@ -start @executable@ @executable_argv@
>> @single_app@ : fxp @mpirun@ -a @mpirun_args@", data source:
>> default value)
>> Sequence of user-level debuggers to search
>> for in orterun
>> MCA orte: parameter "orte_abort_timeout" (current
>> value: "1", data source: default value)
>> Max time to wait [in secs] before aborting
>> an ORTE operation (default: 1sec)
>> MCA orte: parameter "orte_timeout_step" (current
>> value: "1000", data source: default value)
>> Time to wait [in usecs/proc] before
>> aborting an ORTE operation (default: 1000 usec/proc)
>> MCA orte: parameter "orte_default_hostfile" (current
>> value: <none>, data source: default value)
>> Name of the default hostfile (relative or
>> absolute path)
>> MCA orte: parameter
>> "orte_keep_fqdn_hostnames" (current value: "0", data source: default
>> value)
>> Whether or not to keep FQDN hostnames
>> [default: no]
>> MCA orte: parameter "orte_contiguous_nodes" (current
>> value: "2147483647", data source: default value)
>> Number of nodes after which contiguous
>> nodename encoding will automatically be used [default: INT_MAX]
>> MCA orte: parameter "orte_tag_output" (current
>> value: "0", data source: default value)
>> Tag all output with [job,rank] (default:
>> false)
>> MCA orte: parameter "orte_xml_output" (current
>> value: "0", data source: default value)
>> Display all output in XML format (default:
>> false)
>> MCA orte: parameter "orte_timestamp_output" (current
>> value: "0", data source: default value)
>> Timestamp all application process output
>> (default: false)
>> MCA orte: parameter "orte_output_filename" (current
>> value: <none>, data source: default value)
>> Redirect output from application processes
>> into filename.rank [default: NULL]
>> MCA orte: parameter
>> "orte_show_resolved_nodenames" (current value: "0", data source:
>> default value)
>> Display any node names that are resolved
>> to a different name (default: false)
>> MCA orte: parameter "orte_hetero_apps" (current
>> value: "0", data source: default value)
>> Indicates that multiple app_contexts are
>> being provided that are a mix of 32/64 bit binaries (default: false)
>> MCA orte: parameter "orte_launch_agent" (current
>> value: "orted", data source: default value)
>> Command used to start processes on remote
>> nodes (default: orted)
>> MCA orte: parameter
>> "orte_allocation_required" (current value: "0", data source: default
>> value)
>> Whether or not an allocation by a resource
>> manager is required [default: no]
>> MCA orte: parameter "orte_xterm" (current value:
>> <none>, data source: default value)
>> Create a new xterm window and display
>> output from the specified ranks there [default: none]
>> MCA orte: parameter
>> "orte_forward_job_control" (current value: "0", data source: default
>> value)
>> Forward SIGTSTP (after converting to
>> SIGSTOP) and SIGCONT signals to the application procs [default: no]
>> MCA opal: parameter "opal_signal" (current value:
>> "6,7,8,11", data source: default value)
>> If a signal is received, display the stack
>> trace frame
>> MCA opal: parameter
>> "opal_set_max_sys_limits" (current value: "0", data source: default
>> value)
>> Set to non-zero to automatically set any
>> system-imposed limits to the maximum allowed
>> MCA opal: parameter "opal_event_include" (current
>> value: "poll", data source: default value)
>> ... ... ...
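
(The ompi_info dump is truncated here. Rather than --all, single
parameter blocks like the ones above can be queried per framework and
component, which keeps the output manageable; with the 1.3 series the
usual syntax is:)

$ ompi_info --param btl openib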