Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-09-17 08:08:52


It sounds like we have a missed corner case where the OMPI run-time is
not cleaning up properly. I know one case like this came up recently (if
an app calls exit() without calling MPI_FINALIZE, OMPI v1.2.x hangs) and
Ralph is working on it.

This could well be what is happening here...?

Do you know how your process is exiting? If a process dies via a
signal, OMPI *should* see that and clean up the whole job properly.

On Sep 12, 2007, at 10:50 PM, Daniel Rozenbaum wrote:

> Hello,
>
> I'm working on an MPI application for which I recently started
> using Open MPI instead of LAM/MPI. With both Open MPI and LAM/MPI
> it mostly runs OK, but there are a number of cases in which the
> application terminates abnormally when using LAM/MPI, and hangs
> when using Open MPI. I haven't been able to reduce the example
> that reproduces the problem, so every time it takes about an hour of
> running time before the application hangs. It hangs right before
> it's supposed to end properly. The master and all the slave
> processes show up in "top" consuming 100% CPU. The application
> just hangs there like that until I interrupt it.
>
> Here's the command line:
>
> orterun --prefix /path/to/openmpi -mca btl tcp,self -x PATH -x LD_LIBRARY_PATH --hostfile hostfile1 /path/to/app_executable <app params>
>
> hostfile1:
>
> host1 slots=3
> host2 slots=4
> host3 slots=4
> host4 slots=4
> host5 slots=4
> host6 slots=4
> host7 slots=4
> host8 slots=4
> host9 slots=4
> host10 slots=4
> host11 slots=4
> host12 slots=4
> host13 slots=4
> host14 slots=4
>
> Each host is a dual-CPU dual-core Intel box running Red Hat
> Enterprise Server 4.
>
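As a side check on the hostfile above, here is a small Python sketch (not part of Open MPI) that totals the slots and shows which host a given rank would land on. It assumes that ranks are filled host by host in hostfile order (by-slot placement) and that, since no -np is given, one process is started per slot; both are assumptions about how this job was mapped. The names `parse_hostfile` and `host_of_rank` are made up for illustration.

```python
# Mirror of hostfile1 from the message: host1 has 3 slots, host2..host14 have 4.
HOSTFILE = "host1 slots=3\n" + "".join(
    f"host{i} slots=4\n" for i in range(2, 15)
)


def parse_hostfile(text):
    """Return [(hostname, slots), ...] from 'name slots=N' lines."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        slots = 1  # assumed default when no slots= field is present
        for field in fields[1:]:
            if field.startswith("slots="):
                slots = int(field.split("=", 1)[1])
        hosts.append((fields[0], slots))
    return hosts


def host_of_rank(hosts, rank):
    """Host for `rank` when slots are filled in hostfile order (by slot)."""
    first = 0  # first rank placed on the current host
    for name, slots in hosts:
        if rank < first + slots:
            return name
        first += slots
    raise ValueError("rank exceeds the total number of slots")


hosts = parse_hostfile(HOSTFILE)
total_slots = sum(slots for _, slots in hosts)

print(total_slots)              # 55
print(host_of_rank(hosts, 0))   # host1
print(host_of_rank(hosts, 29))  # host8
```

Under those assumptions, processes 0 and 29 sit on host1 and host8, which agrees with the [host1][0,1,0] and [host8][0,1,29] prefixes in the error messages quoted further down.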
>
> I caught the following error messages on app's stderr during the run:
>
> [host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
> [host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=113
> <later>
> [host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
>
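For reference, the two errno values in these messages can be decoded as in the sketch below. The table hard-codes the Linux numbering (110 = ETIMEDOUT, 113 = EHOSTUNREACH); errno numbers differ on other operating systems, so this only applies to Linux hosts like the ones described here. The helper name `decode_readv_errors` is hypothetical, and reading the third number in the bracketed triple as the process's rank within the job is an assumption.

```python
import re

# Linux errno numbers relevant to TCP reads (assumption: Linux hosts).
LINUX_ERRNO = {
    110: ("ETIMEDOUT", "Connection timed out"),
    111: ("ECONNREFUSED", "Connection refused"),
    113: ("EHOSTUNREACH", "No route to host"),
}

# The first two error lines from the message, each joined onto one line.
LOG = (
    "[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] "
    "mca_btl_tcp_frag_recv: readv failed with errno=110\n"
    "[host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] "
    "mca_btl_tcp_frag_recv: readv failed with errno=113\n"
)

# [host][a,b,c]... errno=N ; c is treated as the process's rank (assumption).
PATTERN = re.compile(
    r"^\[(?P<host>[^\]]+)\]\[\d+,\d+,(?P<vpid>\d+)\].*errno=(?P<errno>\d+)"
)


def decode_readv_errors(log):
    """Return (host, vpid, errno, symbolic name, description) per matching line."""
    results = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match is None:
            continue
        code = int(match.group("errno"))
        name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in table"))
        results.append(
            (match.group("host"), int(match.group("vpid")), code, name, text)
        )
    return results


for entry in decode_readv_errors(LOG):
    print(entry)
```

Both codes describe a problem on the TCP connection between nodes (a peer that stopped answering, or no route to it) rather than an error in the application's own logic, although that by itself does not explain the hang.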
>
> Excerpts from strace output, and ompi_info are attached below.
> Any advice would be greatly appreciated!
> Thanks in advance,
> Daniel
>
>
> strace on the orterun process:
>
> poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN},
> {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12,
> events=POLLIN}, {fd=13, events=POLLIN}, {fd=14, events=POLLIN},
> {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17,
> events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN},
> {fd=20, events=POLLIN}, {fd=0, events=POLLIN}, {fd=21,
> events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN},
> {fd=24, events=POLLIN}, {fd=25, events=POLLIN}, {fd=26,
> events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, events=POLLIN},
> {fd=29, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31,
> events=POLLIN}, {fd=34, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=32, events=POLLIN}, {fd=35, events=POLLIN}, ...], 71, 1000) = 0
> rt_sigprocmask(SIG_BLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGTERM, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGINT, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGUSR1, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGUSR2, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> sched_yield() = 0
> rt_sigprocmask(SIG_BLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGTERM, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGINT, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGUSR1, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigaction(SIGUSR2, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD],
> SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
> poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=5, events=POLL
>
>
>
> strace on the master process:
>
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
> events=POLLIN}, {fd=8, events=POLLIN}, {fd=14, events=POLLIN},
> {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13,
> events=POLLIN}, {fd=16, events=POLLIN}, {fd=15, events=POLLIN},
> {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22,
> events=POLLIN}, {fd=23, events=POLLIN}, {fd=67, events=POLLIN},
> {fd=25, events=POLLIN}, {fd=66, events=POLLIN}, {fd=26,
> events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, events=POLLIN},
> {fd=29, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31,
> events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=34, events=POLLIN}, {fd=35, events=POLLIN}, {fd=36,
> events=POLLIN}, {fd=37, events=POLLIN}, {fd=38, events=POLLIN},
> {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, ...], 56, 0) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
> events=POLLIN}, {fd=8, events=POLLIN}, {fd=14, events=POLLIN},
> {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13,
> events=POLLIN}, {fd=16, events=POLLIN}, {fd=15, events=POLLIN},
> {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22,
> events=POLLIN}, {fd=23, events=POLLIN}, {fd=67, events=POLLIN},
> {fd=25, events=POLLIN}, {fd=66, events=POLLIN}, {fd=26,
> events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, events=POLLIN},
> {fd=29, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31,
> events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=34, events=POLLIN}, {fd=35, events=POLLIN}, {fd=36,
> events=POLLIN}, {fd=37, events=POLLIN}, {fd=38, events=POLLIN},
> {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, ...], 56, 0) = 0
>
>
>
> strace on one of the slaves:
>
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3c71c0c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3c71c0c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
> events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN},
> {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15,
> events=POLLIN}, {fd=14, events=POLLIN}, {fd=16, events=POLLIN},
> {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19,
> events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN},
> {fd=22, events=POLLIN}, {fd=23, events=POLLIN}], 17, 0) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3c71c0c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|
> SA_RESTART, 0x3c71c0c4f0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
> events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN},
> {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15,
> events=POLLIN}, {fd=14, events=POLLIN}, {fd=16, events=POLLIN},
> {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19,
> events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN},
> {fd=22, events=POLLIN}, {fd=23, events=POLLIN}], 17, 0) = 0
>
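In the master and slave traces above, poll() is called with a timeout of 0 and returns 0 each time: it comes back immediately with nothing ready and is then called again. That is consistent with the 100% CPU seen in "top". The orterun trace, by contrast, uses a timeout of 1000 ms. The sketch below only illustrates these poll() semantics using Python's standard `select.poll` on an idle pipe; it is not Open MPI code, and the 200 ms timeout is an arbitrary value chosen to keep the example quick.

```python
import os
import select
import time

# An idle pipe: nothing is ever written, so the read end never becomes ready.
read_fd, write_fd = os.pipe()
poller = select.poll()
poller.register(read_fd, select.POLLIN)

# Timeout 0: poll() returns at once, as in the master/slave traces.
start = time.monotonic()
ready_nonblocking = poller.poll(0)
elapsed_nonblocking = time.monotonic() - start

# Positive timeout (milliseconds): the caller sleeps in the kernel instead.
start = time.monotonic()
ready_blocking = poller.poll(200)
elapsed_blocking = time.monotonic() - start

print(ready_nonblocking, ready_blocking)       # no descriptors ready either way
print(elapsed_nonblocking < elapsed_blocking)  # the zero-timeout call is the fast one

os.close(read_fd)
os.close(write_fd)
```

A loop around the zero-timeout call never sleeps, so each process keeps a core busy while it waits for messages.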
>
>
>
> ompi_info --all:
>
>
> Open MPI: 1.2.3
> Open MPI SVN revision: r15136
> Open RTE: 1.2.3
> Open RTE SVN revision: r15136
> OPAL: 1.2.3
> OPAL SVN revision: r15136
> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.3)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA maffinity: libnuma (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.3)
> MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.3)
> MCA installdirs: config (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.3)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.3)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.3)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.3)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.2.3)
> MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.3)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.3)
> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.3)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.3)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.3)
> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.3)
> MCA btl: self (MCA v1.0, API v1.0.1, Component
> v1.2.3)
> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.3)
> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.3)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.3)
> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.3)
> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.3)
> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.3)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.3)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.3)
> MCA gpr: replica (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.3)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.3)
> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.3)
> MCA ns: replica (MCA v1.0, API v2.0, Component
> v1.2.3)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA ras: gridengine (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA ras: localhost (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.3)
> MCA rds: hostfile (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.3)
> MCA rds: resfile (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA rmaps: round_robin (MCA v1.0, API v1.3,
> Component v1.2.3)
> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.3)
> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.3)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.3)
> MCA pls: gridengine (MCA v1.0, API v1.3, Component
> v1.2.3)
> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.3)
> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.3)
> MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.3)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.3)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.3)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.3)
> MCA sds: singleton (MCA v1.0, API v1.0, Component
> v1.2.3)
> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.3)
> Prefix: /path/to/openmpi
> Bindir: /path/to/openmpi/bin
> Libdir: /path/to/openmpi/lib
> Incdir: /path/to/openmpi/include
> Pkglibdir: /path/to/openmpi/lib/openmpi
> Sysconfdir: /path/to/openmpi/etc
> Configured architecture: x86_64-unknown-linux-gnu
> Configured by: user1
> Configured on: Tue Sep 11 15:57:23 EDT 2007
> Configure host: host1.domain.com
> Built by: user1
> Built on: Tue Sep 11 16:09:44 EDT 2007
> Built host: host1.domain.com
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C char size: 1
> C bool size: 1
> C short size: 2
> C int size: 4
> C long size: 8
> C float size: 4
> C double size: 8
> C pointer size: 8
> C char align: 1
> C bool align: 1
> C int align: 4
> C float align: 4
> C double align: 8
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: g77
> Fortran77 compiler abs: /usr/bin/g77
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> Fort integer size: 4
> Fort logical size: 4
> Fort logical value true: 1
> Fort have integer1: yes
> Fort have integer2: yes
> Fort have integer4: yes
> Fort have integer8: yes
> Fort have integer16: no
> Fort have real4: yes
> Fort have real8: yes
> Fort have real16: no
> Fort have complex8: yes
> Fort have complex16: yes
> Fort have complex32: no
> Fort integer1 size: 1
> Fort integer2 size: 2
> Fort integer4 size: 4
> Fort integer8 size: 8
> Fort integer16 size: -1
> Fort real size: 4
> Fort real4 size: 4
> Fort real8 size: 8
> Fort real16 size: -1
> Fort dbl prec size: 4
> Fort cplx size: 4
> Fort dbl cplx size: 4
> Fort cplx8 size: 8
> Fort cplx16 size: 16
> Fort cplx32 size: -1
> Fort integer align: 4
> Fort integer1 align: 1
> Fort integer2 align: 2
> Fort integer4 align: 4
> Fort integer8 align: 8
> Fort integer16 align: -1
> Fort real align: 4
> Fort real4 align: 4
> Fort real8 align: 8
> Fort real16 align: -1
> Fort dbl prec align: 4
> Fort cplx align: 4
> Fort dbl cplx align: 4
> Fort cplx8 align: 4
> Fort cplx16 align: 8
> Fort cplx32 align: -1
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Build CFLAGS: -O3 -DNDEBUG -finline-functions
> -fno-strict-aliasing -pthread
> Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread
> Build FFLAGS:
> Build FCFLAGS:
> Build LDFLAGS: -export-dynamic
> Build LIBS: -lnsl -lutil -lm
> Wrapper extra CFLAGS: -pthread
> Wrapper extra CXXFLAGS: -pthread
> Wrapper extra FFLAGS: -pthread
> Wrapper extra FCFLAGS: -pthread
> Wrapper extra LDFLAGS:
> Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl
> -lutil -lm -ldl
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: yes
> mpirun default --prefix: no
> MCA mca: parameter "mca_param_files" (current
> value: "/home/user1/.openmpi/mca-params.conf:/path/to/openmpi/etc/
> openmpi-mca-params.conf")
> Path for MCA configuration files
> containing default parameter values
> MCA mca: parameter "mca_component_path" (current
> value: "/path/to/openmpi/lib/openmpi:/home/user1/.openmpi/components")
> Path where to look for Open MPI and ORTE
> components
> MCA mca: parameter "mca_verbose" (current value:
> <none>)
> Top-level verbosity parameter
> MCA mca: parameter
> "mca_component_show_load_errors" (current value: "1")
> Whether to show errors for components
> that failed to load or not
> MCA mca: parameter
> "mca_component_disable_dlopen" (current value: "0")
> Whether to attempt to disable opening
> dynamic components or not
> MCA mpi: parameter "mpi_param_check" (current
> value: "1")
> Whether you want MPI API parameters
> checked at run-time or not. Possible values are 0 (no checking)
> and 1 (perform checking at run-time)
> MCA mpi: parameter "mpi_yield_when_idle" (current
> value: "0")
> Yield the processor when waiting for MPI
> communication (for MPI processes, will default to 1 when
> oversubscribing nodes)
> MCA mpi: parameter "mpi_event_tick_rate" (current
> value: "-1")
> How often to progress TCP communications
> (0 = never, otherwise specified in microseconds)
> MCA mpi: parameter
> "mpi_show_handle_leaks" (current value: "0")
> Whether MPI_FINALIZE shows all MPI
> handles that were not freed or not
> MCA mpi: parameter "mpi_no_free_handles" (current
> value: "0")
> Whether to actually free MPI objects when
> their handles are freed
> MCA mpi: parameter "mpi_show_mca_params" (current
> value: "0")
> Whether to show all MCA parameter value
> during MPI_INIT or not (good for reproducability of MPI jobs)
> MCA mpi: parameter
> "mpi_show_mca_params_file" (current value: <none>)
> If mpi_show_mca_params is true, setting
> this string to a valid filename tells Open MPI to dump all the MCA
> parameter values into a file suitable for reading via the
> mca_param_files parameter (good for reproducability of MPI jobs)
> MCA mpi: parameter "mpi_paffinity_alone" (current
> value: "0")
> If nonzero, assume that this job is the
> only (set of) process(es) running on each node and bind processes
> to processors, starting with processor ID 0
> MCA mpi: parameter
> "mpi_keep_peer_hostnames" (current value: "1")
> If nonzero, save the string hostnames of
> all MPI peer processes (mostly for error / debugging output
> messages). This can add quite a bit of memory usage to each MPI
> process.
> MCA mpi: parameter "mpi_abort_delay" (current
> value: "0")
> If nonzero, print out an identifying
> message when MPI_ABORT is invoked (hostname, PID of the process
> that called MPI_ABORT) and delay for that many seconds before
> exiting (a negative delay value means to never abort). This allows
> attaching of a debugger before quitting the job.
> MCA mpi: parameter
> "mpi_abort_print_stack" (current value: "0")
> If nonzero, print out a stack trace when
> MPI_ABORT is invoked
> MCA mpi: parameter "mpi_preconnect_all" (current
> value: "0")
> Whether to force MPI processes to create
> connections / warmup with *all* peers during MPI_INIT (vs. making
> connections lazily -- upon the first MPI traffic between each
> process peer pair)
> MCA mpi: parameter "mpi_preconnect_oob" (current
> value: "0")
> Whether to force MPI processes to fully
> wire-up the OOB system between MPI processes.
> MCA mpi: parameter "mpi_leave_pinned" (current
> value: "0")
> Whether to use the "leave pinned"
> protocol or not. Enabling this setting can help bandwidth
> performance when repeatedly sending and receiving large messages
> with the same buffers over RDMA-based networks.
> MCA mpi: parameter
> "mpi_leave_pinned_pipeline" (current value: "0")
> Whether to use the "leave pinned
> pipeline" protocol or not.
> MCA orte: parameter "orte_debug" (current value: "0")
> Top-level ORTE debug switch
> MCA orte: parameter "orte_no_daemonize" (current
> value: "0")
> Whether to properly daemonize the ORTE
> daemons or not
> MCA orte: parameter
> "orte_base_user_debugger" (current value: "totalview @mpirun@ -a
> @mpirun_args@ : fxp @mpirun@ -a @mpirun_args@")
> Sequence of user-level debuggers to
> search for in orterun
> MCA orte: parameter "orte_abort_timeout" (current
> value: "10")
> Time to wait [in seconds] before giving
> up on aborting an ORTE operation
> MCA orte: parameter "orte_timing" (current value: "0")
> Request that critical timing loops be
> measured
> MCA opal: parameter "opal_signal" (current value:
> "6,7,8,11")
> If a signal is received, display the
> stack trace frame
> MCA backtrace: parameter "backtrace" (current value:
> <none>)
> Default selection set of components for
> the backtrace framework (<none> means "use all components that can
> be found")
> MCA backtrace: parameter
> "backtrace_base_verbose" (current value: "0")
> Verbosity level for the backtrace
> framework (0 = no verbosity)
> MCA backtrace: parameter
> "backtrace_execinfo_priority" (current value: "0")
> MCA memory: parameter "memory" (current value: <none>)
> Default selection set of components for
> the memory framework (<none> means "use all components that can be
> found")
> MCA memory: parameter "memory_base_verbose" (current
> value: "0")
> Verbosity level for the memory framework
> (0 = no verbosity)
> MCA memory: parameter
> "memory_ptmalloc2_priority" (current value: "0")
> MCA paffinity: parameter "paffinity" (current value:
> <none>)
> Default selection set of components for
> the paffinity framework (<none> means "use all components that can
> be found")
> MCA paffinity: parameter
> "paffinity_linux_priority" (current value: "10")
> Priority of the linux paffinity component
> MCA paffinity: information
> "paffinity_linux_have_cpu_set_t" (value: "1")
> Whether this component was compiled on a
> system with the type cpu_set_t or not (1 = yes, 0 = no)
> MCA paffinity: information
> "paffinity_linux_CPU_ZERO_ok" (value: "1")
> Whether this component was compiled on a
> system where CPU_ZERO() is functional or broken (1 = functional, 0
> = broken/not available)
> MCA paffinity: information
> "paffinity_linux_sched_setaffinity_num_params" (value: "3")
> The number of parameters that
> sched_set_affinity() takes on the machine where this component was
> compiled
> MCA maffinity: parameter "maffinity" (current value:
> <none>)
> Default selection set of components for
> the maffinity framework (<none> means "use all components that can
> be found")
> MCA maffinity: parameter
> "maffinity_first_use_priority" (current value: "10")
> Priority of the first_use maffinity
> component
> MCA maffinity: parameter
> "maffinity_libnuma_priority" (current value: "25")
> Priority of the libnuma maffinity component
> MCA timer: parameter "timer" (current value: <none>)
> Default selection set of components for
> the timer framework (<none> means "use all components that can be
> found")
> MCA timer: parameter "timer_base_verbose" (current
> value: "0")
> Verbosity level for the timer framework
> (0 = no verbosity)
> MCA timer: parameter "timer_linux_priority" (current
> value: "0")
> MCA allocator: parameter "allocator" (current value:
> <none>)
> Default selection set of components for
> the allocator framework (<none> means "use all components that can
> be found")
> MCA allocator: parameter
> "allocator_base_verbose" (current value: "0")
> Verbosity level for the allocator
> framework (0 = no verbosity)
> MCA allocator: parameter
> "allocator_basic_priority" (current value: "0")
> MCA allocator: parameter
> "allocator_bucket_num_buckets" (current value: "30")
> MCA allocator: parameter
> "allocator_bucket_priority" (current value: "0")
> MCA coll: parameter "coll" (current value: <none>)
> Default selection set of components for
> the coll framework (<none> means "use all components that can be
> found")
> MCA coll: parameter "coll_base_verbose" (current
> value: "0")
> Verbosity level for the coll framework (0
> = no verbosity)
> MCA coll: parameter "coll_basic_priority" (current
> value: "10")
> Priority of the basic coll component
> MCA coll: parameter "coll_basic_crossover" (current
> value: "4")
> Minimum number of processes in a
> communicator before using the logarithmic algorithms
> MCA coll: parameter "coll_self_priority" (current
> value: "75")
> MCA coll: parameter "coll_sm_priority" (current
> value: "0")
> Priority of the sm coll component
> MCA coll: parameter "coll_sm_control_size" (current
> value: "4096")
> Length of the control data -- should
> usually be either the length of a cache line on most SMPs, or the
> size of a page on machines that support direct memory affinity page
> placement (in bytes)
> MCA coll: parameter
> "coll_sm_bootstrap_filename" (current value:
> "shared_mem_sm_bootstrap")
> Filename (in the Open MPI session
> directory) of the coll sm component bootstrap rendezvous mmap file
> MCA coll: parameter
> "coll_sm_bootstrap_num_segments" (current value: "8")
> Number of segments in the bootstrap file
> MCA coll: parameter
> "coll_sm_fragment_size" (current value: "8192")
> Fragment size (in bytes) used for passing
> data through shared memory (will be rounded up to the nearest
> control_size size)
> MCA coll: parameter "coll_sm_mpool" (current value:
> "sm")
> Name of the mpool component to use
> MCA coll: parameter
> "coll_sm_comm_in_use_flags" (current value: "2")
> Number of "in use" flags, used to mark a
> message passing area segment as currently being used or not (must
> be >= 2 and <= comm_num_segments)
> MCA coll: parameter
> "coll_sm_comm_num_segments" (current value: "8")
> Number of segments in each communicator's
> shared memory message passing area (must be >= 2, and must be a
> multiple of comm_in_use_flags)
> MCA coll: parameter "coll_sm_tree_degree" (current
> value: "4")
> Degree of the tree for tree-based
> operations (must be => 1 and <= min(control_size, 255))
> MCA coll: information
> "coll_sm_shared_mem_used_bootstrap" (value: "216")
> Amount of shared memory used in the
> shared memory bootstrap area (in bytes)
> MCA coll: parameter
> "coll_sm_info_num_procs" (current value: "4")
> Number of processes to use for the
> calculation of the shared_mem_size MCA information parameter (must
> be => 2)
> MCA coll: information
> "coll_sm_shared_mem_used_data" (value: "548864")
> Amount of shared memory used in the
> shared memory data area for info_num_procs processes (in bytes)
> MCA coll: parameter "coll_tuned_priority" (current
> value: "30")
> Priority of the tuned coll component
> MCA coll: parameter
> "coll_tuned_pre_allocate_memory_comm_size_limit" (current value:
> "32768")
> Size of communicator were we stop pre-
> allocating memory for the fixed internal buffer used for message
> requests etc that is hung off the communicator data segment. I.e.
> if you have a 100'000 nodes you might not want to pre-allocate
> 200'000 request handle slots per communicator instance!
> MCA coll: parameter
> "coll_tuned_init_tree_fanout" (current value: "4")
> Inital fanout used in the tree topologies
> for each communicator. This is only an initial guess, if a tuned
> collective needs a different fanout for an operation, it build it
> dynamically. This parameter is only for the first guess and might
> save a little time
> MCA coll: parameter
> "coll_tuned_init_chain_fanout" (current value: "4")
> Inital fanout used in the chain (fanout
> followed by pipeline) topologies for each communicator. This is
> only an initial guess, if a tuned collective needs a different
> fanout for an operation, it build it dynamically. This parameter is
> only for the first guess and might save a little time
> MCA coll: parameter
> "coll_tuned_use_dynamic_rules" (current value: "0")
> Switch used to decide if we use static
> (compiled/if statements) or dynamic (built at runtime) decision
> function rules
> MCA io: parameter
> "io_base_freelist_initial_size" (current value: "16")
> Initial MPI-2 IO request freelist size
> MCA io: parameter
> "io_base_freelist_max_size" (current value: "64")
> Max size of the MPI-2 IO request freelist
> MCA io: parameter
> "io_base_freelist_increment" (current value: "16")
> Increment size of the MPI-2 IO request
> freelist
> MCA io: parameter "io" (current value: <none>)
> Default selection set of components for
> the io framework (<none> means "use all components that can be found")
> MCA io: parameter "io_base_verbose" (current
> value: "0")
> Verbosity level for the io framework (0 =
> no verbosity)
> MCA io: parameter "io_romio_priority" (current
> value: "10")
> Priority of the io romio component
> MCA io: parameter
> "io_romio_delete_priority" (current value: "10")
> Delete priority of the io romio component
> MCA io: parameter
> "io_romio_enable_parallel_optimizations" (current value: "0")
> Enable set of Open MPI-added options to
> improve collective file i/o performance
> MCA mpool: parameter "mpool" (current value: <none>)
> Default selection set of components for
> the mpool framework (<none> means "use all components that can be
> found")
> MCA mpool: parameter "mpool_base_verbose" (current
> value: "0")
> Verbosity level for the mpool framework
> (0 = no verbosity)
> MCA mpool: parameter
> "mpool_rdma_rcache_name" (current value: "vma")
> The name of the registration cache the
> mpool should use
> MCA mpool: parameter
> "mpool_rdma_rcache_size_limit" (current value: "0")
> the maximum size of registration cache in
> bytes. 0 is unlimited (default 0)
> MCA mpool: parameter
> "mpool_rdma_print_stats" (current value: "0")
> print pool usage statistics at the end of
> the run
> MCA mpool: parameter "mpool_rdma_priority" (current
> value: "0")
> MCA mpool: parameter "mpool_sm_allocator" (current
> value: "bucket")
> Name of allocator component to use with
> sm mpool
> MCA mpool: parameter "mpool_sm_max_size" (current
> value: "536870912")
> Maximum size of the sm mpool shared
> memory file
> MCA mpool: parameter "mpool_sm_min_size" (current
> value: "134217728")
> Minimum size of the sm mpool shared
> memory file
> MCA mpool: parameter
> "mpool_sm_per_peer_size" (current value: "33554432")
> Size (in bytes) to allocate per local
> peer in the sm mpool shared memory file, bounded by min_size and
> max_size
> MCA mpool: parameter "mpool_sm_priority" (current
> value: "0")
> MCA mpool: parameter
> "mpool_base_use_mem_hooks" (current value: "0")
> use memory hooks for deregistering freed
> memory
> MCA mpool: parameter "mpool_use_mem_hooks" (current
> value: "0")
> (deprecated, use mpool_base_use_mem_hooks)
> MCA mpool: parameter
> "mpool_base_disable_sbrk" (current value: "0")
> use mallopt to override calling sbrk
> (doesn't return memory to OS!)
> MCA mpool: parameter "mpool_disable_sbrk" (current
> value: "0")
> (deprecated, use
> mca_mpool_base_disable_sbrk)
> MCA pml: parameter "pml" (current value: <none>)
> Default selection set of components for
> the pml framework (<none> means "use all components that can be
> found")
> MCA pml: parameter "pml_base_verbose" (current
> value: "0")
> Verbosity level for the pml framework (0
> = no verbosity)
> MCA pml: parameter "pml_cm_free_list_num" (current
> value: "4")
> Initial size of request free lists
> MCA pml: parameter "pml_cm_free_list_max" (current
> value: "-1")
> Maximum size of request free lists
> MCA pml: parameter "pml_cm_free_list_inc" (current
> value: "64")
> Number of elements to add when growing
> request free lists
> MCA pml: parameter "pml_cm_priority" (current
> value: "30")
> CM PML selection priority
> MCA pml: parameter
> "pml_ob1_free_list_num" (current value: "4")
> MCA pml: parameter
> "pml_ob1_free_list_max" (current value: "-1")
> MCA pml: parameter
> "pml_ob1_free_list_inc" (current value: "64")
> MCA pml: parameter "pml_ob1_priority" (current
> value: "20")
> MCA pml: parameter "pml_ob1_eager_limit" (current
> value: "131072")
> MCA pml: parameter
> "pml_ob1_send_pipeline_depth" (current value: "3")
> MCA pml: parameter
> "pml_ob1_recv_pipeline_depth" (current value: "4")
> MCA bml: parameter "bml" (current value: <none>)
> Default selection set of components for
> the bml framework (<none> means "use all components that can be
> found")
> MCA bml: parameter "bml_base_verbose" (current
> value: "0")
> Verbosity level for the bml framework (0
> = no verbosity)
> MCA bml: parameter
> "bml_r2_show_unreach_errors" (current value: "1")
> Show error message when procs are
> unreachable
> MCA bml: parameter "bml_r2_priority" (current
> value: "0")
> MCA rcache: parameter "rcache" (current value: <none>)
> Default selection set of components for
> the rcache framework (<none> means "use all components that can be
> found")
> MCA rcache: parameter "rcache_base_verbose" (current
> value: "0")
> Verbosity level for the rcache framework
> (0 = no verbosity)
> MCA rcache: parameter "rcache_vma_priority" (current
> value: "0")
> MCA btl: parameter "btl_base_debug" (current
> value: "0")
> If btl_base_debug is 1 standard debug is
> output, if > 1 verbose debug is output
> MCA btl: parameter "btl" (current value: <none>)
> Default selection set of components for
> the btl framework (<none> means "use all components that can be
> found")
> MCA btl: parameter "btl_base_verbose" (current
> value: "0")
> Verbosity level for the btl framework (0
> = no verbosity)
> MCA btl: parameter
> "btl_self_free_list_num" (current value: "0")
> Number of fragments by default
> MCA btl: parameter
> "btl_self_free_list_max" (current value: "-1")
> Maximum number of fragments
> MCA btl: parameter
> "btl_self_free_list_inc" (current value: "32")
> Increment by this number of fragments
> MCA btl: parameter "btl_self_eager_limit" (current
> value: "131072")
> Eager size fragment (before the rendez-
> vous protocol)
> MCA btl: parameter
> "btl_self_min_send_size" (current value: "262144")
> Minimum fragment size after the rendez-vous
> MCA btl: parameter
> "btl_self_max_send_size" (current value: "262144")
> Maximum fragment size after the rendez-vous
> MCA btl: parameter
> "btl_self_min_rdma_size" (current value: "2147483647")
> Maximum fragment size for the RDMA transfer
> MCA btl: parameter
> "btl_self_max_rdma_size" (current value: "2147483647")
> Maximum fragment size for the RDMA transfer
> MCA btl: parameter "btl_self_exclusivity" (current
> value: "65536")
> Device exclusivity
> MCA btl: parameter "btl_self_flags" (current
> value: "10")
> Active behavior flags
> MCA btl: parameter "btl_self_priority" (current
> value: "0")
> MCA btl: parameter "btl_sm_free_list_num" (current
> value: "8")
> MCA btl: parameter "btl_sm_free_list_max" (current
> value: "-1")
> MCA btl: parameter "btl_sm_free_list_inc" (current
> value: "64")
> MCA btl: parameter "btl_sm_exclusivity" (current
> value: "65535")
> MCA btl: parameter "btl_sm_latency" (current
> value: "100")
> MCA btl: parameter "btl_sm_max_procs" (current
> value: "-1")
> MCA btl: parameter
> "btl_sm_sm_extra_procs" (current value: "2")
> MCA btl: parameter "btl_sm_mpool" (current value:
> "sm")
> MCA btl: parameter "btl_sm_eager_limit" (current
> value: "4096")
> MCA btl: parameter "btl_sm_max_frag_size" (current
> value: "32768")
> MCA btl: parameter
> "btl_sm_size_of_cb_queue" (current value: "128")
> MCA btl: parameter
> "btl_sm_cb_lazy_free_freq" (current value: "120")
> MCA btl: parameter "btl_sm_priority" (current
> value: "0")
> MCA btl: parameter "btl_tcp_if_include" (current
> value: <none>)
> MCA btl: parameter "btl_tcp_if_exclude" (current
> value: "lo")
> MCA btl: parameter
> "btl_tcp_free_list_num" (current value: "8")
> MCA btl: parameter
> "btl_tcp_free_list_max" (current value: "-1")
> MCA btl: parameter
> "btl_tcp_free_list_inc" (current value: "32")
> MCA btl: parameter "btl_tcp_sndbuf" (current
> value: "131072")
> MCA btl: parameter "btl_tcp_rcvbuf" (current
> value: "131072")
> MCA btl: parameter
> "btl_tcp_endpoint_cache" (current value: "30720")
> MCA btl: parameter "btl_tcp_exclusivity" (current
> value: "0")
> MCA btl: parameter "btl_tcp_eager_limit" (current
> value: "65536")
> MCA btl: parameter
> "btl_tcp_min_send_size" (current value: "65536")
> MCA btl: parameter
> "btl_tcp_max_send_size" (current value: "131072")
> MCA btl: parameter
> "btl_tcp_min_rdma_size" (current value: "131072")
> MCA btl: parameter
> "btl_tcp_max_rdma_size" (current value: "2147483647")
> MCA btl: parameter "btl_tcp_flags" (current value:
> "122")
> MCA btl: parameter "btl_tcp_priority" (current
> value: "0")
> MCA btl: parameter "btl_base_include" (current
> value: <none>)
> MCA btl: parameter "btl_base_exclude" (current
> value: <none>)
> MCA btl: parameter
> "btl_base_warn_component_unused" (current value: "1")
> This parameter is used to turn on warning
> messages when certain NICs are not used
> MCA mtl: parameter "mtl" (current value: <none>)
> Default selection set of components for
> the mtl framework (<none> means "use all components that can be
> found")
> MCA mtl: parameter "mtl_base_verbose" (current
> value: "0")
> Verbosity level for the mtl framework (0
> = no verbosity)
> MCA topo: parameter "topo" (current value: <none>)
> Default selection set of components for
> the topo framework (<none> means "use all components that can be
> found")
> MCA topo: parameter "topo_base_verbose" (current
> value: "0")
> Verbosity level for the topo framework (0
> = no verbosity)
> MCA osc: parameter "osc" (current value: <none>)
> Default selection set of components for
> the osc framework (<none> means "use all components that can be
> found")
> MCA osc: parameter "osc_base_verbose" (current
> value: "0")
> Verbosity level for the osc framework (0
> = no verbosity)
> MCA osc: parameter "osc_pt2pt_no_locks" (current
> value: "0")
> Enable optimizations available only if
> MPI_LOCK is not used.
> MCA osc: parameter
> "osc_pt2pt_eager_limit" (current value: "16384")
> Max size of eagerly sent data
> MCA osc: parameter "osc_pt2pt_priority" (current
> value: "0")
> MCA errmgr: parameter "errmgr" (current value: <none>)
> Default selection set of components for
> the errmgr framework (<none> means "use all components that can be
> found")
> MCA errmgr: parameter "errmgr_hnp_debug" (current
> value: "0")
> MCA errmgr: parameter "errmgr_hnp_priority" (current
> value: "0")
> MCA errmgr: parameter "errmgr_orted_debug" (current
> value: "0")
> MCA errmgr: parameter
> "errmgr_orted_priority" (current value: "0")
> MCA errmgr: parameter "errmgr_proxy_debug" (current
> value: "0")
> MCA errmgr: parameter
> "errmgr_proxy_priority" (current value: "0")
> MCA gpr: parameter "gpr_base_maxsize" (current
> value: "2147483647")
> MCA gpr: parameter "gpr_base_blocksize" (current
> value: "512")
> MCA gpr: parameter "gpr" (current value: <none>)
> Default selection set of components for
> the gpr framework (<none> means "use all components that can be
> found")
> MCA gpr: parameter "gpr_null_priority" (current
> value: "0")
> MCA gpr: parameter "gpr_proxy_debug" (current
> value: "0")
> MCA gpr: parameter "gpr_proxy_priority" (current
> value: "0")
> MCA gpr: parameter "gpr_replica_debug" (current
> value: "0")
> MCA gpr: parameter "gpr_replica_isolate" (current
> value: "0")
> MCA gpr: parameter "gpr_replica_priority" (current
> value: "0")
> MCA iof: parameter "iof_base_window_size" (current
> value: "4096")
> MCA iof: parameter "iof_base_service" (current
> value: "0.0.0")
> MCA iof: parameter "iof" (current value: <none>)
> Default selection set of components for
> the iof framework (<none> means "use all components that can be
> found")
> MCA iof: parameter "iof_proxy_debug" (current
> value: "1")
> MCA iof: parameter "iof_proxy_priority" (current
> value: "0")
> MCA iof: parameter "iof_svc_debug" (current value:
> "1")
> MCA iof: parameter "iof_svc_priority" (current
> value: "0")
> MCA ns: parameter "ns" (current value: <none>)
> Default selection set of components for
> the ns framework (<none> means "use all components that can be found")
> MCA ns: parameter "ns_proxy_debug" (current
> value: "0")
> MCA ns: parameter "ns_proxy_maxsize" (current
> value: "2147483647")
> MCA ns: parameter "ns_proxy_blocksize" (current
> value: "512")
> MCA ns: parameter "ns_proxy_priority" (current
> value: "0")
> MCA ns: parameter "ns_replica_debug" (current
> value: "0")
> MCA ns: parameter "ns_replica_isolate" (current
> value: "0")
> MCA ns: parameter "ns_replica_maxsize" (current
> value: "2147483647")
> MCA ns: parameter "ns_replica_blocksize" (current
> value: "512")
> MCA ns: parameter "ns_replica_priority" (current
> value: "0")
> MCA oob: parameter "oob" (current value: <none>)
> Default selection set of components for
> the oob framework (<none> means "use all components that can be
> found")
> MCA oob: parameter "oob_base_verbose" (current
> value: "0")
> Verbosity level for the oob framework (0
> = no verbosity)
> MCA oob: parameter "oob_tcp_peer_limit" (current
> value: "-1")
> MCA oob: parameter "oob_tcp_peer_retries" (current
> value: "60")
> MCA oob: parameter "oob_tcp_debug" (current value:
> "0")
> MCA oob: parameter "oob_tcp_sndbuf" (current
> value: "131072")
> MCA oob: parameter "oob_tcp_rcvbuf" (current
> value: "131072")
> MCA oob: parameter "oob_tcp_if_include" (current
> value: <none>)
> Comma-delimited list of TCP interfaces to
> use
> MCA oob: parameter "oob_tcp_if_exclude" (current
> value: <none>)
> Comma-delimited list of TCP interfaces to
> exclude
> MCA oob: parameter
> "oob_tcp_connect_sleep" (current value: "1")
> Enable (1) / disable (0) random sleep for
> connection wireup
> MCA oob: parameter "oob_tcp_listen_mode" (current
> value: "event")
> Mode for HNP to accept incoming
> connections: event, listen_thread
> MCA oob: parameter
> "oob_tcp_listen_thread_max_queue" (current value: "10")
> High water mark for queued accepted
> socket list size
> MCA oob: parameter
> "oob_tcp_listen_thread_max_time" (current value: "10")
> Maximum amount of time (in milliseconds)
> to wait between processing accepted socket list
> MCA oob: parameter
> "oob_tcp_accept_spin_count" (current value: "10")
> Number of times to let accept return
> EWOULDBLOCK before updating accepted socket list
> MCA oob: parameter "oob_tcp_priority" (current
> value: "0")
> MCA ras: parameter "ras" (current value: <none>)
> MCA ras: parameter
> "ras_dash_host_priority" (current value: "5")
> Selection priority for the dash_host RAS
> component
> MCA ras: parameter "ras_gridengine_debug" (current
> value: "0")
> Enable debugging output for the
> gridengine ras component
> MCA ras: parameter
> "ras_gridengine_priority" (current value: "100")
> Priority of the gridengine ras component
> MCA ras: parameter
> "ras_gridengine_verbose" (current value: "0")
> Enable verbose output for the gridengine
> ras component
> MCA ras: parameter
> "ras_gridengine_show_jobid" (current value: "0")
> Show the JOB_ID of the Grid Engine job
> MCA ras: parameter
> "ras_localhost_priority" (current value: "0")
> Selection priority for the localhost RAS
> component
> MCA ras: parameter "ras_slurm_priority" (current
> value: "75")
> Priority of the slurm ras component
> MCA rds: parameter "rds" (current value: <none>)
> MCA rds: parameter "rds_hostfile_debug" (current
> value: "0")
> Toggle debug output for hostfile RDS
> component
> MCA rds: parameter "rds_hostfile_path" (current
> value: "/path/to/openmpi/etc/openmpi-default-hostfile")
> ORTE Host filename
> MCA rds: parameter
> "rds_hostfile_priority" (current value: "0")
> MCA rds: parameter "rds_proxy_priority" (current
> value: "0")
> MCA rds: parameter "rds_resfile_debug" (current
> value: "0")
> Toggle debug output for resfile RDS
> component
> MCA rds: parameter "rds_resfile_name" (current
> value: <none>)
> ORTE Resource filename
> MCA rds: parameter "rds_resfile_priority" (current
> value: "0")
> MCA rmaps: parameter "rmaps_base_verbose" (current
> value: "0")
> Verbosity level for the rmaps framework
> MCA rmaps: parameter
> "rmaps_base_schedule_policy" (current value: "unspec")
> Scheduling Policy for RMAPS. [slot | node]
> MCA rmaps: parameter "rmaps_base_pernode" (current
> value: "0")
> Launch one ppn as directed
> MCA rmaps: parameter "rmaps_base_n_pernode" (current
> value: "-1")
> Launch n procs/node
> MCA rmaps: parameter
> "rmaps_base_no_schedule_local" (current value: "0")
> If false, allow scheduling MPI
> applications on the same node as mpirun (default). If true, do not
> schedule any MPI applications on the same node as mpirun
> MCA rmaps: parameter
> "rmaps_base_no_oversubscribe" (current value: "0")
> If true, then do not allow
> oversubscription of nodes - mpirun will return an error if there
> aren't enough nodes to launch all processes without oversubscribing
> MCA rmaps: parameter "rmaps" (current value: <none>)
> Default selection set of components for
> the rmaps framework (<none> means "use all components that can be
> found")
> MCA rmaps: parameter
> "rmaps_round_robin_debug" (current value: "1")
> Toggle debug output for Round Robin RMAPS
> component
> MCA rmaps: parameter
> "rmaps_round_robin_priority" (current value: "1")
> Selection priority for Round Robin RMAPS
> component
> MCA rmgr: parameter "rmgr" (current value: <none>)
> Default selection set of components for
> the rmgr framework (<none> means "use all components that can be
> found")
> MCA rmgr: parameter "rmgr_proxy_priority" (current
> value: "0")
> MCA rmgr: parameter "rmgr_urm_priority" (current
> value: "0")
> MCA rml: parameter "rml" (current value: <none>)
> Default selection set of components for
> the rml framework (<none> means "use all components that can be
> found")
> MCA rml: parameter "rml_base_verbose" (current
> value: "0")
> Verbosity level for the rml framework (0
> = no verbosity)
> MCA rml: parameter "rml_oob_priority" (current
> value: "0")
> MCA pls: parameter
> "pls_base_reuse_daemons" (current value: "0")
> If nonzero, reuse daemons to launch
> dynamically spawned processes. If zero, do not reuse daemons
> (default)
> MCA pls: parameter "pls" (current value: <none>)
> Default selection set of components for
> the pls framework (<none> means "use all components that can be
> found")
> MCA pls: parameter "pls_base_verbose" (current
> value: "0")
> Verbosity level for the pls framework (0
> = no verbosity)
> MCA pls: parameter "pls_gridengine_debug" (current
> value: "0")
> Enable debugging of gridengine pls component
> MCA pls: parameter
> "pls_gridengine_verbose" (current value: "0")
> Enable verbose output of the gridengine
> qrsh -inherit command
> MCA pls: parameter
> "pls_gridengine_priority" (current value: "100")
> Priority of the gridengine pls component
> MCA pls: parameter "pls_gridengine_orted" (current
> value: "orted")
> The command name that the gridengine pls
> component will invoke for the ORTE daemon
> MCA pls: parameter "pls_proxy_priority" (current
> value: "0")
> MCA pls: parameter "pls_rsh_debug" (current value:
> "0")
> Whether or not to enable debugging output
> for the rsh pls component (0 or 1)
> MCA pls: parameter
> "pls_rsh_num_concurrent" (current value: "128")
> How many pls_rsh_agent instances to
> invoke concurrently (must be > 0)
> MCA pls: parameter "pls_rsh_force_rsh" (current
> value: "0")
> Force the launcher to always use rsh,
> even for local daemons
> MCA pls: parameter "pls_rsh_orted" (current value:
> "orted")
> The command name that the rsh pls
> component will invoke for the ORTE daemon
> MCA pls: parameter "pls_rsh_priority" (current
> value: "10")
> Priority of the rsh pls component
> MCA pls: parameter "pls_rsh_delay" (current value:
> "1")
> Delay (in seconds) between invocations of
> the remote agent, but only used when the "debug" MCA parameter is
> true, or the top-level MCA debugging is enabled (otherwise this
> value is ignored)
> MCA pls: parameter "pls_rsh_reap" (current value:
> "1")
> If set to 1, wait for all the processes
> to complete before exiting. Otherwise, quit immediately -- without
> waiting for confirmation that all other processes in the job have
> completed.
> MCA pls: parameter
> "pls_rsh_assume_same_shell" (current value: "1")
> If set to 1, assume that the shell on the
> remote node is the same as the shell on the local node. Otherwise,
> probe for what the remote shell.
> MCA pls: parameter "pls_rsh_agent" (current value:
> "ssh : rsh")
> The command used to launch executables on
> remote nodes (typically either "ssh" or "rsh")
> MCA pls: parameter "pls_slurm_debug" (current
> value: "0")
> Enable debugging of slurm pls
> MCA pls: parameter "pls_slurm_priority" (current
> value: "75")
> Default selection priority
> MCA pls: parameter "pls_slurm_orted" (current
> value: "orted")
> Command to use to start proxy orted
> MCA pls: parameter "pls_slurm_args" (current
> value: <none>)
> Custom arguments to srun
> MCA sds: parameter "sds" (current value: <none>)
> Default selection set of components for
> the sds framework (<none> means "use all components that can be
> found")
> MCA sds: parameter "sds_base_verbose" (current
> value: "0")
> Verbosity level for the sds framework (0
> = no verbosity)
> MCA sds: parameter "sds_env_priority" (current
> value: "0")
> MCA sds: parameter "sds_pipe_priority" (current
> value: "0")
> MCA sds: parameter "sds_seed_priority" (current
> value: "0")
> MCA sds: parameter
> "sds_singleton_priority" (current value: "0")
> MCA sds: parameter "sds_slurm_priority" (current
> value: "0")

-- 
Jeff Squyres
Cisco Systems