Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] SM component init unload
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-07-03 20:05:16


Okay, please try this again with r26739 or above. You can remove the rest
of the "verbose" settings and the --display-map option to declutter the
output. Please add "-mca orte_nidmap_verbose 20" to your command line.
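
For illustration only, the trimmed command might look like the sketch below. The install path, the binding options, coll_sm_priority, and the ./bmem test program are copied from the command quoted further down; dropping mca_coll_base_output along with the "verbose" parameters is an assumption, not something stated in this thread.

```shell
# Install path copied from the quoted command below; adjust for your own build.
MPIEXEC=/home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec

# Keep the binding/mapping options and coll_sm_priority, drop --display-map and
# the *_verbose parameters, and add the requested orte_nidmap_verbose setting.
# Assumption: mca_coll_base_output is treated as one of the "verbose" settings.
CMD="$MPIEXEC --bind-to-core --bynode --mca coll_sm_priority 99 -mca orte_nidmap_verbose 20 -n 2 ./bmem"

# Print the command for review; run it with: $CMD
echo "$CMD"
```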

Thanks!
Ralph

On Tue, Jul 3, 2012 at 1:50 PM, Juan A. Rico <jarico_at_[hidden]> wrote:

> Here is the output.
>
> [jarico_at_Metropolis-01 examples]$
> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core
> --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca
> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca
> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2
> -mca grpcomm_base_verbose 5 ./bmem
> [Metropolis-01:24563] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24563] mca: base: components_open: opening hwloc components
> [Metropolis-01:24563] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24563] hwloc:base:get_topology
> [Metropolis-01:24563] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
> --------------------------------------------------------------------------
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
> Node: Metropolis-01
>
> This is a warning only; your job will continue, though performance may
> be degraded.
> --------------------------------------------------------------------------
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>
> ======================== JOB MAP ========================
>
> Data for node: Metropolis-01 Num procs: 2
> Process OMPI jobid: [36265,1] App: 0 Process rank: 0
> Process OMPI jobid: [36265,1] App: 0 Process rank: 1
>
> =============================================================
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,0] tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon
> nidmap
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24564] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24564] mca: base: components_open: opening hwloc components
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24565] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24565] mca: base: components_open: opening hwloc components
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm
> [Metropolis-01:24564] computing locality - getting object at level CORE,
> index 0
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering
> cpus
> [Metropolis-01:24564] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24564] computing locality - getting object at level CORE,
> index 1
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24564] computing locality - shifting up from L1CACHE
> [Metropolis-01:24564] computing locality - shifting up from L2CACHE
> [Metropolis-01:24564] computing locality - shifting up from L3CACHE
> [Metropolis-01:24564] computing locality - filling level SOCKET
> [Metropolis-01:24564] computing locality - filling level NUMA
> [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm
> [Metropolis-01:24564] mca: base: components_open: Looking for coll
> components
> [Metropolis-01:24564] mca: base: components_open: opening coll components
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> tuned
> [Metropolis-01:24564] mca: base: components_open: component tuned has no
> register function
> [Metropolis-01:24564] coll:tuned:component_open: done!
> [Metropolis-01:24564] mca: base: components_open: component tuned open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component sm
> [Metropolis-01:24564] mca: base: components_open: component sm register
> function successful
> [Metropolis-01:24564] mca: base: components_open: component sm has no open
> function
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> libnbc
> [Metropolis-01:24564] mca: base: components_open: component libnbc
> register function successful
> [Metropolis-01:24564] mca: base: components_open: component libnbc open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> hierarch
> [Metropolis-01:24564] mca: base: components_open: component hierarch has
> no register function
> [Metropolis-01:24564] mca: base: components_open: component hierarch open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> basic
> [Metropolis-01:24564] mca: base: components_open: component basic register
> function successful
> [Metropolis-01:24564] mca: base: components_open: component basic has no
> open function
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> inter
> [Metropolis-01:24564] mca: base: components_open: component inter has no
> register function
> [Metropolis-01:24564] mca: base: components_open: component inter open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> self
> [Metropolis-01:24564] mca: base: components_open: component self has no
> register function
> [Metropolis-01:24564] mca: base: components_open: component self open
> function successful
> [Metropolis-01:24565] computing locality - getting object at level CORE,
> index 1
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering
> cpus
> [Metropolis-01:24565] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24565] computing locality - getting object at level CORE,
> index 0
> [Metropolis-01:24565] computing locality - shifting up from L1CACHE
> [Metropolis-01:24565] computing locality - shifting up from L2CACHE
> [Metropolis-01:24565] computing locality - shifting up from L3CACHE
> [Metropolis-01:24565] computing locality - filling level SOCKET
> [Metropolis-01:24565] computing locality - filling level NUMA
> [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4
> entries
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing
> allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted
> [Metropolis-01:24565] mca: base: components_open: Looking for coll
> components
> [Metropolis-01:24565] mca: base: components_open: opening coll components
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> tuned
> [Metropolis-01:24565] mca: base: components_open: component tuned has no
> register function
> [Metropolis-01:24565] coll:tuned:component_open: done!
> [Metropolis-01:24565] mca: base: components_open: component tuned open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component sm
> [Metropolis-01:24565] mca: base: components_open: component sm register
> function successful
> [Metropolis-01:24565] mca: base: components_open: component sm has no open
> function
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> libnbc
> [Metropolis-01:24565] mca: base: components_open: component libnbc
> register function successful
> [Metropolis-01:24565] mca: base: components_open: component libnbc open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> hierarch
> [Metropolis-01:24565] mca: base: components_open: component hierarch has
> no register function
> [Metropolis-01:24565] mca: base: components_open: component hierarch open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> basic
> [Metropolis-01:24565] mca: base: components_open: component basic register
> function successful
> [Metropolis-01:24565] mca: base: components_open: component basic has no
> open function
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> inter
> [Metropolis-01:24565] mca: base: components_open: component inter has no
> register function
> [Metropolis-01:24565] mca: base: components_open: component inter open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> self
> [Metropolis-01:24565] mca: base: components_open: component self has no
> register function
> [Metropolis-01:24565] mca: base: components_open: component self open
> function successful
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4
> entries
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing
> allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 0
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0
> [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 0
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0
> [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],1]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],1]
> [Metropolis-01:24564] coll:find_available: querying coll component tuned
> [Metropolis-01:24564] coll:find_available: coll component tuned is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component tuned
> [Metropolis-01:24565] coll:find_available: coll component tuned is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:sm:init_query: no other local procs;
> disqualifying myself
> [Metropolis-01:24564] coll:find_available: coll component sm is not
> available
> [Metropolis-01:24564] coll:find_available: querying coll component libnbc
> [Metropolis-01:24564] coll:find_available: coll component libnbc is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component hierarch
> [Metropolis-01:24564] coll:find_available: coll component hierarch is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component basic
> [Metropolis-01:24564] coll:find_available: coll component basic is
> available
> [Metropolis-01:24565] coll:sm:init_query: no other local procs;
> disqualifying myself
> [Metropolis-01:24565] coll:find_available: coll component sm is not
> available
> [Metropolis-01:24565] coll:find_available: querying coll component libnbc
> [Metropolis-01:24565] coll:find_available: coll component libnbc is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component hierarch
> [Metropolis-01:24565] coll:find_available: coll component hierarch is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component basic
> [Metropolis-01:24565] coll:find_available: coll component basic is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component inter
> [Metropolis-01:24564] coll:find_available: coll component inter is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component self
> [Metropolis-01:24564] coll:find_available: coll component self is available
> [Metropolis-01:24565] coll:find_available: querying coll component inter
> [Metropolis-01:24565] coll:find_available: coll component inter is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component self
> [Metropolis-01:24565] coll:find_available: coll component self is available
> [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 1
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 1
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1
> [Metropolis-01:24565] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component available: tuned,
> priority: 30
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component not available: self
> [Metropolis-01:24565] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24565] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component available: tuned,
> priority: 30
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component not available: self
> [Metropolis-01:24564] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component available: self,
> priority: 75
> [Metropolis-01:24564] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24564] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component available: self,
> priority: 75
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 2
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 2
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
> [Metropolis-01:24565] coll:tuned:component_close: called
> [Metropolis-01:24565] coll:tuned:component_close: done!
> [Metropolis-01:24565] mca: base: close: component tuned closed
> [Metropolis-01:24565] mca: base: close: unloading component tuned
> [Metropolis-01:24565] mca: base: close: component libnbc closed
> [Metropolis-01:24565] mca: base: close: unloading component libnbc
> [Metropolis-01:24565] mca: base: close: unloading component hierarch
> [Metropolis-01:24565] mca: base: close: unloading component basic
> [Metropolis-01:24565] mca: base: close: unloading component inter
> [Metropolis-01:24565] mca: base: close: unloading component self
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
> [Metropolis-01:24564] coll:tuned:component_close: called
> [Metropolis-01:24564] coll:tuned:component_close: done!
> [Metropolis-01:24564] mca: base: close: component tuned closed
> [Metropolis-01:24564] mca: base: close: unloading component tuned
> [Metropolis-01:24564] mca: base: close: component libnbc closed
> [Metropolis-01:24564] mca: base: close: unloading component libnbc
> [Metropolis-01:24564] mca: base: close: unloading component hierarch
> [Metropolis-01:24564] mca: base: close: unloading component basic
> [Metropolis-01:24564] mca: base: close: unloading component inter
> [Metropolis-01:24564] mca: base: close: unloading component self
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,0] tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [jarico_at_Metropolis-01 examples]$
>
>
>
> On 03/07/2012, at 21:44, Ralph Castain wrote:
>
> > Interesting - yes, coll sm doesn't think they are on the same node for
> > some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why.
> >
> >
> > On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
> >
> >> The code I run is a simple broadcast.
> >>
> >> When I do not specify components to run, the output is (more verbose):
> >>
> >> [jarico_at_Metropolis-01 examples]$
> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca
> mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority
> 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca
> mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem
> >> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc
> components
> >> [Metropolis-01:24490] mca: base: components_open: opening hwloc
> components
> >> [Metropolis-01:24490] mca: base: components_open: found loaded
> component hwloc142
> >> [Metropolis-01:24490] mca: base: components_open: component hwloc142
> has no register function
> >> [Metropolis-01:24490] mca: base: components_open: component hwloc142
> has no open function
> >> [Metropolis-01:24490] hwloc:base:get_topology
> >> [Metropolis-01:24490] hwloc:base: no cpus specified - using root
> available cpuset
> >>
> >> ======================== JOB MAP ========================
> >>
> >> Data for node: Metropolis-01 Num procs: 2
> >> Process OMPI jobid: [36336,1] App: 0 Process rank: 0
> >> Process OMPI jobid: [36336,1] App: 0 Process rank: 1
> >>
> >> =============================================================
> >> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc
> components
> >> [Metropolis-01:24491] mca: base: components_open: opening hwloc
> components
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component hwloc142
> >> [Metropolis-01:24491] mca: base: components_open: component hwloc142
> has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component hwloc142
> has no open function
> >> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc
> components
> >> [Metropolis-01:24492] mca: base: components_open: opening hwloc
> components
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component hwloc142
> >> [Metropolis-01:24492] mca: base: components_open: component hwloc142
> has no register function
> >> [Metropolis-01:24492] mca: base: components_open: component hwloc142
> has no open function
> >> [Metropolis-01:24491] locality: CL:CU:N:B
> >> [Metropolis-01:24491] hwloc:base: get available cpus
> >> [Metropolis-01:24491] hwloc:base:get_available_cpus first time -
> filtering cpus
> >> [Metropolis-01:24491] hwloc:base: no cpus specified - using root
> available cpuset
> >> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
> >> [Metropolis-01:24491] mca: base: components_open: Looking for coll
> components
> >> [Metropolis-01:24491] mca: base: components_open: opening coll
> components
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component tuned
> >> [Metropolis-01:24491] mca: base: components_open: component tuned has
> no register function
> >> [Metropolis-01:24491] coll:tuned:component_open: done!
> >> [Metropolis-01:24491] mca: base: components_open: component tuned open
> function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component sm
> >> [Metropolis-01:24491] mca: base: components_open: component sm register
> function successful
> >> [Metropolis-01:24491] mca: base: components_open: component sm has no
> open function
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component libnbc
> >> [Metropolis-01:24491] mca: base: components_open: component libnbc
> register function successful
> >> [Metropolis-01:24491] mca: base: components_open: component libnbc open
> function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component hierarch
> >> [Metropolis-01:24491] mca: base: components_open: component hierarch
> has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component hierarch
> open function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component basic
> >> [Metropolis-01:24491] mca: base: components_open: component basic
> register function successful
> >> [Metropolis-01:24491] mca: base: components_open: component basic has
> no open function
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component inter
> >> [Metropolis-01:24491] mca: base: components_open: component inter has
> no register function
> >> [Metropolis-01:24491] mca: base: components_open: component inter open
> function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded
> component self
> >> [Metropolis-01:24491] mca: base: components_open: component self has no
> register function
> >> [Metropolis-01:24491] mca: base: components_open: component self open
> function successful
> >> [Metropolis-01:24492] locality: CL:CU:N:B
> >> [Metropolis-01:24492] hwloc:base: get available cpus
> >> [Metropolis-01:24492] hwloc:base:get_available_cpus first time -
> filtering cpus
> >> [Metropolis-01:24492] hwloc:base: no cpus specified - using root
> available cpuset
> >> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
> >> [Metropolis-01:24492] mca: base: components_open: Looking for coll
> components
> >> [Metropolis-01:24492] mca: base: components_open: opening coll
> components
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component tuned
> >> [Metropolis-01:24492] mca: base: components_open: component tuned has
> no register function
> >> [Metropolis-01:24492] coll:tuned:component_open: done!
> >> [Metropolis-01:24492] mca: base: components_open: component tuned open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component sm
> >> [Metropolis-01:24492] mca: base: components_open: component sm register
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: component sm has no
> open function
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component libnbc
> >> [Metropolis-01:24492] mca: base: components_open: component libnbc
> register function successful
> >> [Metropolis-01:24492] mca: base: components_open: component libnbc open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component hierarch
> >> [Metropolis-01:24492] mca: base: components_open: component hierarch
> has no register function
> >> [Metropolis-01:24492] mca: base: components_open: component hierarch
> open function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component basic
> >> [Metropolis-01:24492] mca: base: components_open: component basic
> register function successful
> >> [Metropolis-01:24492] mca: base: components_open: component basic has
> no open function
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component inter
> >> [Metropolis-01:24492] mca: base: components_open: component inter has
> no register function
> >> [Metropolis-01:24492] mca: base: components_open: component inter open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component self
> >> [Metropolis-01:24492] mca: base: components_open: component self has no
> register function
> >> [Metropolis-01:24492] mca: base: components_open: component self open
> function successful
> >> [Metropolis-01:24491] coll:find_available: querying coll component tuned
> >> [Metropolis-01:24491] coll:find_available: coll component tuned is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component sm
> >> [Metropolis-01:24491] coll:sm:init_query: no other local procs;
> disqualifying myself
> >> [Metropolis-01:24491] coll:find_available: coll component sm is not
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component
> libnbc
> >> [Metropolis-01:24491] coll:find_available: coll component libnbc is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component
> hierarch
> >> [Metropolis-01:24491] coll:find_available: coll component hierarch is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component basic
> >> [Metropolis-01:24491] coll:find_available: coll component basic is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component inter
> >> [Metropolis-01:24492] coll:find_available: querying coll component tuned
> >> [Metropolis-01:24492] coll:find_available: coll component tuned is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component sm
> >> [Metropolis-01:24492] coll:sm:init_query: no other local procs;
> disqualifying myself
> >> [Metropolis-01:24492] coll:find_available: coll component sm is not
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component
> libnbc
> >> [Metropolis-01:24492] coll:find_available: coll component libnbc is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component
> hierarch
> >> [Metropolis-01:24492] coll:find_available: coll component hierarch is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component basic
> >> [Metropolis-01:24492] coll:find_available: coll component basic is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component inter
> >> [Metropolis-01:24492] coll:find_available: coll component inter is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component self
> >> [Metropolis-01:24492] coll:find_available: coll component self is
> available
> >> [Metropolis-01:24491] coll:find_available: coll component inter is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component self
> >> [Metropolis-01:24491] coll:find_available: coll component self is
> available
> >> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >> [Metropolis-01:24491] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> >> [Metropolis-01:24491] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24491] coll:tuned:module_tuned query called
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> tuned, priority: 30
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> self
> >> [Metropolis-01:24491] coll:tuned:module_init called.
> >> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
> >> [Metropolis-01:24491] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> >> [Metropolis-01:24491] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24491] coll:tuned:module_tuned query called
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> tuned
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24491] coll:base:comm_select: component available: self,
> priority: 75
> >> [Metropolis-01:24492] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> >> [Metropolis-01:24492] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24492] coll:tuned:module_tuned query called
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> tuned, priority: 30
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> self
> >> [Metropolis-01:24492] coll:tuned:module_init called.
> >> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
> >> [Metropolis-01:24492] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> >> [Metropolis-01:24492] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24492] coll:tuned:module_tuned query called
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> tuned
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24492] coll:base:comm_select: component available: self,
> priority: 75
> >> [Metropolis-01:24491] coll:tuned:component_close: called
> >> [Metropolis-01:24491] coll:tuned:component_close: done!
> >> [Metropolis-01:24492] coll:tuned:component_close: called
> >> [Metropolis-01:24492] coll:tuned:component_close: done!
> >> [Metropolis-01:24492] mca: base: close: component tuned closed
> >> [Metropolis-01:24492] mca: base: close: unloading component tuned
> >> [Metropolis-01:24492] mca: base: close: component libnbc closed
> >> [Metropolis-01:24492] mca: base: close: unloading component libnbc
> >> [Metropolis-01:24492] mca: base: close: unloading component hierarch
> >> [Metropolis-01:24492] mca: base: close: unloading component basic
> >> [Metropolis-01:24492] mca: base: close: unloading component inter
> >> [Metropolis-01:24492] mca: base: close: unloading component self
> >> [Metropolis-01:24491] mca: base: close: component tuned closed
> >> [Metropolis-01:24491] mca: base: close: unloading component tuned
> >> [Metropolis-01:24491] mca: base: close: component libnbc closed
> >> [Metropolis-01:24491] mca: base: close: unloading component libnbc
> >> [Metropolis-01:24491] mca: base: close: unloading component hierarch
> >> [Metropolis-01:24491] mca: base: close: unloading component basic
> >> [Metropolis-01:24491] mca: base: close: unloading component inter
> >> [Metropolis-01:24491] mca: base: close: unloading component self
> >> [jarico_at_Metropolis-01 examples]$
> >>
> >>
> >> SM is not loaded because it detects no other processes on the same
> machine:
> >>
> >> [Metropolis-01:24491] coll:sm:init_query: no other local procs;
> disqualifying myself
> >>
> >> The machine is a multicore machine with 8 cores.
> >>
> >> I need to run the SM component code, and I suppose that, with its priority
> raised, it will be the selected component once this problem is solved.
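
A quick way to confirm which coll components ruled themselves out is to scan the verbose output for the "disqualifying myself" message. The following is only a minimal sketch in Python: it assumes the message format seen in the log above, assumes the mail quote markers have already been stripped, and uses a hypothetical helper name (find_disqualified) that is not part of Open MPI.

```python
import re

# Matches entries such as:
#   [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
# Assumption: the "> >>" quote markers were removed before parsing.
DISQUALIFIED = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+coll:(?P<comp>\w+):init_query:\s+"
    r"(?P<reason>.*?);\s+disqualifying myself"
)


def find_disqualified(log_text):
    """Return (pid, component, reason) for each self-disqualification found."""
    # The archive wraps long lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (int(m.group("pid")), m.group("comp"), m.group("reason"))
        for m in DISQUALIFIED.finditer(flat)
    ]


# Sample copied from the output above (wrapped the same way).
sample = """
[Metropolis-01:24491] coll:find_available: querying coll component sm
[Metropolis-01:24491] coll:sm:init_query: no other local procs;
disqualifying myself
[Metropolis-01:24492] coll:sm:init_query: no other local procs;
disqualifying myself
"""

for pid, comp, reason in find_disqualified(sample):
    print(f"pid {pid}: coll component '{comp}' disqualified ({reason})")
```

Here both ranks report the same reason, which points at how local peers are detected rather than at the sm priority setting.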
> >>
> >>
> >>
> >> On 3 July 2012, at 21:01, Jeff Squyres wrote:
> >>
> >>> The issue is that the "sm" coll component only implements a few of the
> MPI collective operations. It is usually mixed at run-time with other coll
> components to fill out the rest of the MPI collective operations.
> >>>
> >>> So what is happening is that OMPI is determining that it doesn't have
> implementations of all the MPI collective operations and aborting.
> >>>
> >>> You shouldn't need to manually select your coll module -- OMPI should
> automatically select the right collective module for you. E.g., if all
> procs are local on a single machine and sm has a matching implementation
> for that MPI collective operation, it'll be used.
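
The run-time mixing described above can be pictured as a per-operation choice by priority. What follows is a rough sketch in Python, not Open MPI's actual selection code: the function name and the sets of operations per component are illustrative assumptions, while the priorities are taken from this thread (tuned 30 and basic 10 from the log, sm 99 from the command line).

```python
def select_collectives(components, required_ops):
    """Pick, per operation, the highest-priority component that implements it."""
    table = {}
    # Visit components from lowest to highest priority so that
    # higher-priority components overwrite earlier entries.
    by_priority = sorted(components.items(), key=lambda item: item[1][0])
    for name, (_priority, ops) in by_priority:
        for op in ops:
            table[op] = name
    missing = sorted(set(required_ops) - set(table))
    if missing:
        raise RuntimeError("no component implements: " + ", ".join(missing))
    return table


# Illustrative operation sets only; the real components cover different lists.
REQUIRED = ["allreduce", "barrier", "bcast", "gather", "reduce", "scatter"]
COMPONENTS = {
    "basic": (10, set(REQUIRED)),
    "tuned": (30, {"allreduce", "barrier", "bcast", "gather", "reduce"}),
    "sm": (99, {"allreduce", "barrier", "bcast", "reduce"}),
}

for op, name in sorted(select_collectives(COMPONENTS, REQUIRED).items()):
    print(f"{op:10s} -> {name}")
```

Passing only the "sm" entry to select_collectives raises the RuntimeError, which mirrors the abort seen when sm is forced as the sole coll component.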
> >>>
> >>>
> >>>
> >>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
> >>>
> >>>> Output is:
> >>>>
> >>>> [Metropolis-01:15355] hwloc:base:get_topology
> >>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root
> available cpuset
> >>>>
> >>>> ======================== JOB MAP ========================
> >>>>
> >>>> Data for node: Metropolis-01 Num procs: 2
> >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 0
> >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 1
> >>>>
> >>>> =============================================================
> >>>> [Metropolis-01:15356] locality: CL:CU:N:B
> >>>> [Metropolis-01:15356] hwloc:base: get available cpus
> >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time -
> filtering cpus
> >>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root
> available cpuset
> >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
> >>>> [Metropolis-01:15357] locality: CL:CU:N:B
> >>>> [Metropolis-01:15357] hwloc:base: get available cpus
> >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time -
> filtering cpus
> >>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root
> available cpuset
> >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
> >>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >>>>
> >>>>
> >>>> Regards,
> >>>> Juan A. Rico
> >>>> _______________________________________________
> >>>> devel mailing list
> >>>> devel_at_[hidden]
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>
> >>>
> >>> --
> >>> Jeff Squyres
> >>> jsquyres_at_[hidden]
> >>> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>
> >>>
> >>
> >>
> >
> >
>
>
>