Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] SM component init unload
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-07-03 20:05:16


Okay, please try this again with r26739 or above. You can remove the rest
of the "verbose" settings and the --display-map option to declutter the
output. Please add "-mca orte_nidmap_verbose 20" to your command line.

Thanks!
Ralph
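
For anyone following along, here is one possible reading of the requested rerun, sketched in Python. It starts from the argument list of the command quoted below, drops --display-map and every MCA parameter whose name contains "verbose", and adds orte_nidmap_verbose 20. The function name and the "contains verbose" rule are assumptions made for illustration; in particular, mca_coll_base_output is kept here, although it could arguably be dropped too.

```python
# Arguments copied from the quoted mpiexec command line (executable path omitted).
ORIGINAL_ARGS = [
    "--bind-to-core", "--bynode",
    "--mca", "mca_base_verbose", "100",
    "--mca", "mca_coll_base_output", "100",
    "--mca", "coll_sm_priority", "99",
    "-mca", "hwloc_base_verbose", "90",
    "--display-map",
    "--mca", "mca_verbose", "100",
    "--mca", "mca_base_verbose", "100",
    "--mca", "coll_base_verbose", "100",
    "-n", "2",
    "-mca", "grpcomm_base_verbose", "5",
    "./bmem",
]


def rerun_args(args):
    """Hypothetical helper: trim the old verbosity flags, add the new one."""
    kept = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-mca", "--mca"):
            name, value = args[i + 1], args[i + 2]
            # Assumption: a "verbose setting" is any MCA parameter
            # whose name contains the word "verbose".
            if "verbose" not in name:
                kept += [arg, name, value]
            i += 3
        elif arg == "--display-map":
            i += 1  # dropped, as requested
        else:
            kept.append(arg)
            i += 1
    # Put the new parameter just before the program name (last element).
    return kept[:-1] + ["-mca", "orte_nidmap_verbose", "20"] + kept[-1:]


if __name__ == "__main__":
    print(" ".join(["mpiexec"] + rerun_args(ORIGINAL_ARGS)))
```

The binding options, coll_sm_priority and the process count are left untouched, so the only intended change in behavior is which diagnostics get printed.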

On Tue, Jul 3, 2012 at 1:50 PM, Juan A. Rico <jarico_at_[hidden]> wrote:

> Here is the output.
>
> [jarico_at_Metropolis-01 examples]$
> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core
> --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca
> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca
> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2
> -mca grpcomm_base_verbose 5 ./bmem
> [Metropolis-01:24563] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24563] mca: base: components_open: opening hwloc components
> [Metropolis-01:24563] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24563] hwloc:base:get_topology
> [Metropolis-01:24563] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
> --------------------------------------------------------------------------
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
> Node: Metropolis-01
>
> This is a warning only; your job will continue, though performance may
> be degraded.
> --------------------------------------------------------------------------
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>
> ======================== JOB MAP ========================
>
> Data for node: Metropolis-01 Num procs: 2
> Process OMPI jobid: [36265,1] App: 0 Process rank: 0
> Process OMPI jobid: [36265,1] App: 0 Process rank: 1
>
> =============================================================
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,0] tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon
> nidmap
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24564] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24564] mca: base: components_open: opening hwloc components
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24565] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24565] mca: base: components_open: opening hwloc components
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm
> [Metropolis-01:24564] computing locality - getting object at level CORE,
> index 0
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering
> cpus
> [Metropolis-01:24564] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24564] computing locality - getting object at level CORE,
> index 1
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24564] computing locality - shifting up from L1CACHE
> [Metropolis-01:24564] computing locality - shifting up from L2CACHE
> [Metropolis-01:24564] computing locality - shifting up from L3CACHE
> [Metropolis-01:24564] computing locality - filling level SOCKET
> [Metropolis-01:24564] computing locality - filling level NUMA
> [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm
> [Metropolis-01:24564] mca: base: components_open: Looking for coll
> components
> [Metropolis-01:24564] mca: base: components_open: opening coll components
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> tuned
> [Metropolis-01:24564] mca: base: components_open: component tuned has no
> register function
> [Metropolis-01:24564] coll:tuned:component_open: done!
> [Metropolis-01:24564] mca: base: components_open: component tuned open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component sm
> [Metropolis-01:24564] mca: base: components_open: component sm register
> function successful
> [Metropolis-01:24564] mca: base: components_open: component sm has no open
> function
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> libnbc
> [Metropolis-01:24564] mca: base: components_open: component libnbc
> register function successful
> [Metropolis-01:24564] mca: base: components_open: component libnbc open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> hierarch
> [Metropolis-01:24564] mca: base: components_open: component hierarch has
> no register function
> [Metropolis-01:24564] mca: base: components_open: component hierarch open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> basic
> [Metropolis-01:24564] mca: base: components_open: component basic register
> function successful
> [Metropolis-01:24564] mca: base: components_open: component basic has no
> open function
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> inter
> [Metropolis-01:24564] mca: base: components_open: component inter has no
> register function
> [Metropolis-01:24564] mca: base: components_open: component inter open
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component
> self
> [Metropolis-01:24564] mca: base: components_open: component self has no
> register function
> [Metropolis-01:24564] mca: base: components_open: component self open
> function successful
> [Metropolis-01:24565] computing locality - getting object at level CORE,
> index 1
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering
> cpus
> [Metropolis-01:24565] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24565] computing locality - getting object at level CORE,
> index 0
> [Metropolis-01:24565] computing locality - shifting up from L1CACHE
> [Metropolis-01:24565] computing locality - shifting up from L2CACHE
> [Metropolis-01:24565] computing locality - shifting up from L3CACHE
> [Metropolis-01:24565] computing locality - filling level SOCKET
> [Metropolis-01:24565] computing locality - filling level NUMA
> [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4
> entries
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing
> allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted
> [Metropolis-01:24565] mca: base: components_open: Looking for coll
> components
> [Metropolis-01:24565] mca: base: components_open: opening coll components
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> tuned
> [Metropolis-01:24565] mca: base: components_open: component tuned has no
> register function
> [Metropolis-01:24565] coll:tuned:component_open: done!
> [Metropolis-01:24565] mca: base: components_open: component tuned open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component sm
> [Metropolis-01:24565] mca: base: components_open: component sm register
> function successful
> [Metropolis-01:24565] mca: base: components_open: component sm has no open
> function
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> libnbc
> [Metropolis-01:24565] mca: base: components_open: component libnbc
> register function successful
> [Metropolis-01:24565] mca: base: components_open: component libnbc open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> hierarch
> [Metropolis-01:24565] mca: base: components_open: component hierarch has
> no register function
> [Metropolis-01:24565] mca: base: components_open: component hierarch open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> basic
> [Metropolis-01:24565] mca: base: components_open: component basic register
> function successful
> [Metropolis-01:24565] mca: base: components_open: component basic has no
> open function
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> inter
> [Metropolis-01:24565] mca: base: components_open: component inter has no
> register function
> [Metropolis-01:24565] mca: base: components_open: component inter open
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component
> self
> [Metropolis-01:24565] mca: base: components_open: component self has no
> register function
> [Metropolis-01:24565] mca: base: components_open: component self open
> function successful
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4
> entries
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing
> allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 0
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0
> [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 0
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0
> [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],1]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex
> entry for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries:
> adding 4 entries for proc [[36265,1],1]
> [Metropolis-01:24564] coll:find_available: querying coll component tuned
> [Metropolis-01:24564] coll:find_available: coll component tuned is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component tuned
> [Metropolis-01:24565] coll:find_available: coll component tuned is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:sm:init_query: no other local procs;
> disqualifying myself
> [Metropolis-01:24564] coll:find_available: coll component sm is not
> available
> [Metropolis-01:24564] coll:find_available: querying coll component libnbc
> [Metropolis-01:24564] coll:find_available: coll component libnbc is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component hierarch
> [Metropolis-01:24564] coll:find_available: coll component hierarch is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component basic
> [Metropolis-01:24564] coll:find_available: coll component basic is
> available
> [Metropolis-01:24565] coll:sm:init_query: no other local procs;
> disqualifying myself
> [Metropolis-01:24565] coll:find_available: coll component sm is not
> available
> [Metropolis-01:24565] coll:find_available: querying coll component libnbc
> [Metropolis-01:24565] coll:find_available: coll component libnbc is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component hierarch
> [Metropolis-01:24565] coll:find_available: coll component hierarch is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component basic
> [Metropolis-01:24565] coll:find_available: coll component basic is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component inter
> [Metropolis-01:24564] coll:find_available: coll component inter is
> available
> [Metropolis-01:24564] coll:find_available: querying coll component self
> [Metropolis-01:24564] coll:find_available: coll component self is available
> [Metropolis-01:24565] coll:find_available: querying coll component inter
> [Metropolis-01:24565] coll:find_available: coll component inter is
> available
> [Metropolis-01:24565] coll:find_available: querying coll component self
> [Metropolis-01:24565] coll:find_available: coll component self is available
> [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 1
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 1
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1
> [Metropolis-01:24565] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component available: tuned,
> priority: 30
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component not available: self
> [Metropolis-01:24565] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24565] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component available: tuned,
> priority: 30
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component not available: self
> [Metropolis-01:24564] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component available: self,
> priority: 75
> [Metropolis-01:24564] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24564] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available:
> hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic,
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component available: self,
> priority: 75
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE -
> SENDING TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING
> COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM
> CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,1] tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing
> collective return for id 2
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing
> collective return for id 2
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
> [Metropolis-01:24565] coll:tuned:component_close: called
> [Metropolis-01:24565] coll:tuned:component_close: done!
> [Metropolis-01:24565] mca: base: close: component tuned closed
> [Metropolis-01:24565] mca: base: close: unloading component tuned
> [Metropolis-01:24565] mca: base: close: component libnbc closed
> [Metropolis-01:24565] mca: base: close: unloading component libnbc
> [Metropolis-01:24565] mca: base: close: unloading component hierarch
> [Metropolis-01:24565] mca: base: close: unloading component basic
> [Metropolis-01:24565] mca: base: close: unloading component inter
> [Metropolis-01:24565] mca: base: close: unloading component self
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
> [Metropolis-01:24564] coll:tuned:component_close: called
> [Metropolis-01:24564] coll:tuned:component_close: done!
> [Metropolis-01:24564] mca: base: close: component tuned closed
> [Metropolis-01:24564] mca: base: close: unloading component tuned
> [Metropolis-01:24564] mca: base: close: component libnbc closed
> [Metropolis-01:24564] mca: base: close: unloading component libnbc
> [Metropolis-01:24564] mca: base: close: unloading component hierarch
> [Metropolis-01:24564] mca: base: close: unloading component basic
> [Metropolis-01:24564] mca: base: close: unloading component inter
> [Metropolis-01:24564] mca: base: close: unloading component self
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,0] tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [jarico_at_Metropolis-01 examples]$
>
>
>
> On 3 Jul 2012, at 21:44, Ralph Castain wrote:
>
> > Interesting - yes, coll sm doesn't think they are on the same node for some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why.
> >
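
A note for archive readers: the diagnosis above rests on the coll:sm:init_query lines in the verbose output. Below is a minimal sketch for pulling those lines out of a saved copy of the output, assuming it was saved as plain text with or without the mail quote markers. The function name is hypothetical, and the sample lines are copied from the output quoted earlier in this message.

```python
import re

# Leading mail-quote markers such as "> " or "> >> ".
QUOTE_RE = re.compile(r"^(?:>+\s?)+")
# "[host:pid] message" prefix used by the Open MPI output in this thread.
LINE_RE = re.compile(r"^\[([^:\]]+):(\d+)\]\s+(.*)$")


def coll_sm_init_messages(log_text):
    """Return (host, pid, message) for every coll:sm:init_query line."""
    found = []
    for raw in log_text.splitlines():
        line = QUOTE_RE.sub("", raw)
        match = LINE_RE.match(line)
        if match and match.group(3).startswith("coll:sm:init_query:"):
            found.append((match.group(1), int(match.group(2)), match.group(3)))
    return found


SAMPLE = """\
> [Metropolis-01:24564] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:sm:init_query: no other local procs;
> disqualifying myself
> [Metropolis-01:24565] coll:sm:init_query: no other local procs;
> disqualifying myself
"""

if __name__ == "__main__":
    for host, pid, message in coll_sm_init_messages(SAMPLE):
        print(host, pid, message)
```

Because the mailer wrapped long lines, only the first half of each message is captured; that is enough to see which processes reported having no local peers.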
> >
> > On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
> >
> >> The code I run is a simple broadcast.
> >>
> >> When I do not specify components to run, the output is (more verbose):
> >>
> >> [jarico_at_Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem
> >> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc components
> >> [Metropolis-01:24490] mca: base: components_open: opening hwloc components
> >> [Metropolis-01:24490] mca: base: components_open: found loaded component hwloc142
> >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no register function
> >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no open function
> >> [Metropolis-01:24490] hwloc:base:get_topology
> >> [Metropolis-01:24490] hwloc:base: no cpus specified - using root available cpuset
> >>
> >> ======================== JOB MAP ========================
> >>
> >> Data for node: Metropolis-01 Num procs: 2
> >> Process OMPI jobid: [36336,1] App: 0 Process rank: 0
> >> Process OMPI jobid: [36336,1] App: 0 Process rank: 1
> >>
> >> =============================================================
> >> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc components
> >> [Metropolis-01:24491] mca: base: components_open: opening hwloc components
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component hwloc142
> >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no open function
> >> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc components
> >> [Metropolis-01:24492] mca: base: components_open: opening hwloc components
> >> [Metropolis-01:24492] mca: base: components_open: found loaded component hwloc142
> >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no register function
> >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no open function
> >> [Metropolis-01:24491] locality: CL:CU:N:B
> >> [Metropolis-01:24491] hwloc:base: get available cpus
> >> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - filtering cpus
> >> [Metropolis-01:24491] hwloc:base: no cpus specified - using root available cpuset
> >> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
> >> [Metropolis-01:24491] mca: base: components_open: Looking for coll components
> >> [Metropolis-01:24491] mca: base: components_open: opening coll components
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component tuned
> >> [Metropolis-01:24491] mca: base: components_open: component tuned has no register function
> >> [Metropolis-01:24491] coll:tuned:component_open: done!
> >> [Metropolis-01:24491] mca: base: components_open: component tuned open function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component sm
> >> [Metropolis-01:24491] mca: base: components_open: component sm register function successful
> >> [Metropolis-01:24491] mca: base: components_open: component sm has no open function
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component libnbc
> >> [Metropolis-01:24491] mca: base: components_open: component libnbc register function successful
> >> [Metropolis-01:24491] mca: base: components_open: component libnbc open function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component hierarch
> >> [Metropolis-01:24491] mca: base: components_open: component hierarch has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component hierarch open function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component basic
> >> [Metropolis-01:24491] mca: base: components_open: component basic register function successful
> >> [Metropolis-01:24491] mca: base: components_open: component basic has no open function
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component inter
> >> [Metropolis-01:24491] mca: base: components_open: component inter has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component inter open function successful
> >> [Metropolis-01:24491] mca: base: components_open: found loaded component self
> >> [Metropolis-01:24491] mca: base: components_open: component self has no register function
> >> [Metropolis-01:24491] mca: base: components_open: component self open function successful
> >> [Metropolis-01:24492] locality: CL:CU:N:B
> >> [Metropolis-01:24492] hwloc:base: get available cpus
> >> [Metropolis-01:24492] hwloc:base:get_available_cpus first time -
> filtering cpus
> >> [Metropolis-01:24492] hwloc:base: no cpus specified - using root
> available cpuset
> >> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
> >> [Metropolis-01:24492] mca: base: components_open: Looking for coll
> components
> >> [Metropolis-01:24492] mca: base: components_open: opening coll
> components
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component tuned
> >> [Metropolis-01:24492] mca: base: components_open: component tuned has
> no register function
> >> [Metropolis-01:24492] coll:tuned:component_open: done!
> >> [Metropolis-01:24492] mca: base: components_open: component tuned open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component sm
> >> [Metropolis-01:24492] mca: base: components_open: component sm register
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: component sm has no
> open function
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component libnbc
> >> [Metropolis-01:24492] mca: base: components_open: component libnbc
> register function successful
> >> [Metropolis-01:24492] mca: base: components_open: component libnbc open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component hierarch
> >> [Metropolis-01:24492] mca: base: components_open: component hierarch
> has no register function
> >> [Metropolis-01:24492] mca: base: components_open: component hierarch
> open function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component basic
> >> [Metropolis-01:24492] mca: base: components_open: component basic
> register function successful
> >> [Metropolis-01:24492] mca: base: components_open: component basic has
> no open function
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component inter
> >> [Metropolis-01:24492] mca: base: components_open: component inter has
> no register function
> >> [Metropolis-01:24492] mca: base: components_open: component inter open
> function successful
> >> [Metropolis-01:24492] mca: base: components_open: found loaded
> component self
> >> [Metropolis-01:24492] mca: base: components_open: component self has no
> register function
> >> [Metropolis-01:24492] mca: base: components_open: component self open
> function successful
> >> [Metropolis-01:24491] coll:find_available: querying coll component tuned
> >> [Metropolis-01:24491] coll:find_available: coll component tuned is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component sm
> >> [Metropolis-01:24491] coll:sm:init_query: no other local procs;
> disqualifying myself
> >> [Metropolis-01:24491] coll:find_available: coll component sm is not
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component
> libnbc
> >> [Metropolis-01:24491] coll:find_available: coll component libnbc is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component
> hierarch
> >> [Metropolis-01:24491] coll:find_available: coll component hierarch is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component basic
> >> [Metropolis-01:24491] coll:find_available: coll component basic is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component inter
> >> [Metropolis-01:24492] coll:find_available: querying coll component tuned
> >> [Metropolis-01:24492] coll:find_available: coll component tuned is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component sm
> >> [Metropolis-01:24492] coll:sm:init_query: no other local procs;
> disqualifying myself
> >> [Metropolis-01:24492] coll:find_available: coll component sm is not
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component
> libnbc
> >> [Metropolis-01:24492] coll:find_available: coll component libnbc is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component
> hierarch
> >> [Metropolis-01:24492] coll:find_available: coll component hierarch is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component basic
> >> [Metropolis-01:24492] coll:find_available: coll component basic is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component inter
> >> [Metropolis-01:24492] coll:find_available: coll component inter is
> available
> >> [Metropolis-01:24492] coll:find_available: querying coll component self
> >> [Metropolis-01:24492] coll:find_available: coll component self is
> available
> >> [Metropolis-01:24491] coll:find_available: coll component inter is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component self
> >> [Metropolis-01:24491] coll:find_available: coll component self is
> available
> >> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >> [Metropolis-01:24491] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> >> [Metropolis-01:24491] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24491] coll:tuned:module_tuned query called
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> tuned, priority: 30
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> self
> >> [Metropolis-01:24491] coll:tuned:module_init called.
> >> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
> >> [Metropolis-01:24491] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> >> [Metropolis-01:24491] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24491] coll:tuned:module_tuned query called
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> tuned
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24491] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24491] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24491] coll:base:comm_select: component available: self,
> priority: 75
> >> [Metropolis-01:24492] coll:base:comm_select: new communicator:
> MPI_COMM_WORLD (cid 0)
> >> [Metropolis-01:24492] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24492] coll:tuned:module_tuned query called
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> tuned, priority: 30
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> self
> >> [Metropolis-01:24492] coll:tuned:module_init called.
> >> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
> >> [Metropolis-01:24492] coll:base:comm_select: new communicator:
> MPI_COMM_SELF (cid 1)
> >> [Metropolis-01:24492] coll:base:comm_select: Checking all available
> modules
> >> [Metropolis-01:24492] coll:tuned:module_tuned query called
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> tuned
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> libnbc, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> hierarch
> >> [Metropolis-01:24492] coll:base:comm_select: component available:
> basic, priority: 10
> >> [Metropolis-01:24492] coll:base:comm_select: component not available:
> inter
> >> [Metropolis-01:24492] coll:base:comm_select: component available: self,
> priority: 75
> >> [Metropolis-01:24491] coll:tuned:component_close: called
> >> [Metropolis-01:24491] coll:tuned:component_close: done!
> >> [Metropolis-01:24492] coll:tuned:component_close: called
> >> [Metropolis-01:24492] coll:tuned:component_close: done!
> >> [Metropolis-01:24492] mca: base: close: component tuned closed
> >> [Metropolis-01:24492] mca: base: close: unloading component tuned
> >> [Metropolis-01:24492] mca: base: close: component libnbc closed
> >> [Metropolis-01:24492] mca: base: close: unloading component libnbc
> >> [Metropolis-01:24492] mca: base: close: unloading component hierarch
> >> [Metropolis-01:24492] mca: base: close: unloading component basic
> >> [Metropolis-01:24492] mca: base: close: unloading component inter
> >> [Metropolis-01:24492] mca: base: close: unloading component self
> >> [Metropolis-01:24491] mca: base: close: component tuned closed
> >> [Metropolis-01:24491] mca: base: close: unloading component tuned
> >> [Metropolis-01:24491] mca: base: close: component libnbc closed
> >> [Metropolis-01:24491] mca: base: close: unloading component libnbc
> >> [Metropolis-01:24491] mca: base: close: unloading component hierarch
> >> [Metropolis-01:24491] mca: base: close: unloading component basic
> >> [Metropolis-01:24491] mca: base: close: unloading component inter
> >> [Metropolis-01:24491] mca: base: close: unloading component self
> >> [jarico_at_Metropolis-01 examples]$
> >>
> >>
> >> SM is not loaded because it detects no other processes on the same
> machine:
> >>
> >> [Metropolis-01:24491] coll:sm:init_query: no other local procs;
> disqualifying myself
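
A minimal sketch for pulling this information out of a long verbose log, assuming the output is saved as quoted, wrapped text like the lines in this thread. It reports, per process ID, which coll components were reported available and why a component disqualified itself. The helper names (unwrap, summarize) are hypothetical; only the two message formats are taken from the output above.

```python
import re
from collections import defaultdict

# Message formats copied from the verbose output quoted in this thread.
AVAILABILITY = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\] coll:find_available: "
    r"coll component (?P<name>\w+) is (?P<neg>not )?available"
)
DISQUALIFIED = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\] coll:(?P<name>\w+):init_query: "
    r"(?P<reason>[^\[\]]*?); disqualifying myself"
)


def unwrap(text):
    """Strip mail quote markers and rejoin log lines wrapped by the mailer."""
    stripped = [re.sub(r"^[>\s]+", "", line) for line in text.splitlines()]
    return " ".join(part for part in stripped if part)


def summarize(text):
    """Return {pid: {"available": [names], "unavailable": {name: reason}}}."""
    log = unwrap(text)
    reasons = {
        (m["pid"], m["name"]): m["reason"] for m in DISQUALIFIED.finditer(log)
    }
    report = defaultdict(lambda: {"available": [], "unavailable": {}})
    for m in AVAILABILITY.finditer(log):
        entry = report[m["pid"]]
        if m["neg"]:
            # Fall back to "unknown" when no init_query reason was logged.
            reason = reasons.get((m["pid"], m["name"]), "unknown")
            entry["unavailable"][m["name"]] = reason
        else:
            entry["available"].append(m["name"])
    return dict(report)


sample = """\
> >> [Metropolis-01:24491] coll:find_available: querying coll component tuned
> >> [Metropolis-01:24491] coll:find_available: coll component tuned is
> available
> >> [Metropolis-01:24491] coll:find_available: querying coll component sm
> >> [Metropolis-01:24491] coll:sm:init_query: no other local procs;
> disqualifying myself
> >> [Metropolis-01:24491] coll:find_available: coll component sm is not
> available
"""
report = summarize(sample)
print(report)
```

Run against the full output, this makes it easy to confirm that both processes (24491 and 24492) reject sm for the same reason.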
> >>
> >> The machine is a multicore machine with 8 cores.
> >>
> >> I need to run the SM component code, and I suppose that, with its priority
> raised, it will be the component selected once this problem is solved.
> >>
> >>
> >>
> >> On 03/07/2012, at 21:01, Jeff Squyres wrote:
> >>
> >>> The issue is that the "sm" coll component only implements a few of the
> MPI collective operations. It is usually mixed at run-time with other coll
> components to fill out the rest of the MPI collective operations.
> >>>
> >>> So what is happening is that OMPI is determining that it doesn't have
> implementations of all the MPI collective operations and aborting.
> >>>
> >>> You shouldn't need to manually select your coll module -- OMPI should
> automatically select the right collective module for you. E.g., if all
> procs are local on a single machine and sm has a matching implementation
> for that MPI collective operation, it'll be used.
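
The mixing described above can be pictured with a small model. This is only a sketch under stated assumptions, not Open MPI's actual selection code: each available component contributes the operations it implements, and for each operation the highest-priority provider wins. The priorities for tuned and basic are the ones printed by comm_select in this thread, and 99 is the coll_sm_priority value from the mpiexec command line; the operation sets in IMPLEMENTS are hypothetical and chosen purely for illustration.

```python
# Priorities: tuned and basic as printed by coll:base:comm_select in this
# thread; sm uses the coll_sm_priority value given on the command line.
PRIORITY = {"sm": 99, "tuned": 30, "basic": 10}

# ASSUMPTION: illustrative operation sets, not the real component tables.
IMPLEMENTS = {
    "sm": {"barrier", "bcast", "reduce", "allreduce"},
    "tuned": {"barrier", "bcast", "reduce", "allreduce", "alltoall", "gather"},
    "basic": {"barrier", "bcast", "reduce", "allreduce", "alltoall",
              "gather", "scan"},
}
ALL_OPS = set().union(*IMPLEMENTS.values())


def select(available):
    """Map each operation to the highest-priority available provider."""
    table = {}
    # Walk from lowest to highest priority so later entries overwrite earlier.
    for name in sorted(available, key=lambda n: PRIORITY[n]):
        for op in IMPLEMENTS[name]:
            table[op] = name
    return table


mixed = select({"sm", "tuned", "basic"})
sm_only = select({"sm"})
missing = ALL_OPS - set(sm_only)

print(mixed["bcast"], mixed["alltoall"], mixed["scan"])
print(sorted(missing))
```

In this model, selecting sm alone leaves some operations without a provider, which corresponds to the abort described above, while leaving the other components enabled lets sm take over only the operations it implements.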
> >>>
> >>>
> >>>
> >>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
> >>>
> >>>> Output is:
> >>>>
> >>>> [Metropolis-01:15355] hwloc:base:get_topology
> >>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root
> available cpuset
> >>>>
> >>>> ======================== JOB MAP ========================
> >>>>
> >>>> Data for node: Metropolis-01 Num procs: 2
> >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 0
> >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 1
> >>>>
> >>>> =============================================================
> >>>> [Metropolis-01:15356] locality: CL:CU:N:B
> >>>> [Metropolis-01:15356] hwloc:base: get available cpus
> >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time -
> filtering cpus
> >>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root
> available cpuset
> >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
> >>>> [Metropolis-01:15357] locality: CL:CU:N:B
> >>>> [Metropolis-01:15357] hwloc:base: get available cpus
> >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time -
> filtering cpus
> >>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root
> available cpuset
> >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
> >>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of
> NUMANode:0
> >>>>
> >>>>
> >>>> Regards,
> >>>> Juan A. Rico
> >>>> _______________________________________________
> >>>> devel mailing list
> >>>> devel_at_[hidden]
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>
> >>>
> >>> --
> >>> Jeff Squyres
> >>> jsquyres_at_[hidden]
> >>> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>
>