Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] SM component init unload
From: Juan A. Rico (jarico_at_[hidden])
Date: 2012-07-03 15:50:05


Here is the output.

[jarico_at_Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 -mca grpcomm_base_verbose 5 ./bmem
[Metropolis-01:24563] mca: base: components_open: Looking for hwloc components
[Metropolis-01:24563] mca: base: components_open: opening hwloc components
[Metropolis-01:24563] mca: base: components_open: found loaded component hwloc142
[Metropolis-01:24563] mca: base: components_open: component hwloc142 has no register function
[Metropolis-01:24563] mca: base: components_open: component hwloc142 has no open function
[Metropolis-01:24563] hwloc:base:get_topology
[Metropolis-01:24563] hwloc:base: no cpus specified - using root available cpuset
[Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
[Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node: Metropolis-01

This is a warning only; your job will continue, though performance may
be degraded.
--------------------------------------------------------------------------
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done

 ======================== JOB MAP ========================

 Data for node: Metropolis-01 Num procs: 2
         Process OMPI jobid: [36265,1] App: 0 Process rank: 0
         Process OMPI jobid: [36265,1] App: 0 Process rank: 1

 =============================================================
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] tag 1
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon nidmap
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24564] mca: base: components_open: Looking for hwloc components
[Metropolis-01:24564] mca: base: components_open: opening hwloc components
[Metropolis-01:24564] mca: base: components_open: found loaded component hwloc142
[Metropolis-01:24564] mca: base: components_open: component hwloc142 has no register function
[Metropolis-01:24564] mca: base: components_open: component hwloc142 has no open function
[Metropolis-01:24565] mca: base: components_open: Looking for hwloc components
[Metropolis-01:24565] mca: base: components_open: opening hwloc components
[Metropolis-01:24565] mca: base: components_open: found loaded component hwloc142
[Metropolis-01:24565] mca: base: components_open: component hwloc142 has no register function
[Metropolis-01:24565] mca: base: components_open: component hwloc142 has no open function
[Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad]
[Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad]
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm
[Metropolis-01:24564] computing locality - getting object at level CORE, index 0
[Metropolis-01:24564] hwloc:base: get available cpus
[Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering cpus
[Metropolis-01:24564] hwloc:base: no cpus specified - using root available cpuset
[Metropolis-01:24564] computing locality - getting object at level CORE, index 1
[Metropolis-01:24564] hwloc:base: get available cpus
[Metropolis-01:24564] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24564] computing locality - shifting up from L1CACHE
[Metropolis-01:24564] computing locality - shifting up from L2CACHE
[Metropolis-01:24564] computing locality - shifting up from L3CACHE
[Metropolis-01:24564] computing locality - filling level SOCKET
[Metropolis-01:24564] computing locality - filling level NUMA
[Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
[Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad]
[Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad]
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm
[Metropolis-01:24564] mca: base: components_open: Looking for coll components
[Metropolis-01:24564] mca: base: components_open: opening coll components
[Metropolis-01:24564] mca: base: components_open: found loaded component tuned
[Metropolis-01:24564] mca: base: components_open: component tuned has no register function
[Metropolis-01:24564] coll:tuned:component_open: done!
[Metropolis-01:24564] mca: base: components_open: component tuned open function successful
[Metropolis-01:24564] mca: base: components_open: found loaded component sm
[Metropolis-01:24564] mca: base: components_open: component sm register function successful
[Metropolis-01:24564] mca: base: components_open: component sm has no open function
[Metropolis-01:24564] mca: base: components_open: found loaded component libnbc
[Metropolis-01:24564] mca: base: components_open: component libnbc register function successful
[Metropolis-01:24564] mca: base: components_open: component libnbc open function successful
[Metropolis-01:24564] mca: base: components_open: found loaded component hierarch
[Metropolis-01:24564] mca: base: components_open: component hierarch has no register function
[Metropolis-01:24564] mca: base: components_open: component hierarch open function successful
[Metropolis-01:24564] mca: base: components_open: found loaded component basic
[Metropolis-01:24564] mca: base: components_open: component basic register function successful
[Metropolis-01:24564] mca: base: components_open: component basic has no open function
[Metropolis-01:24564] mca: base: components_open: found loaded component inter
[Metropolis-01:24564] mca: base: components_open: component inter has no register function
[Metropolis-01:24564] mca: base: components_open: component inter open function successful
[Metropolis-01:24564] mca: base: components_open: found loaded component self
[Metropolis-01:24564] mca: base: components_open: component self has no register function
[Metropolis-01:24564] mca: base: components_open: component self open function successful
[Metropolis-01:24565] computing locality - getting object at level CORE, index 1
[Metropolis-01:24565] hwloc:base: get available cpus
[Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering cpus
[Metropolis-01:24565] hwloc:base: no cpus specified - using root available cpuset
[Metropolis-01:24565] hwloc:base: get available cpus
[Metropolis-01:24565] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24565] computing locality - getting object at level CORE, index 0
[Metropolis-01:24565] computing locality - shifting up from L1CACHE
[Metropolis-01:24565] computing locality - shifting up from L2CACHE
[Metropolis-01:24565] computing locality - shifting up from L3CACHE
[Metropolis-01:24565] computing locality - filling level SOCKET
[Metropolis-01:24565] computing locality - filling level NUMA
[Metropolis-01:24565] locality: CL:CU:N:B:Nu:S
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
[Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4 entries
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing allgather
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted
[Metropolis-01:24565] mca: base: components_open: Looking for coll components
[Metropolis-01:24565] mca: base: components_open: opening coll components
[Metropolis-01:24565] mca: base: components_open: found loaded component tuned
[Metropolis-01:24565] mca: base: components_open: component tuned has no register function
[Metropolis-01:24565] coll:tuned:component_open: done!
[Metropolis-01:24565] mca: base: components_open: component tuned open function successful
[Metropolis-01:24565] mca: base: components_open: found loaded component sm
[Metropolis-01:24565] mca: base: components_open: component sm register function successful
[Metropolis-01:24565] mca: base: components_open: component sm has no open function
[Metropolis-01:24565] mca: base: components_open: found loaded component libnbc
[Metropolis-01:24565] mca: base: components_open: component libnbc register function successful
[Metropolis-01:24565] mca: base: components_open: component libnbc open function successful
[Metropolis-01:24565] mca: base: components_open: found loaded component hierarch
[Metropolis-01:24565] mca: base: components_open: component hierarch has no register function
[Metropolis-01:24565] mca: base: components_open: component hierarch open function successful
[Metropolis-01:24565] mca: base: components_open: found loaded component basic
[Metropolis-01:24565] mca: base: components_open: component basic register function successful
[Metropolis-01:24565] mca: base: components_open: component basic has no open function
[Metropolis-01:24565] mca: base: components_open: found loaded component inter
[Metropolis-01:24565] mca: base: components_open: component inter has no register function
[Metropolis-01:24565] mca: base: components_open: component inter open function successful
[Metropolis-01:24565] mca: base: components_open: found loaded component self
[Metropolis-01:24565] mca: base: components_open: component self has no register function
[Metropolis-01:24565] mca: base: components_open: component self open function successful
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0]
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 0
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4 entries
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing allgather
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 0
[Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0
[Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex entry for proc [[36265,1],0]
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 0
[Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0
[Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex entry for proc [[36265,1],0]
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],0]
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex entry for proc [[36265,1],1]
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],1]
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],0]
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex entry for proc [[36265,1],1]
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],1]
[Metropolis-01:24564] coll:find_available: querying coll component tuned
[Metropolis-01:24564] coll:find_available: coll component tuned is available
[Metropolis-01:24565] coll:find_available: querying coll component tuned
[Metropolis-01:24565] coll:find_available: coll component tuned is available
[Metropolis-01:24565] coll:find_available: querying coll component sm
[Metropolis-01:24564] coll:find_available: querying coll component sm
[Metropolis-01:24564] coll:sm:init_query: no other local procs; disqualifying myself
[Metropolis-01:24564] coll:find_available: coll component sm is not available
[Metropolis-01:24564] coll:find_available: querying coll component libnbc
[Metropolis-01:24564] coll:find_available: coll component libnbc is available
[Metropolis-01:24564] coll:find_available: querying coll component hierarch
[Metropolis-01:24564] coll:find_available: coll component hierarch is available
[Metropolis-01:24564] coll:find_available: querying coll component basic
[Metropolis-01:24564] coll:find_available: coll component basic is available
[Metropolis-01:24565] coll:sm:init_query: no other local procs; disqualifying myself
[Metropolis-01:24565] coll:find_available: coll component sm is not available
[Metropolis-01:24565] coll:find_available: querying coll component libnbc
[Metropolis-01:24565] coll:find_available: coll component libnbc is available
[Metropolis-01:24565] coll:find_available: querying coll component hierarch
[Metropolis-01:24565] coll:find_available: coll component hierarch is available
[Metropolis-01:24565] coll:find_available: querying coll component basic
[Metropolis-01:24565] coll:find_available: coll component basic is available
[Metropolis-01:24564] coll:find_available: querying coll component inter
[Metropolis-01:24564] coll:find_available: coll component inter is available
[Metropolis-01:24564] coll:find_available: querying coll component self
[Metropolis-01:24564] coll:find_available: coll component self is available
[Metropolis-01:24565] coll:find_available: querying coll component inter
[Metropolis-01:24565] coll:find_available: coll component inter is available
[Metropolis-01:24565] coll:find_available: querying coll component self
[Metropolis-01:24565] coll:find_available: coll component self is available
[Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
[Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
[Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0]
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 1
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 1
[Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 1
[Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1
[Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
[Metropolis-01:24565] coll:base:comm_select: Checking all available modules
[Metropolis-01:24565] coll:tuned:module_tuned query called
[Metropolis-01:24565] coll:base:comm_select: component available: tuned, priority: 30
[Metropolis-01:24565] coll:base:comm_select: component available: libnbc, priority: 10
[Metropolis-01:24565] coll:base:comm_select: component not available: hierarch
[Metropolis-01:24565] coll:base:comm_select: component available: basic, priority: 10
[Metropolis-01:24565] coll:base:comm_select: component not available: inter
[Metropolis-01:24565] coll:base:comm_select: component not available: self
[Metropolis-01:24565] coll:tuned:module_init called.
[Metropolis-01:24565] coll:tuned:module_init Tuned is in use
[Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
[Metropolis-01:24565] coll:base:comm_select: Checking all available modules
[Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
[Metropolis-01:24564] coll:base:comm_select: Checking all available modules
[Metropolis-01:24564] coll:tuned:module_tuned query called
[Metropolis-01:24564] coll:base:comm_select: component available: tuned, priority: 30
[Metropolis-01:24564] coll:base:comm_select: component available: libnbc, priority: 10
[Metropolis-01:24564] coll:base:comm_select: component not available: hierarch
[Metropolis-01:24564] coll:base:comm_select: component available: basic, priority: 10
[Metropolis-01:24564] coll:base:comm_select: component not available: inter
[Metropolis-01:24564] coll:base:comm_select: component not available: self
[Metropolis-01:24564] coll:tuned:module_init called.
[Metropolis-01:24565] coll:tuned:module_tuned query called
[Metropolis-01:24565] coll:base:comm_select: component not available: tuned
[Metropolis-01:24565] coll:base:comm_select: component available: libnbc, priority: 10
[Metropolis-01:24565] coll:base:comm_select: component not available: hierarch
[Metropolis-01:24565] coll:base:comm_select: component available: basic, priority: 10
[Metropolis-01:24565] coll:base:comm_select: component not available: inter
[Metropolis-01:24565] coll:base:comm_select: component available: self, priority: 75
[Metropolis-01:24564] coll:tuned:module_init Tuned is in use
[Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
[Metropolis-01:24564] coll:base:comm_select: Checking all available modules
[Metropolis-01:24564] coll:tuned:module_tuned query called
[Metropolis-01:24564] coll:base:comm_select: component not available: tuned
[Metropolis-01:24564] coll:base:comm_select: component available: libnbc, priority: 10
[Metropolis-01:24564] coll:base:comm_select: component not available: hierarch
[Metropolis-01:24564] coll:base:comm_select: component available: basic, priority: 10
[Metropolis-01:24564] coll:base:comm_select: component not available: inter
[Metropolis-01:24564] coll:base:comm_select: component available: self, priority: 75
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
[Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
[Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
[Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
[Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0]
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 2
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 2
[Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 2
[Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
[Metropolis-01:24565] coll:tuned:component_close: called
[Metropolis-01:24565] coll:tuned:component_close: done!
[Metropolis-01:24565] mca: base: close: component tuned closed
[Metropolis-01:24565] mca: base: close: unloading component tuned
[Metropolis-01:24565] mca: base: close: component libnbc closed
[Metropolis-01:24565] mca: base: close: unloading component libnbc
[Metropolis-01:24565] mca: base: close: unloading component hierarch
[Metropolis-01:24565] mca: base: close: unloading component basic
[Metropolis-01:24565] mca: base: close: unloading component inter
[Metropolis-01:24565] mca: base: close: unloading component self
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
[Metropolis-01:24564] coll:tuned:component_close: called
[Metropolis-01:24564] coll:tuned:component_close: done!
[Metropolis-01:24564] mca: base: close: component tuned closed
[Metropolis-01:24564] mca: base: close: unloading component tuned
[Metropolis-01:24564] mca: base: close: component libnbc closed
[Metropolis-01:24564] mca: base: close: unloading component libnbc
[Metropolis-01:24564] mca: base: close: unloading component hierarch
[Metropolis-01:24564] mca: base: close: unloading component basic
[Metropolis-01:24564] mca: base: close: unloading component inter
[Metropolis-01:24564] mca: base: close: unloading component self
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] tag 1
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[jarico_at_Metropolis-01 examples]$
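
A log this long is easier to compare across runs when it is reduced to the per-process facts that matter here: the locality string each rank computed, and which coll components `coll:find_available` accepted or rejected. The following is only a minimal sketch of such a filter, not an Open MPI tool. The function name `summarize` and both regular expressions are assumptions derived from the line shapes visible in the output above; other Open MPI versions may print these messages differently.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the output above:
#   [Metropolis-01:24564] coll:find_available: coll component sm is not available
AVAILABLE_RE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] coll:find_available: "
    r"coll component (?P<name>\S+) is (?P<neg>not )?available$"
)

# Assumed line shape, taken from the output above:
#   [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
LOCALITY_RE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] locality: (?P<flags>\S+)$"
)


def summarize(log_text):
    """Return {pid: {"locality": str or None, "available": [...], "unavailable": [...]}}."""
    summary = defaultdict(
        lambda: {"locality": None, "available": [], "unavailable": []}
    )
    for raw in log_text.splitlines():
        # Drop mail quote markers such as "> " or ">> " so quoted logs parse too.
        line = re.sub(r"^(?:>\s*)+", "", raw.strip())

        m = LOCALITY_RE.match(line)
        if m:
            summary[int(m.group("pid"))]["locality"] = m.group("flags")
            continue

        m = AVAILABLE_RE.match(line)
        if m:
            key = "unavailable" if m.group("neg") else "available"
            summary[int(m.group("pid"))][key].append(m.group("name"))
    return dict(summary)


if __name__ == "__main__":
    # A few lines copied from the output above, used as sample input.
    sample = "\n".join([
        "[Metropolis-01:24564] locality: CL:CU:N:B:Nu:S",
        "[Metropolis-01:24564] coll:find_available: coll component tuned is available",
        "[Metropolis-01:24564] coll:sm:init_query: no other local procs; disqualifying myself",
        "[Metropolis-01:24564] coll:find_available: coll component sm is not available",
    ])
    for pid, info in sorted(summarize(sample).items()):
        print(pid, info["locality"], info["available"], info["unavailable"])
```

Running the same filter over the bound run and the unbound run quoted below would put the two locality strings (`CL:CU:N:B:Nu:S` versus `CL:CU:N:B`) next to the fact that `sm` is rejected in both cases.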

On 03/07/2012, at 21:44, Ralph Castain wrote:

> Interesting. Yes, coll sm doesn't think the processes are on the same node for some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why.
>
>
> On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
>
>> The code I run is a simple broadcast.
>>
>> When I do not specify components to run, the output is (more verbose):
>>
>> [jarico_at_Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem
>> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc components
>> [Metropolis-01:24490] mca: base: components_open: opening hwloc components
>> [Metropolis-01:24490] mca: base: components_open: found loaded component hwloc142
>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no register function
>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no open function
>> [Metropolis-01:24490] hwloc:base:get_topology
>> [Metropolis-01:24490] hwloc:base: no cpus specified - using root available cpuset
>>
>> ======================== JOB MAP ========================
>>
>> Data for node: Metropolis-01 Num procs: 2
>> Process OMPI jobid: [36336,1] App: 0 Process rank: 0
>> Process OMPI jobid: [36336,1] App: 0 Process rank: 1
>>
>> =============================================================
>> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc components
>> [Metropolis-01:24491] mca: base: components_open: opening hwloc components
>> [Metropolis-01:24491] mca: base: components_open: found loaded component hwloc142
>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no register function
>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no open function
>> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc components
>> [Metropolis-01:24492] mca: base: components_open: opening hwloc components
>> [Metropolis-01:24492] mca: base: components_open: found loaded component hwloc142
>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no register function
>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no open function
>> [Metropolis-01:24491] locality: CL:CU:N:B
>> [Metropolis-01:24491] hwloc:base: get available cpus
>> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - filtering cpus
>> [Metropolis-01:24491] hwloc:base: no cpus specified - using root available cpuset
>> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
>> [Metropolis-01:24491] mca: base: components_open: Looking for coll components
>> [Metropolis-01:24491] mca: base: components_open: opening coll components
>> [Metropolis-01:24491] mca: base: components_open: found loaded component tuned
>> [Metropolis-01:24491] mca: base: components_open: component tuned has no register function
>> [Metropolis-01:24491] coll:tuned:component_open: done!
>> [Metropolis-01:24491] mca: base: components_open: component tuned open function successful
>> [Metropolis-01:24491] mca: base: components_open: found loaded component sm
>> [Metropolis-01:24491] mca: base: components_open: component sm register function successful
>> [Metropolis-01:24491] mca: base: components_open: component sm has no open function
>> [Metropolis-01:24491] mca: base: components_open: found loaded component libnbc
>> [Metropolis-01:24491] mca: base: components_open: component libnbc register function successful
>> [Metropolis-01:24491] mca: base: components_open: component libnbc open function successful
>> [Metropolis-01:24491] mca: base: components_open: found loaded component hierarch
>> [Metropolis-01:24491] mca: base: components_open: component hierarch has no register function
>> [Metropolis-01:24491] mca: base: components_open: component hierarch open function successful
>> [Metropolis-01:24491] mca: base: components_open: found loaded component basic
>> [Metropolis-01:24491] mca: base: components_open: component basic register function successful
>> [Metropolis-01:24491] mca: base: components_open: component basic has no open function
>> [Metropolis-01:24491] mca: base: components_open: found loaded component inter
>> [Metropolis-01:24491] mca: base: components_open: component inter has no register function
>> [Metropolis-01:24491] mca: base: components_open: component inter open function successful
>> [Metropolis-01:24491] mca: base: components_open: found loaded component self
>> [Metropolis-01:24491] mca: base: components_open: component self has no register function
>> [Metropolis-01:24491] mca: base: components_open: component self open function successful
>> [Metropolis-01:24492] locality: CL:CU:N:B
>> [Metropolis-01:24492] hwloc:base: get available cpus
>> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - filtering cpus
>> [Metropolis-01:24492] hwloc:base: no cpus specified - using root available cpuset
>> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
>> [Metropolis-01:24492] mca: base: components_open: Looking for coll components
>> [Metropolis-01:24492] mca: base: components_open: opening coll components
>> [Metropolis-01:24492] mca: base: components_open: found loaded component tuned
>> [Metropolis-01:24492] mca: base: components_open: component tuned has no register function
>> [Metropolis-01:24492] coll:tuned:component_open: done!
>> [Metropolis-01:24492] mca: base: components_open: component tuned open function successful
>> [Metropolis-01:24492] mca: base: components_open: found loaded component sm
>> [Metropolis-01:24492] mca: base: components_open: component sm register function successful
>> [Metropolis-01:24492] mca: base: components_open: component sm has no open function
>> [Metropolis-01:24492] mca: base: components_open: found loaded component libnbc
>> [Metropolis-01:24492] mca: base: components_open: component libnbc register function successful
>> [Metropolis-01:24492] mca: base: components_open: component libnbc open function successful
>> [Metropolis-01:24492] mca: base: components_open: found loaded component hierarch
>> [Metropolis-01:24492] mca: base: components_open: component hierarch has no register function
>> [Metropolis-01:24492] mca: base: components_open: component hierarch open function successful
>> [Metropolis-01:24492] mca: base: components_open: found loaded component basic
>> [Metropolis-01:24492] mca: base: components_open: component basic register function successful
>> [Metropolis-01:24492] mca: base: components_open: component basic has no open function
>> [Metropolis-01:24492] mca: base: components_open: found loaded component inter
>> [Metropolis-01:24492] mca: base: components_open: component inter has no register function
>> [Metropolis-01:24492] mca: base: components_open: component inter open function successful
>> [Metropolis-01:24492] mca: base: components_open: found loaded component self
>> [Metropolis-01:24492] mca: base: components_open: component self has no register function
>> [Metropolis-01:24492] mca: base: components_open: component self open function successful
>> [Metropolis-01:24491] coll:find_available: querying coll component tuned
>> [Metropolis-01:24491] coll:find_available: coll component tuned is available
>> [Metropolis-01:24491] coll:find_available: querying coll component sm
>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
>> [Metropolis-01:24491] coll:find_available: coll component sm is not available
>> [Metropolis-01:24491] coll:find_available: querying coll component libnbc
>> [Metropolis-01:24491] coll:find_available: coll component libnbc is available
>> [Metropolis-01:24491] coll:find_available: querying coll component hierarch
>> [Metropolis-01:24491] coll:find_available: coll component hierarch is available
>> [Metropolis-01:24491] coll:find_available: querying coll component basic
>> [Metropolis-01:24491] coll:find_available: coll component basic is available
>> [Metropolis-01:24491] coll:find_available: querying coll component inter
>> [Metropolis-01:24492] coll:find_available: querying coll component tuned
>> [Metropolis-01:24492] coll:find_available: coll component tuned is available
>> [Metropolis-01:24492] coll:find_available: querying coll component sm
>> [Metropolis-01:24492] coll:sm:init_query: no other local procs; disqualifying myself
>> [Metropolis-01:24492] coll:find_available: coll component sm is not available
>> [Metropolis-01:24492] coll:find_available: querying coll component libnbc
>> [Metropolis-01:24492] coll:find_available: coll component libnbc is available
>> [Metropolis-01:24492] coll:find_available: querying coll component hierarch
>> [Metropolis-01:24492] coll:find_available: coll component hierarch is available
>> [Metropolis-01:24492] coll:find_available: querying coll component basic
>> [Metropolis-01:24492] coll:find_available: coll component basic is available
>> [Metropolis-01:24492] coll:find_available: querying coll component inter
>> [Metropolis-01:24492] coll:find_available: coll component inter is available
>> [Metropolis-01:24492] coll:find_available: querying coll component self
>> [Metropolis-01:24492] coll:find_available: coll component self is available
>> [Metropolis-01:24491] coll:find_available: coll component inter is available
>> [Metropolis-01:24491] coll:find_available: querying coll component self
>> [Metropolis-01:24491] coll:find_available: coll component self is available
>> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>> [Metropolis-01:24491] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
>> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>> [Metropolis-01:24491] coll:base:comm_select: component available: tuned, priority: 30
>> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, priority: 10
>> [Metropolis-01:24491] coll:base:comm_select: component not available: hierarch
>> [Metropolis-01:24491] coll:base:comm_select: component available: basic, priority: 10
>> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
>> [Metropolis-01:24491] coll:base:comm_select: component not available: self
>> [Metropolis-01:24491] coll:tuned:module_init called.
>> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
>> [Metropolis-01:24491] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
>> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>> [Metropolis-01:24491] coll:base:comm_select: component not available: tuned
>> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, priority: 10
>> [Metropolis-01:24491] coll:base:comm_select: component not available: hierarch
>> [Metropolis-01:24491] coll:base:comm_select: component available: basic, priority: 10
>> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
>> [Metropolis-01:24491] coll:base:comm_select: component available: self, priority: 75
>> [Metropolis-01:24492] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
>> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>> [Metropolis-01:24492] coll:base:comm_select: component available: tuned, priority: 30
>> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: hierarch
>> [Metropolis-01:24492] coll:base:comm_select: component available: basic, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
>> [Metropolis-01:24492] coll:base:comm_select: component not available: self
>> [Metropolis-01:24492] coll:tuned:module_init called.
>> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
>> [Metropolis-01:24492] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
>> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>> [Metropolis-01:24492] coll:base:comm_select: component not available: tuned
>> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: hierarch
>> [Metropolis-01:24492] coll:base:comm_select: component available: basic, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
>> [Metropolis-01:24492] coll:base:comm_select: component available: self, priority: 75
>> [Metropolis-01:24491] coll:tuned:component_close: called
>> [Metropolis-01:24491] coll:tuned:component_close: done!
>> [Metropolis-01:24492] coll:tuned:component_close: called
>> [Metropolis-01:24492] coll:tuned:component_close: done!
>> [Metropolis-01:24492] mca: base: close: component tuned closed
>> [Metropolis-01:24492] mca: base: close: unloading component tuned
>> [Metropolis-01:24492] mca: base: close: component libnbc closed
>> [Metropolis-01:24492] mca: base: close: unloading component libnbc
>> [Metropolis-01:24492] mca: base: close: unloading component hierarch
>> [Metropolis-01:24492] mca: base: close: unloading component basic
>> [Metropolis-01:24492] mca: base: close: unloading component inter
>> [Metropolis-01:24492] mca: base: close: unloading component self
>> [Metropolis-01:24491] mca: base: close: component tuned closed
>> [Metropolis-01:24491] mca: base: close: unloading component tuned
>> [Metropolis-01:24491] mca: base: close: component libnbc closed
>> [Metropolis-01:24491] mca: base: close: unloading component libnbc
>> [Metropolis-01:24491] mca: base: close: unloading component hierarch
>> [Metropolis-01:24491] mca: base: close: unloading component basic
>> [Metropolis-01:24491] mca: base: close: unloading component inter
>> [Metropolis-01:24491] mca: base: close: unloading component self
>> [jarico_at_Metropolis-01 examples]$
>>
>>
>> SM is not loaded because it detects no other processes on the same machine:
>>
>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
>>
>> The machine is multicore, with 8 cores.
>>
>> I need to run the SM component code, and I assume that, with its priority raised, it will be the component selected once this problem is solved.
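
For illustration only, here is a rough sketch of the two-phase behavior visible in the log above: a component is first asked whether it can run at all, and priority is only compared among the components that survive that query. This is a simplified model, not Open MPI's actual code. The `Component` class, `find_available`, `select`, and the `local_peers` argument are hypothetical names; the priorities 30, 10, and 99 are the ones shown in the log and on the command line. Real selection is done per communicator and can mix components, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    name: str
    priority: int
    # Hypothetical stand-in for a component's init query: given the number of
    # other processes on this node, report whether the component can run.
    init_query: Callable[[int], bool]


def find_available(components: List[Component], local_peers: int) -> List[Component]:
    """Phase 1: drop every component whose init query disqualifies it."""
    return [c for c in components if c.init_query(local_peers)]


def select(components: List[Component], local_peers: int) -> Optional[Component]:
    """Phase 2: among the survivors, pick the highest priority."""
    available = find_available(components, local_peers)
    return max(available, key=lambda c: c.priority) if available else None


components = [
    Component("tuned", 30, lambda peers: True),
    Component("sm", 99, lambda peers: peers > 0),  # needs a local peer
    Component("basic", 10, lambda peers: True),
]

print(select(components, local_peers=0).name)  # tuned
print(select(components, local_peers=1).name)  # sm
```

Under this model, raising the priority of sm has no effect while the query reports zero local peers, which matches the "disqualifying myself" line in the log.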
>>
>>
>>
>> On Jul 3, 2012, at 21:01, Jeff Squyres wrote:
>>
>>> The issue is that the "sm" coll component only implements a few of the MPI collective operations. It is usually mixed at run-time with other coll components to fill out the rest of the MPI collective operations.
>>>
>>> So what is happening is that OMPI is determining that it doesn't have implementations of all the MPI collective operations and aborting.
>>>
>>> You shouldn't need to manually select your coll module -- OMPI should automatically select the right collective module for you. E.g., if all procs are local on a single machine and sm has a matching implementation for that MPI collective operation, it'll be used.
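
As a minimal sketch of the mixing described above, and not the real selection logic: each component offers some subset of the collective operations, every operation is assigned to the highest-priority component that provides it, and the job aborts if any operation is left without an implementation. The function `build_table`, the list `REQUIRED_OPS`, and the operation sets given to each component are assumptions made up for this example.

```python
from typing import Dict, List, Set, Tuple

# Illustrative subset of collective operations, not the full MPI list.
REQUIRED_OPS: List[str] = ["barrier", "bcast", "reduce", "allreduce", "gather"]


def build_table(components: Dict[str, Tuple[int, Set[str]]],
                required_ops: List[str]) -> Dict[str, str]:
    """Map each operation to the highest-priority component providing it."""
    table: Dict[str, str] = {}
    # Visit components in ascending priority so later (higher) ones overwrite.
    for name, (_priority, ops) in sorted(components.items(), key=lambda kv: kv[1][0]):
        for op in ops:
            if op in required_ops:
                table[op] = name
    missing = [op for op in required_ops if op not in table]
    if missing:
        raise RuntimeError("no implementation for: " + ", ".join(missing))
    return table


# Operation sets below are assumptions for illustration only.
sm_only = {"sm": (99, {"barrier", "bcast"})}
mixed = {"sm": (99, {"barrier", "bcast"}), "tuned": (30, set(REQUIRED_OPS))}

print(build_table(mixed, REQUIRED_OPS))
try:
    build_table(sm_only, REQUIRED_OPS)
except RuntimeError as err:
    print("abort:", err)
```

With both components present, sm takes the operations it provides and tuned fills in the rest. With sm alone, the table is incomplete and the sketch raises an error, which is the situation described in the reply above.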
>>>
>>>
>>>
>>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
>>>
>>>> Output is:
>>>>
>>>> [Metropolis-01:15355] hwloc:base:get_topology
>>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root available cpuset
>>>>
>>>> ======================== JOB MAP ========================
>>>>
>>>> Data for node: Metropolis-01 Num procs: 2
>>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 0
>>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 1
>>>>
>>>> =============================================================
>>>> [Metropolis-01:15356] locality: CL:CU:N:B
>>>> [Metropolis-01:15356] hwloc:base: get available cpus
>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - filtering cpus
>>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root available cpuset
>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
>>>> [Metropolis-01:15357] locality: CL:CU:N:B
>>>> [Metropolis-01:15357] hwloc:base: get available cpus
>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - filtering cpus
>>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root available cpuset
>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
>>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>
>>>>
>>>> Regards,
>>>> Juan A. Rico
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>