
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 doesn't work for our magny cours based 32 core node
From: tmishima_at_[hidden]
Date: 2013-12-25 06:39:15


Hi Ralph,

I ran valgrind and found uninitialised-value errors. All of them
occurred in opal_tree_add_child, as shown at the bottom. As a quick
fix, I put one line into "opal_tree.c", although it's not elegant:

void opal_tree_init(opal_tree_t *tree, opal_tree_comp_fn_t comp,
                    opal_tree_item_serialize_fn_t serialize,
                    opal_tree_item_deserialize_fn_t deserialize,
                    opal_tree_get_key_fn_t get_key)
{
    tree->comp = comp;
    tree->serialize = serialize;
    tree->deserialize = deserialize;
    tree->get_key = get_key;
    opal_tree_get_root(tree)->opal_tree_num_children = 0; /* added by tmishima */
}

Then these errors all disappeared, and Open MPI with lama worked fine.
As I told you before, I built Open MPI with PGI 13.10. As far as I
checked, valgrind detected no errors when Open MPI was built with the
GNU compiler, so the problem may be compiler-dependent.
Anyway, I would like to ask you (or the Open MPI team) to continue
the investigation.
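
For reference, here is a minimal standalone sketch of the kind of defect
valgrind is reporting (my own toy example, not the Open MPI source; the
struct and file names are invented):

/* demo.c - toy reproduction of an "uninitialised value" report.
 * Build: gcc -g -O0 demo.c -o demo
 * Check: valgrind ./demo
 */
#include <stdlib.h>

struct item {
    int num_children;   /* never assigned: malloc() leaves it undefined */
};

int main(void)
{
    struct item *root = malloc(sizeof(*root));
    if (root == NULL)
        return 1;

    /* valgrind flags this branch: "Conditional jump or move depends
     * on uninitialised value(s)" */
    if (root->num_children == 0)
        root->num_children = 1;

    free(root);
    return 0;
}

Zeroing the counter at init time, as in the opal_tree_init change above,
makes that branch well-defined regardless of what the allocator or
compiler happens to leave in that memory, which would also explain why
only the PGI build showed the problem.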

Regards,
Tetsuya Mishima

valgrind -v --error-limit=no --leak-check=yes --show-reachable=no mpirun -np 1 -mca rmaps lama -report-bindings -mca rmaps_base_verbose 100 --display-map ~/Desktop/openmpi-1.7/demos/myprog 2>&1 | tee valgrind.log

....
==27313== Conditional jump or move depends on uninitialised value(s)
==27313== at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
==27313== by 0x81E3314: rmaps_lama_convert_hwloc_subtree (rmaps_lama_max_tree.c:320)
==27313== by 0x81E321D: rmaps_lama_convert_hwloc_tree_to_opal_tree (rmaps_lama_max_tree.c:267)
==27313== by 0x81E2EE8: rmaps_lama_build_max_tree (rmaps_lama_max_tree.c:154)
==27313== by 0x81E0E58: orte_rmaps_lama_map_core (rmaps_lama_module.c:664)
==27313== by 0x81E02D7: orte_rmaps_lama_map (rmaps_lama_module.c:303)
==27313== by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
==27313== by 0x4F094CC: event_process_active_single_queue (event.c:1366)
==27313== by 0x4F090D8: event_process_active (event.c:1434)
==27313== by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
==27313== by 0x4079A6: orterun (orterun.c:1049)
==27313== by 0x40694A: main (main.c:13)
.....
==27313== Conditional jump or move depends on uninitialised value(s)
==27313== at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
==27313== by 0x4EC5D0E: deserialize_add_tree_item (opal_tree.c:496)
==27313== by 0x4EC5578: opal_tree_deserialize (opal_tree.c:524)
==27313== by 0x4EC5609: opal_tree_dup (opal_tree.c:544)
==27313== by 0x81E2FF6: rmaps_lama_build_max_tree (rmaps_lama_max_tree.c:202)
==27313== by 0x81E0E58: orte_rmaps_lama_map_core (rmaps_lama_module.c:664)
==27313== by 0x81E02D7: orte_rmaps_lama_map (rmaps_lama_module.c:303)
==27313== by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
==27313== by 0x4F094CC: event_process_active_single_queue (event.c:1366)
==27313== by 0x4F090D8: event_process_active (event.c:1434)
==27313== by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
==27313== by 0x4079A6: orterun (orterun.c:1049)
....
==27313== Conditional jump or move depends on uninitialised value(s)
==27313== at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
==27313== by 0x4EC5D0E: deserialize_add_tree_item (opal_tree.c:496)
==27313== by 0x4EC5578: opal_tree_deserialize (opal_tree.c:524)
==27313== by 0x4EC5609: opal_tree_dup (opal_tree.c:544)
==27313== by 0x81E2FF6: ???
==27313== by 0x81E0E58: ???
==27313== by 0x81E02D7: ???
==27313== by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
==27313== by 0x4F094CC: event_process_active_single_queue (event.c:1366)
==27313== by 0x4F090D8: event_process_active (event.c:1434)
==27313== by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
==27313== by 0x4079A6: orterun (orterun.c:1049)
.....
==27313== Conditional jump or move depends on uninitialised value(s)
==27313== at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
==27313== by 0x81E3314: ???
==27313== by 0x81E321D: ???
==27313== by 0x81E2EE8: ???
==27313== by 0x81E0E58: ???
==27313== by 0x81E02D7: ???
==27313== by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
==27313== by 0x4F094CC: event_process_active_single_queue (event.c:1366)
==27313== by 0x4F090D8: event_process_active (event.c:1434)
==27313== by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
==27313== by 0x4079A6: orterun (orterun.c:1049)
==27313== by 0x40694A: main (main.c:13)

> Hi Ralph,
>
> Here is the output when I put "-mca rmaps_base_verbose 10 --display-map"
> on the command line, and where it stopped (from gdb), which shows it
> stopped in a lama function.
>
> I usually use PGI 13.10, so I tried switching to the GNU compiler.
> Then it works, so this problem depends on the compiler.
>
> That's all I could find today.
>
> Regards,
> Tetsuya Mishima
>
> [mishima_at_manage ~]$ gdb
> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
> ....
> (gdb) attach 14666
> ....
> 0x00002aaaab4c5c33 in rmaps_lama_prune_max_tree ()
> at ./rmaps_lama_max_tree.c:814
>
> [mishima_at_manage demos]$ mpirun -np 2 -mca rmaps lama -report-bindings -mca rmaps_base_verbose 10 --display-map myprog
> [manage.cluster:21503] mca: base: components_register: registering rmaps components
> [manage.cluster:21503] mca: base: components_register: found loaded component lama
> [manage.cluster:21503] mca:rmaps:lama: Priority 0
> [manage.cluster:21503] mca:rmaps:lama: Map : NULL
> [manage.cluster:21503] mca:rmaps:lama: Bind : NULL
> [manage.cluster:21503] mca:rmaps:lama: MPPR : NULL
> [manage.cluster:21503] mca:rmaps:lama: Order : NULL
> [manage.cluster:21503] mca: base: components_register: component lama register function successful
> [manage.cluster:21503] mca: base: components_open: opening rmaps components
> [manage.cluster:21503] mca: base: components_open: found loaded component lama
> [manage.cluster:21503] mca:rmaps:select: checking available component lama
> [manage.cluster:21503] mca:rmaps:select: Querying component [lama]
> [manage.cluster:21503] [[23940,0],0]: Final mapper priorities
> [manage.cluster:21503] Mapper: lama Priority: 0
> [manage.cluster:21503] mca:rmaps: mapping job [23940,1]
> [manage.cluster:21503] mca:rmaps: creating new map for job [23940,1]
> [manage.cluster:21503] mca:rmaps: nprocs 2
> [manage.cluster:21503] mca:rmaps:lama: Mapping job [23940,1]
> [manage.cluster:21503] mca:rmaps:lama: Revised Parameters -----
> [manage.cluster:21503] mca:rmaps:lama: Map : csbnh
> [manage.cluster:21503] mca:rmaps:lama: Bind : 1c
> [manage.cluster:21503] mca:rmaps:lama: MPPR : (null)
> [manage.cluster:21503] mca:rmaps:lama: Order : s
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- Binding : [1c]
> [manage.cluster:21503] mca:rmaps:lama: ----- Binding : 1 x Core
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : [csbnh]
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : (0) Core (7 vs 0)
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : (1) Socket (3 vs 1)
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : (2) Board (1 vs 3)
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : (3) Machine (0 vs 7)
> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping : (4) Hw. Thread (8 vs 8)
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- MPPR : [(null)]
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- Ordering : [s]
> [manage.cluster:21503] mca:rmaps:lama: ----- Ordering : Sequential
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] AVAILABLE NODES FOR MAPPING:
> [manage.cluster:21503] node: manage daemon: 0
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- Building the Max Tree...
> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
> [manage.cluster:21503] mca:rmaps:lama: ----- Converting Remote Tree: manage
>
> [mishima_at_manage demos]$ ompi_info | grep "C compiler family"
> C compiler family name: GNU
> [mishima_at_manage demos]$ mpirun -np 2 -mca rmaps lama myprog
> Hello world from process 0 of 2
> Hello world from process 1 of 2
>
>
>
> > On Dec 21, 2013, at 8:16 PM, tmishima_at_[hidden] wrote:
> >
> > >
> > >
> > > Ralph, thanks. I'll try it on Tuseday.
> > >
> > > Let me confirm one thing. I don't put "-with-libevent" when I build
> > > openmpi. Is there any possibility that it builds with an external
> > > libevent automatically?
> >
> > No - only happens if you direct it
> >
> >
> > >
> > > Tetsuya Mishima
> > >
> > >
> > >> Not entirely sure - add "-mca rmaps_base_verbose 10 --display-map" to
> > >> your cmd line and let's see if it finishes the mapping.
> > >>
> > >> Unless you specifically built with an external libevent (which I doubt),
> > >> there is no conflict. The connection issue is unlikely to be a factor
> > >> here, as it works when not using the lama mapper.
> > >>
> > >>
> > >> On Dec 21, 2013, at 3:43 PM, tmishima_at_[hidden] wrote:
> > >>
> > >>>
> > >>>
> > >>> Thank you, Ralph.
> > >>>
> > >>> Then, this problem should depend on our environment.
> > >>> But at least the inversion problem is not the cause, because
> > >>> node05 has a normal hier order.
> > >>>
> > >>> I cannot connect to our cluster now. On Tuesday, going
> > >>> back to my office, I'll send you a further report.
> > >>>
> > >>> Before that, please let me know your configuration. I will
> > >>> follow your configuration as much as possible. Our configuration
> > >>> is very simple, only -with-tm -with-ibverbs -disable-ipv6
> > >>> (on CentOS 5.8).
> > >>>
> > >>> The 1.7 series is still a little bit unstable on our cluster.
> > >>>
> > >>> Similar freezing (hang-up) was observed with 1.7.3. At that
> > >>> time, lama worked well, but putting "-rank-by something" caused the
> > >>> same freezing (curiously, rank-by works with 1.7.4rc1).
> > >>> I checked where it stopped using gdb, and found that it was
> > >>> waiting for an event in a libevent function (I cannot
> > >>> recall the name).
> > >>>
> > >>> Is this related to your "connection issue in the OOB
> > >>> subsystem"? Or a libevent version conflict? I guess these two
> > >>> problems are related to each other. My rough guess is that they
> > >>> stopped at a very early stage, before reaching the mapping function,
> > >>> because no message appeared before the freeze.
> > >>>
> > >>> Could you give me any hint or comment?
> > >>>
> > >>> Regards,
> > >>> Tetsuya Mishima
> > >>>
> > >>>
> > >>>> It seems to be working fine for me:
> > >>>>
> > >>>> [rhc_at_bend001 tcp]$ mpirun -np 2 -host bend001 -report-bindings -mca rmaps_lama_bind 1c -mca rmaps lama hostname
> > >>>> bend001
> > >>>> [bend001:17005] MCW rank 1 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../..][../../../../../..]
> > >>>> [bend001:17005] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../..][../../../../../..]
> > >>>> bend001
> > >>>> [rhc_at_bend001 tcp]$
> > >>>>
> > >>>> (I also checked the internals using "-mca rmaps_base_verbose 10") so
> > >>>> it could be your hier inversion causing problems again. Or it could
> > >>>> be that you are hitting a connection issue we are seeing in some
> > >>>> scenarios in the OOB subsystem - though if you are able to run using
> > >>>> a non-lama mapper, that would seem unlikely.
> > >>>>
> > >>>>
> > >>>> On Dec 20, 2013, at 8:09 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Ralph,
> > >>>>
> > >>>> Thank you very much. I tried many things such as:
> > >>>>
> > >>>> mpirun -np 2 -host node05 -report-bindings -mca rmaps lama -mca rmaps_lama_bind 1c myprog
> > >>>>
> > >>>> But every try failed. At least they were accepted by openmpi-1.7.3,
> > >>>> as far as I remember.
> > >>>> Anyway, please check it when you have time; I am trying lama out of
> > >>>> curiosity.
> > >>>>
> > >>>> Regards,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>>
> > >>>> I'll try to take a look at it - my expectation is that lama might
> > >>>> get stuck because you didn't tell it a pattern to map, and I doubt
> > >>>> that code path has seen much testing.
> > >>>>
> > >>>>
> > >>>> On Dec 20, 2013, at 5:52 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Ralph, I'm glad to hear that, thanks.
> > >>>>
> > >>>> By the way, yesterday I tried to check how lama in 1.7.4rc treats
> > >>>> numa nodes.
> > >>>>
> > >>>> Then, even with this simple command line, it froze without any
> > >>>> message:
> > >>>>
> > >>>> mpirun -np 2 -host node05 -mca rmaps lama myprog
> > >>>>
> > >>>> Could you check what happened?
> > >>>>
> > >>>> Is it better to open a new thread or continue this one?
> > >>>>
> > >>>> Regards,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>>
> > >>>> I'll make it work so that NUMA can be either above or below socket
> > >>>>
> > >>>> On Dec 20, 2013, at 2:57 AM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Brice,
> > >>>>
> > >>>> Thank you for your comment. I understand what you mean.
> > >>>>
> > >>>> My opinion was based simply on looking for an easy way to adjust
> > >>>> the code for the inversion of hierarchy in the object tree.
> > >>>>
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>>
> > >>>> I don't think there's any such difference.
> > >>>> Also, all these NUMA architectures are reported the same by hwloc,
> > >>>> and
> > >>>> therefore used the same in Open MPI.
> > >>>>
> > >>>> And yes, L3 and NUMA are topologically identical on AMD Magny-Cours
> > >>>> (and most recent AMD and Intel platforms).
> > >>>>
> > >>>> Brice
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 20/12/2013 11:33, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>> Hi Ralph,
> > >>>>
> > >>>> The numa-node in AMD Magny-Cours/Interlagos is so-called ccNUMA
> > >>>> (cache-coherent NUMA), which seems to be a little bit different
> > >>>> from the traditional numa defined in openmpi.
> > >>>>
> > >>>> I notice that the ccNUMA object is almost the same as the L3 cache
> > >>>> object, so "-bind-to l3cache" or "-map-by l3cache" is valid for what
> > >>>> I want to do. Therefore, "do not touch it" is one possible solution,
> > >>>> I think...
> > >>>>
> > >>>> Anyway, mixing up these two types of numa is the problem.
> > >>>>
> > >>>> Regards,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>> I can wait for it to be fixed in 1.7.5 or later, because putting
> > >>>> "-bind-to numa" and "-map-by numa" at the same time works as a
> > >>>> workaround.
> > >>>>
> > >>>> Thanks,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>> Yeah, it will impact everything that uses hwloc topology maps, I
> > >>>> fear.
> > >>>>
> > >>>> One side note: you'll need to add --hetero-nodes to your cmd line.
> > >>>> If we don't see that, we assume that all the node topologies are
> > >>>> identical - which clearly isn't true here.
> > >>>> I'll try to resolve the hier inversion over the holiday - won't be
> > >>>> for 1.7.4, but hopefully for 1.7.5.
> > >>>> Thanks
> > >>>> Ralph
> > >>>>
> > >>>> On Dec 18, 2013, at 9:44 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>> I think it's normal for AMD Opterons with 8/16 cores, such as
> > >>>> Magny-Cours or Interlagos. Because they usually have 2 numa nodes
> > >>>> in a CPU (socket), a numa-node cannot include a socket. This type
> > >>>> of hierarchy would be natural.
> > >>>>
> > >>>> (node03 is a Dell PowerEdge R815 and maybe quite common, I guess)
> > >>>>
> > >>>> By the way, I think this inversion should affect rmaps_lama
> > >>>> mapping.
> > >>>>
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>> Ick - yeah, that would be a problem. I haven't seen that type of
> > >>>> hierarchical inversion before - is node03 a different type of chip?
> > >>>> Might take a while for me to adjust the code to handle hier
> > >>>> inversion... :-(
> > >>>> On Dec 18, 2013, at 9:05 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>> Hi Ralph,
> > >>>>
> > >>>> I found the reason. I attached the main part of the output for the
> > >>>> 32-core node (node03) and the 8-core node (node05) at the bottom.
> > >>>>
> > >>>> From this information, a socket of node03 includes numa-nodes.
> > >>>> On the other hand, a numa-node of node05 includes sockets.
> > >>>> The direction of the object tree is opposite.
> > >>>>
> > >>>> Since "-map-by socket" may be assumed as the default,
> > >>>> for node05 "-bind-to numa and -map-by socket" means an
> > >>>> upward search. For node03, this should be downward.
> > >>>>
> > >>>> I guess that openmpi-1.7.4rc1 always assumes that a numa-node
> > >>>> includes sockets. Is that right? Then an upward search is assumed
> > >>>> in orte_rmaps_base_compute_bindings even for node03 when I
> > >>>> put the "-bind-to numa and -map-by socket" option.
> > >>>>
> > >>>> [node03.cluster:15508] [[38286,0],0] rmaps:base:compute_usage
> > >>>> [node03.cluster:15508] mca:rmaps: compute bindings for job [38286,1] with policy NUMA
> > >>>> [node03.cluster:15508] mca:rmaps: bind upwards for job [38286,1] with bindings NUMA
> > >>>> [node03.cluster:15508] [[38286,0],0] bind:upward target NUMANode type Machine
> > >>>>
> > >>>> That's the reason for this trouble. Therefore, adding "-map-by core"
> > >>>> works (although the mapping pattern seems strange...).
> > >>>>
> > >>>> [mishima_at_node03 demos]$ mpirun -np 8 -bind-to numa -map-by core -report-bindings myprog
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
> > >>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
> > >>>> [node03.cluster:15885] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 3 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 4 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 5 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 6 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 7 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15885] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > >>>> Hello world from process 6 of 8
> > >>>> Hello world from process 5 of 8
> > >>>> Hello world from process 0 of 8
> > >>>> Hello world from process 7 of 8
> > >>>> Hello world from process 3 of 8
> > >>>> Hello world from process 4 of 8
> > >>>> Hello world from process 2 of 8
> > >>>> Hello world from process 1 of 8
> > >>>>
> > >>>> Regards,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>> [node03.cluster:15508] Type: Machine Number of child objects: 4
> > >>>> Name=NULL
> > >>>> total=132358820KB
> > >>>> Backend=Linux
> > >>>> OSName=Linux
> > >>>> OSRelease=2.6.18-308.16.1.el5
> > >>>> OSVersion="#1 SMP Tue Oct 2 22:01:43 EDT 2012"
> > >>>> Architecture=x86_64
> > >>>> Cpuset: 0xffffffff
> > >>>> Online: 0xffffffff
> > >>>> Allowed: 0xffffffff
> > >>>> Bind CPU proc: TRUE
> > >>>> Bind CPU thread: TRUE
> > >>>> Bind MEM proc: FALSE
> > >>>> Bind MEM thread: TRUE
> > >>>> Type: Socket Number of child objects: 2
> > >>>> Name=NULL
> > >>>> total=33071780KB
> > >>>> CPUModel="AMD Opteron(tm) Processor 6136"
> > >>>> Cpuset: 0x000000ff
> > >>>> Online: 0x000000ff
> > >>>> Allowed: 0x000000ff
> > >>>> Type: NUMANode Number of child objects: 1
> > >>>>
> > >>>>
> > >>>> [node05.cluster:21750] Type: Machine Number of child objects: 2
> > >>>> Name=NULL
> > >>>> total=33080072KB
> > >>>> Backend=Linux
> > >>>> OSName=Linux
> > >>>> OSRelease=2.6.18-308.16.1.el5
> > >>>> OSVersion="#1 SMP Tue Oct 2 22:01:43 EDT 2012"
> > >>>> Architecture=x86_64
> > >>>> Cpuset: 0x000000ff
> > >>>> Online: 0x000000ff
> > >>>> Allowed: 0x000000ff
> > >>>> Bind CPU proc: TRUE
> > >>>> Bind CPU thread: TRUE
> > >>>> Bind MEM proc: FALSE
> > >>>> Bind MEM thread: TRUE
> > >>>> Type: NUMANode Number of child objects: 1
> > >>>> Name=NULL
> > >>>> local=16532232KB
> > >>>> total=16532232KB
> > >>>> Cpuset: 0x0000000f
> > >>>> Online: 0x0000000f
> > >>>> Allowed: 0x0000000f
> > >>>> Type: Socket Number of child objects: 1
> > >>>>
> > >>>>
> > >>>> Hmm...try adding "-mca rmaps_base_verbose 10 -mca ess_base_verbose 5"
> > >>>> to your cmd line and let's see what it thinks it found.
> > >>>>
> > >>>> On Dec 18, 2013, at 6:55 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>
> > >>>> Hi, I report one more problem with openmpi-1.7.4rc1,
> > >>>> which is more serious.
> > >>>>
> > >>>> For our 32-core nodes (AMD Magny-Cours based), which have
> > >>>> 8 numa-nodes, "-bind-to numa" does not work. Without
> > >>>> this option, it works. For your information, at the
> > >>>> bottom of this mail I added the lstopo information
> > >>>> of the node.
> > >>>>
> > >>>> Regards,
> > >>>> Tetsuya Mishima
> > >>>>
> > >>>> [mishima_at_manage ~]$ qsub -I -l nodes=1:ppn=32
> > >>>> qsub: waiting for job 8352.manage.cluster to start
> > >>>> qsub: job 8352.manage.cluster ready
> > >>>>
> > >>>> [mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -bind-to numa myprog
> > >>>> [node03.cluster:15316] [[37582,0],0] bind:upward target NUMANode type Machine
> > >>>>
> > >>>> --------------------------------------------------------------------------
> > >>>> A request was made to bind to NUMA, but an appropriate target could
> > >>>> not be found on node node03.
> > >>>> --------------------------------------------------------------------------
> > >>>> [mishima_at_node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > >>>> [mishima_at_node03 demos]$ mpirun -np 8 -report-bindings myprog
> > >>>> [node03.cluster:15282] MCW rank 2 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15282] MCW rank 3 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15282] MCW rank 4 bound to socket 2[core 16[hwt 0]]: [./././././././.][./././././././.][B/././././././.][./././././././.]
> > >>>> [node03.cluster:15282] MCW rank 5 bound to socket 2[core 17[hwt 0]]: [./././././././.][./././././././.][./B/./././././.][./././././././.]
> > >>>> [node03.cluster:15282] MCW rank 6 bound to socket 3[core 24[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/././././././.]
> > >>>> [node03.cluster:15282] MCW rank 7 bound to socket 3[core 25[hwt 0]]: [./././././././.][./././././././.][./././././././.][./B/./././././.]
> > >>>> [node03.cluster:15282] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.][./././././././.][./././././././.]
> > >>>> [node03.cluster:15282] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.][./././././././.][./././././././.]
> > >>>> Hello world from process 2 of 8
> > >>>> Hello world from process 5 of 8
> > >>>> Hello world from process 4 of 8
> > >>>> Hello world from process 3 of 8
> > >>>> Hello world from process 1 of 8
> > >>>> Hello world from process 7 of 8
> > >>>> Hello world from process 6 of 8
> > >>>> Hello world from process 0 of 8
> > >>>> [mishima_at_node03 demos]$ ~/opt/hwloc/bin/lstopo-no-graphics
> > >>>> Machine (126GB)
> > >>>> Socket L#0 (32GB)
> > >>>> NUMANode L#0 (P#0 16GB) + L3 L#0 (5118KB)
> > >>>> L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
> > >>>> L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
> > >>>> L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
> > >>>> L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
> > >>>> NUMANode L#1 (P#1 16GB) + L3 L#1 (5118KB)
> > >>>> L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4 (P#4)
> > >>>> L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5 (P#5)
> > >>>> L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + PU L#6 (P#6)
> > >>>> L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU L#7 (P#7)
> > >>>> Socket L#1 (32GB)
> > >>>> NUMANode L#2 (P#6 16GB) + L3 L#2 (5118KB)
> > >>>> L2 L#8 (512KB) + L1d L#8 (64KB) + L1i L#8 (64KB) + Core L#8 + PU L#8 (P#8)
> > >>>> L2 L#9 (512KB) + L1d L#9 (64KB) + L1i L#9 (64KB) + Core L#9 + PU L#9 (P#9)
> > >>>> L2 L#10 (512KB) + L1d L#10 (64KB) + L1i L#10 (64KB) + Core L#10 + PU L#10 (P#10)
> > >>>> L2 L#11 (512KB) + L1d L#11 (64KB) + L1i L#11 (64KB) + Core L#11 + PU L#11 (P#11)
> > >>>> NUMANode L#3 (P#7 16GB) + L3 L#3 (5118KB)
> > >>>> L2 L#12 (512KB) + L1d L#12 (64KB) + L1i L#12 (64KB) + Core L#12 + PU L#12 (P#12)
> > >>>> L2 L#13 (512KB) + L1d L#13 (64KB) + L1i L#13 (64KB) + Core L#13 + PU L#13 (P#13)
> > >>>> L2 L#14 (512KB) + L1d L#14 (64KB) + L1i L#14 (64KB) + Core L#14 + PU L#14 (P#14)
> > >>>> L2 L#15 (512KB) + L1d L#15 (64KB) + L1i L#15 (64KB) + Core L#15 + PU L#15 (P#15)
> > >>>> Socket L#2 (32GB)
> > >>>> NUMANode L#4 (P#4 16GB) + L3 L#4 (5118KB)
> > >>>> L2 L#16 (512KB) + L1d L#16 (64KB) + L1i L#16 (64KB) + Core L#16 + PU L#16 (P#16)
> > >>>> L2 L#17 (512KB) + L1d L#17 (64KB) + L1i L#17 (64KB) + Core L#17 + PU L#17 (P#17)
> > >>>> L2 L#18 (512KB) + L1d L#18 (64KB) + L1i L#18 (64KB) + Core L#18 + PU L#18 (P#18)
> > >>>> L2 L#19 (512KB) + L1d L#19 (64KB) + L1i L#19 (64KB) + Core L#19 + PU L#19 (P#19)
> > >>>> NUMANode L#5 (P#5 16GB) + L3 L#5 (5118KB)
> > >>>> L2 L#20 (512KB) + L1d L#20 (64KB) + L1i L#20 (64KB) + Core L#20 + PU L#20 (P#20)
> > >>>> L2 L#21 (512KB) + L1d L#21 (64KB) + L1i L#21 (64KB) + Core L#21 + PU L#21 (P#21)
> > >>>> L2 L#22 (512KB) + L1d L#22 (64KB) + L1i L#22 (64KB) + Core L#22 + PU L#22 (P#22)
> > >>>> L2 L#23 (512KB) + L1d L#23 (64KB) + L1i L#23 (64KB) + Core L#23 + PU L#23 (P#23)
> > >>>> Socket L#3 (32GB)
> > >>>> NUMANode L#6 (P#2 16GB) + L3 L#6 (5118KB)
> > >>>> L2 L#24 (512KB) + L1d L#24 (64KB) + L1i L#24 (64KB) + Core L#24 + PU L#24 (P#24)
> > >>>> L2 L#25 (512KB) + L1d L#25 (64KB) + L1i L#25 (64KB) + Core L#25 + PU L#25 (P#25)
> > >>>> L2 L#26 (512KB) + L1d L#26 (64KB) + L1i L#26 (64KB) + Core L#26 + PU L#26 (P#26)
> > >>>> L2 L#27 (512KB) + L1d L#27 (64KB) + L1i L#27 (64KB) + Core L#27 + PU L#27 (P#27)
> > >>>> NUMANode L#7 (P#3 16GB) + L3 L#7 (5118KB)
> > >>>> L2 L#28 (512KB) + L1d L#28 (64KB) + L1i L#28 (64KB) + Core L#28 + PU L#28 (P#28)
> > >>>> L2 L#29 (512KB) + L1d L#29 (64KB) + L1i L#29 (64KB) + Core L#29 + PU L#29 (P#29)
> > >>>> L2 L#30 (512KB) + L1d L#30 (64KB) + L1i L#30 (64KB) + Core L#30 + PU L#30 (P#30)
> > >>>> L2 L#31 (512KB) + L1d L#31 (64KB) + L1i L#31 (64KB) + Core L#31 + PU L#31 (P#31)
> > >>>> HostBridge L#0
> > >>>> PCIBridge
> > >>>> PCI 14e4:1639
> > >>>> Net L#0 "eth0"
> > >>>> PCI 14e4:1639
> > >>>> Net L#1 "eth1"
> > >>>> PCIBridge
> > >>>> PCI 14e4:1639
> > >>>> Net L#2 "eth2"
> > >>>> PCI 14e4:1639
> > >>>> Net L#3 "eth3"
> > >>>> PCIBridge
> > >>>> PCIBridge
> > >>>> PCIBridge
> > >>>> PCI 1000:0072
> > >>>> Block L#4 "sdb"
> > >>>> Block L#5 "sda"
> > >>>> PCI 1002:4390
> > >>>> Block L#6 "sr0"
> > >>>> PCIBridge
> > >>>> PCI 102b:0532
> > >>>> HostBridge L#7
> > >>>> PCIBridge
> > >>>> PCI 15b3:6274
> > >>>> Net L#7 "ib0"
> > >>>> OpenFabrics L#8 "mthca0"
> > >>>>