Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] [openib] segfault when using openib btl
From: Terry Dontje (terry.dontje_at_[hidden])
Date: 2010-09-27 05:43:55


So it sounds like coalescing is not your issue and that the problem has
something to do with the queue sizes. It would be helpful if we could
detect the hdr->tag == 0 issue on the sending side and get at least a
stack trace. There is something really odd going on here.
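
For what it's worth, one low-tech way to catch a zero tag on the sending side
might be a conditional breakpoint in a debug build -- just a sketch, assuming
the send entry points are mca_btl_openib_send/mca_btl_openib_sendi (the names
that show up in the gdb output further down this thread) and that their tag
parameter is literally named "tag" in 1.4.2:

  (gdb) break mca_btl_openib_send if tag == 0
  (gdb) break mca_btl_openib_sendi if tag == 0
  (gdb) continue
  ... then "bt" once one of them fires.

If the tag only becomes 0 after the PML hands the fragment to the BTL (e.g.
somewhere in the coalescing path), those breakpoints would never trigger,
which would itself be a useful data point.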

--td

Eloi Gaudry wrote:
> Hi Terry,
>
> I'm sorry to say that I might have missed a point here.
>
> I've lately been relaunching all previously failing computations with
> the message coalescing feature being switched off, and I saw the same
> hdr->tag=0 error several times, always during a collective call
> (MPI_Comm_create, MPI_Allreduce and MPI_Broadcast, so far). And as
> soon as I switched to the peer queue option I was previously using
> (--mca btl_openib_receive_queues P,65536,256,192,128 instead of using
> --mca btl_openib_use_message_coalescing 0), all computations ran
> flawlessly.
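>
> For reference, the full mpirun line then looks something like this (hostfile
> and application are placeholders; the btl list matches the one used elsewhere
> in this thread):
>
> path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca btl openib,sm,self,tcp --mca btl_openib_receive_queues P,65536,256,192,128 [...]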
>
> As for the reproducer, I've already tried to write something but I
> haven't succeeded so far at reproducing the hdr->tag=0 issue with it.
>
> Eloi
>
> On 24/09/2010 18:37, Terry Dontje wrote:
>> Eloi Gaudry wrote:
>>> Terry,
>>>
>>> You were right, the error indeed seems to come from the message coalescing feature.
>>> If I turn it off using "--mca btl_openib_use_message_coalescing 0", I'm not able to observe the "hdr->tag=0" error.
>>>
>>> There are some trac requests associated with very similar errors (https://svn.open-mpi.org/trac/ompi/search?q=coalescing) but they are all closed (except https://svn.open-mpi.org/trac/ompi/ticket/2352,
>>> which might be related), aren't they? What would you suggest, Terry?
>>>
>>>
>> Interesting, though it looks to me like the segv in ticket 2352 would
>> have happened on the send side instead of the receive side like you
>> have. As to what to do next, it would be really nice to have some
>> sort of reproducer that we can try and debug what is really going
>> on. The only other thing to do without a reproducer is to inspect
>> the code on the send side to figure out what might make it generate
>> a 0 hdr->tag. Or maybe instrument the send side to stop when it is
>> about ready to send a 0 hdr->tag and see if we can see how the code
>> got there.
>>
>> I might have some cycles to look at this Monday.
>>
>> --td
>>> Eloi
>>>
>>>
>>> On Friday 24 September 2010 16:00:26 Terry Dontje wrote:
>>>
>>>> Eloi Gaudry wrote:
>>>>
>>>>> Terry,
>>>>>
>>>>> No, I haven't tried any other values than P,65536,256,192,128 yet.
>>>>>
>>>>> The reason why is quite simple. I've been reading and reading again this
>>>>> thread to understand the btl_openib_receive_queues meaning and I can't
>>>>> figure out why the default values seem to induce the hdr->tag=0 issue
>>>>> (http://www.open-mpi.org/community/lists/users/2009/01/7808.php).
>>>>>
>>>> Yeah, the size of the fragments and number of them really should not
>>>> cause this issue. So I too am a little perplexed about it.
>>>>
>>>>
>>>>> Do you think that the default shared received queue parameters are
>>>>> erroneous for this specific Mellanox card ? Any help on finding the
>>>>> proper parameters would actually be much appreciated.
>>>>>
>>>> I don't necessarily think it is the queue size for a specific card but
>>>> more so the handling of the queues by the BTL when using certain sizes.
>>>> At least that is one gut feel I have.
>>>>
>>>> In my mind the tag being 0 is either something below OMPI is polluting
>>>> the data fragment or OMPI's internal protocol is somehow getting messed
>>>> up. I can imagine (no empirical data here) the queue sizes could change
>>>> how the OMPI protocol sets things up. Another thing may be the
>>>> coalescing feature in the openib BTL which tries to gang multiple
>>>> messages into one packet when resources are running low. I can see
>>>> where changing the queue sizes might affect the coalescing. So, it
>>>> might be interesting to turn off the coalescing. You can do that by
>>>> setting "--mca btl_openib_use_message_coalescing 0" in your mpirun line.
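>>>>
>>>> For example, something along these lines (everything apart from the
>>>> coalescing switch is a placeholder):
>>>>
>>>> mpirun -np <nprocs> --hostfile host.list --mca btl openib,sm,self --mca btl_openib_use_message_coalescing 0 ./your_app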
>>>>
>>>> If that doesn't solve the issue then obviously there must be something
>>>> else going on :-).
>>>>
>>>> Note, the reason I am interested in this is I am seeing a similar error
>>>> condition (hdr->tag == 0) on a development system. Though my failing
>>>> case fails with np=8 using the connectivity test program which is mainly
>>>> point to point, and there is not a significant amount of data transfer
>>>> going on either.
>>>>
>>>> --td
>>>>
>>>>
>>>>> Eloi
>>>>>
>>>>> On Friday 24 September 2010 14:27:07 you wrote:
>>>>>
>>>>>> That is interesting. So does the number of processes affect your runs
>>>>>> at all? The times I've seen hdr->tag be 0, it usually has been due to
>>>>>> protocol issues. The tag should never be 0. Have you tried any
>>>>>> receive_queue settings other than the default and the one you mention?
>>>>>>
>>>>>> I wonder if a combination of the two receive queues causes a
>>>>>> failure or not. Something like
>>>>>>
>>>>>> P,128,256,192,128:P,65536,256,192,128
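>>>>>>
>>>>>> i.e. passed on the mpirun line as something like (the rest of the line
>>>>>> is just a placeholder):
>>>>>>
>>>>>> mpirun -np <nprocs> --mca btl openib,self --mca btl_openib_receive_queues P,128,256,192,128:P,65536,256,192,128 ./your_app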
>>>>>>
>>>>>> I am wondering if it is the first queuing definition causing the issue
>>>>>> or possibly the SRQ defined in the default.
>>>>>>
>>>>>> --td
>>>>>>
>>>>>> Eloi Gaudry wrote:
>>>>>>
>>>>>>> Hi Terry,
>>>>>>>
>>>>>>> The messages being sent/received can be of any size, but the error
>>>>>>> seems to happen more often with small messages (such as an int being
>>>>>>> broadcasted or allreduced). The failing communication differs from one
>>>>>>> run to another, but some spots are more likely to fail than others.
>>>>>>> And as far as I know, they are always located next to a small-message
>>>>>>> communication (an int being broadcasted, for instance). Other typical
>>>>>>> message sizes are >10k but can be very much larger.
>>>>>>>
>>>>>>> I've been checking the hca being used: it's from Mellanox (with
>>>>>>> vendor_part_id=26428). There are no receive_queues parameters
>>>>>>> associated with it:
>>>>>>>
>>>>>>> $ cat share/openmpi/mca-btl-openib-device-params.ini
>>>>>>> [...]
>>>>>>>
>>>>>>> # A.k.a. ConnectX
>>>>>>> [Mellanox Hermon]
>>>>>>> vendor_id = 0x2c9,0x5ad,0x66a,0x8f1,0x1708,0x03ba,0x15b3
>>>>>>> vendor_part_id =
>>>>>>> 25408,25418,25428,26418,26428,25448,26438,26448,26468,26478,26488
>>>>>>> use_eager_rdma = 1
>>>>>>> mtu = 2048
>>>>>>> max_inline_data = 128
>>>>>>>
>>>>>>> [..]
>>>>>>>
>>>>>>> $ ompi_info --param btl openib --parsable | grep receive_queues
>>>>>>>
>>>>>>> mca:btl:openib:param:btl_openib_receive_queues:value:P,128,256,192,128:S,2048,256,128,32:S,12288,256,128,32:S,65536,256,128,32
>>>>>>> mca:btl:openib:param:btl_openib_receive_queues:data_source:default value
>>>>>>> mca:btl:openib:param:btl_openib_receive_queues:status:writable
>>>>>>> mca:btl:openib:param:btl_openib_receive_queues:help:Colon-delimited, comma delimited list of receive queues: P,4096,8,6,4:P,32768,8,6,4
>>>>>>> mca:btl:openib:param:btl_openib_receive_queues:deprecated:no
>>>>>>>
>>>>>>> I was wondering if these parameters (automatically computed at openib
>>>>>>> btl init, as far as I understand) were somehow incorrect, so I plugged
>>>>>>> in other values: "P,65536,256,192,128" (someone on the list used these
>>>>>>> values when encountering a different issue). Since then, I haven't
>>>>>>> been able to observe the segfault (occurring as hdr->tag = 0 in
>>>>>>> btl_openib_component.c:2881) yet.
>>>>>>>
>>>>>>> Eloi
>>>>>>>
>>>>>>>
>>>>>>> /home/pp_fr/st03230/EG/Softs/openmpi-custom-1.4.2/bin/
>>>>>>>
>>>>>>> On Thursday 23 September 2010 23:33:48 Terry Dontje wrote:
>>>>>>>
>>>>>>>> Eloi, I am curious about your problem. Can you tell me what size of
>>>>>>>> job it is? Does it always fail on the same bcast, or same process?
>>>>>>>>
>>>>>>>> Eloi Gaudry wrote:
>>>>>>>>
>>>>>>>>> Hi Nysal,
>>>>>>>>>
>>>>>>>>> Thanks for your suggestions.
>>>>>>>>>
>>>>>>>>> I'm now able to get the checksum computed and redirected to stdout,
>>>>>>>>> thanks (I forgot the "-mca pml_base_verbose 5" option, you were
>>>>>>>>> right). I haven't been able to observe the segmentation fault (with
>>>>>>>>> hdr->tag=0) so far when using pml csum, but I'll let you know if I do.
>>>>>>>>>
>>>>>>>>> I've got two others question, which may be related to the error
>>>>>>>>> observed:
>>>>>>>>>
>>>>>>>>> 1/ does the maximum number of MPI_Comm objects that can be handled by
>>>>>>>>> OpenMPI somehow depend on the btl being used (i.e. if I'm using openib,
>>>>>>>>> may I use the same number of MPI_Comm objects as with tcp)? Is there
>>>>>>>>> something like MPI_COMM_MAX in OpenMPI?
>>>>>>>>>
>>>>>>>>> 2/ the segfaults only appear during an MPI collective call, with very
>>>>>>>>> small messages (one int being broadcast, for instance); I followed
>>>>>>>>> the guidelines given at
>>>>>>>>> http://icl.cs.utk.edu/open-mpi/faq/?category=openfabrics#ib-small-message-rdma
>>>>>>>>> but the debug build of OpenMPI asserts if I use a different min-size
>>>>>>>>> than 255. Anyway, if I deactivate eager_rdma, the segfaults remain.
>>>>>>>>> Does the openib btl handle very small messages differently (even with
>>>>>>>>> eager_rdma deactivated) than tcp?
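>>>>>>>>>
>>>>>>>>> (For completeness: I deactivated eager_rdma via the
>>>>>>>>> btl_openib_use_eager_rdma MCA parameter -- I'm quoting the parameter
>>>>>>>>> name from memory, so treat it as an assumption -- i.e. something like:
>>>>>>>>> mpirun -np <nprocs> --mca btl openib,sm,self --mca btl_openib_use_eager_rdma 0 ./your_app)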
>>>>>>>>>
>>>>>>>> Others on the list: does coalescing happen with non-eager_rdma? If so,
>>>>>>>> then that would possibly be one difference between the openib btl and
>>>>>>>> tcp, aside from the actual protocol used.
>>>>>>>>
>>>>>>>>
>>>>>>>>> is there a way to make sure that large messages and small messages
>>>>>>>>> are handled the same way ?
>>>>>>>>>
>>>>>>>> Do you mean so they all look like eager messages? How large are the
>>>>>>>> messages we are talking about here: 1K, 1M, or 10M?
>>>>>>>>
>>>>>>>> --td
>>>>>>>>
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Eloi
>>>>>>>>>
>>>>>>>>> On Friday 17 September 2010 17:57:17 Nysal Jan wrote:
>>>>>>>>>
>>>>>>>>>> Hi Eloi,
>>>>>>>>>> Create a debug build of OpenMPI (--enable-debug) and while running
>>>>>>>>>> with the csum PML add "-mca pml_base_verbose 5" to the command line.
>>>>>>>>>> This will print the checksum details for each fragment sent over the
>>>>>>>>>> wire. I'm guessing it didn't catch anything because the BTL failed.
>>>>>>>>>> The checksum verification is done in the PML, which the BTL calls
>>>>>>>>>> via a callback function. In your case the PML callback is never
>>>>>>>>>> called because the hdr->tag is invalid. So enabling checksum
>>>>>>>>>> tracing also might not be of much use. Is it the first Bcast that
>>>>>>>>>> fails or the nth Bcast and what is the message size? I'm not sure
>>>>>>>>>> what could be the problem at this moment. I'm afraid you will have
>>>>>>>>>> to debug the BTL to find out more.
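>>>>>>>>>>
>>>>>>>>>> For reference, that amounts to something like the following (install
>>>>>>>>>> prefix and application are placeholders):
>>>>>>>>>>
>>>>>>>>>> ./configure --prefix=/opt/openmpi-debug --enable-debug && make && make install
>>>>>>>>>> mpirun -np <nprocs> -mca pml csum -mca pml_base_verbose 5 ./your_app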
>>>>>>>>>>
>>>>>>>>>> --Nysal
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 17, 2010 at 4:39 PM, Eloi Gaudry <eg_at_[hidden]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Nysal,
>>>>>>>>>>>
>>>>>>>>>>> thanks for your response.
>>>>>>>>>>>
>>>>>>>>>>> I've been unable so far to write a test case that could illustrate
>>>>>>>>>>> the hdr->tag=0 error.
>>>>>>>>>>> Actually, I'm only observing this issue when running an internode
>>>>>>>>>>> computation involving infiniband hardware from Mellanox (MT25418,
>>>>>>>>>>> ConnectX IB DDR, PCIe 2.0
>>>>>>>>>>> 2.5GT/s, rev a0) with our time-domain software.
>>>>>>>>>>>
>>>>>>>>>>> I checked, double-checked, and rechecked again every MPI use
>>>>>>>>>>> performed during a parallel computation and I couldn't find any
>>>>>>>>>>> error so far. The fact that the very same parallel computation runs
>>>>>>>>>>> flawlessly when using tcp (and disabling openib support) might
>>>>>>>>>>> indicate that the issue is located somewhere inside the openib btl
>>>>>>>>>>> or at the hardware/driver level.
>>>>>>>>>>>
>>>>>>>>>>> I've just used the "-mca pml csum" option and I haven't seen any
>>>>>>>>>>> related messages (when hdr->tag=0 and the segfault occurs).
>>>>>>>>>>> Any suggestion ?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Eloi
>>>>>>>>>>>
>>>>>>>>>>> On Friday 17 September 2010 16:03:34 Nysal Jan wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Eloi,
>>>>>>>>>>>> Sorry for the delay in response. I haven't read the entire email
>>>>>>>>>>>> thread, but do you have a test case which can reproduce this
>>>>>>>>>>>> error? Without that it will be difficult to nail down the cause.
>>>>>>>>>>>> Just to clarify, I do not work for an iwarp vendor. I can
>>>>>>>>>>>> certainly try to reproduce it on an IB system. There is also a
>>>>>>>>>>>> PML called csum, you can use it via "-mca pml csum", which will
>>>>>>>>>>>> checksum the MPI messages and verify it at the receiver side for
>>>>>>>>>>>> any data corruption. You can try using it to see if it is able to
>>>>>>>>>>>> catch anything.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> --Nysal
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 16, 2010 at 3:48 PM, Eloi Gaudry <eg_at_[hidden]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Nysal,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm sorry to interrupt, but I was wondering if you had a chance to
>>>>>>>>>>>>> look at this error.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eloi Gaudry
>>>>>>>>>>>>>
>>>>>>>>>>>>> Free Field Technologies
>>>>>>>>>>>>> Company Website: http://www.fft.be
>>>>>>>>>>>>> Company Phone: +32 10 487 959
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>>> From: Eloi Gaudry <eg_at_[hidden]>
>>>>>>>>>>>>> To: Open MPI Users <users_at_[hidden]>
>>>>>>>>>>>>> Date: Wed, 15 Sep 2010 16:27:43 +0200
>>>>>>>>>>>>> Subject: Re: [OMPI users] [openib] segfault when using openib btl
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was wondering if anybody got a chance to have a look at this
>>>>>>>>>>>>> issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wednesday 18 August 2010 09:16:26 Eloi Gaudry wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please find enclosed the output (valgrind.out.gz) from
>>>>>>>>>>>>>> /opt/openmpi-debug-1.4.2/bin/orterun -np 2 --host pbn11,pbn10
>>>>>>>>>>>>>> --mca btl openib,self --display-map --verbose --mca mpi_warn_on_fork 0
>>>>>>>>>>>>>> --mca btl_openib_want_fork_support 0 -tag-output
>>>>>>>>>>>>>> /opt/valgrind-3.5.0/bin/valgrind --tool=memcheck
>>>>>>>>>>>>>> --suppressions=/opt/openmpi-debug-1.4.2/share/openmpi/openmpi-
>>>>>>>>>>>>>> valgrind.supp --suppressions=./suppressions.python.supp
>>>>>>>>>>>>>> /opt/actran/bin/actranpy_mp ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tuesday 17 August 2010 09:32:53 Eloi Gaudry wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Monday 16 August 2010 19:14:47 Jeff Squyres wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 16, 2010, at 10:05 AM, Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I did run our application through valgrind but it couldn't
>>>>>>>>>>>>>>>>> find any "Invalid write": there is a bunch of "Invalid read"
>>>>>>>>>>>>>>>>> (I'm using 1.4.2 with the suppression file), "Use of
>>>>>>>>>>>>>>>>> uninitialized bytes" and "Conditional jump depending on
>>>>>>>>>>>>>>>>> uninitialized bytes" in different ompi routines. Some of them
>>>>>>>>>>>>>>>>> are located in btl_openib_component.c.
>>>>>>>>>>>>>>>>> I'll send you an output of valgrind shortly.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A lot of them in btl_openib_* are to be expected --
>>>>>>>>>>>>>>>> OpenFabrics uses OS-bypass methods for some of its memory,
>>>>>>>>>>>>>>>> and therefore valgrind is unaware of them (and therefore
>>>>>>>>>>>>>>>> incorrectly marks them as
>>>>>>>>>>>>>>>> uninitialized).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Would it help if I use the upcoming 1.5 version of OpenMPI? I
>>>>>>>>>>>>>>> read that a huge effort has been done to clean up the valgrind
>>>>>>>>>>>>>>> output, but maybe that doesn't concern this btl (for the reasons
>>>>>>>>>>>>>>> you mentioned).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Another question, you said that the callback function pointer
>>>>>>>>>>>>>>>>> should never be 0. But can the tag be null (hdr->tag)?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The tag is not a pointer -- it's just an integer.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I was wondering whether its value was allowed to be null.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'll send a valgrind output soon (I need to build libpython
>>>>>>>>>>>>>>> without pymalloc first).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your help,
>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16/08/2010 18:22, Jeff Squyres wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry for the delay in replying.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Odd; the values of the callback function pointer should
>>>>>>>>>>>>>>>>>> never be 0.
>>>>>>>>>>>>>>>>>> This seems to suggest some kind of memory corruption is
>>>>>>>>>>>>>>>>>> occurring.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I don't know if it's possible, because the stack trace looks
>>>>>>>>>>>>>>>>>> like you're calling through python, but can you run this
>>>>>>>>>>>>>>>>>> application through valgrind, or some other memory-checking
>>>>>>>>>>>>>>>>>> debugger?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Aug 10, 2010, at 7:15 AM, Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry, I just forgot to add the values of the function
>>>>>>>>>>>>>>>>>>> parameters:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (gdb) print reg->cbdata
>>>>>>>>>>>>>>>>>>> $1 = (void *) 0x0
>>>>>>>>>>>>>>>>>>> (gdb) print openib_btl->super
>>>>>>>>>>>>>>>>>>> $2 = {btl_component = 0x2b341edd7380, btl_eager_limit = 12288,
>>>>>>>>>>>>>>>>>>> btl_rndv_eager_limit = 12288, btl_max_send_size = 65536,
>>>>>>>>>>>>>>>>>>> btl_rdma_pipeline_send_length = 1048576,
>>>>>>>>>>>>>>>>>>> btl_rdma_pipeline_frag_size = 1048576,
>>>>>>>>>>>>>>>>>>> btl_min_rdma_pipeline_size = 1060864, btl_exclusivity = 1024,
>>>>>>>>>>>>>>>>>>> btl_latency = 10, btl_bandwidth = 800, btl_flags = 310,
>>>>>>>>>>>>>>>>>>> btl_add_procs = 0x2b341eb8ee47<mca_btl_openib_add_procs>,
>>>>>>>>>>>>>>>>>>> btl_del_procs = 0x2b341eb90156<mca_btl_openib_del_procs>,
>>>>>>>>>>>>>>>>>>> btl_register = 0,
>>>>>>>>>>>>>>>>>>> btl_finalize = 0x2b341eb93186<mca_btl_openib_finalize>,
>>>>>>>>>>>>>>>>>>> btl_alloc = 0x2b341eb90a3e<mca_btl_openib_alloc>,
>>>>>>>>>>>>>>>>>>> btl_free = 0x2b341eb91400<mca_btl_openib_free>,
>>>>>>>>>>>>>>>>>>> btl_prepare_src = 0x2b341eb91813<mca_btl_openib_prepare_src>,
>>>>>>>>>>>>>>>>>>> btl_prepare_dst = 0x2b341eb91f2e<mca_btl_openib_prepare_dst>,
>>>>>>>>>>>>>>>>>>> btl_send = 0x2b341eb94517<mca_btl_openib_send>,
>>>>>>>>>>>>>>>>>>> btl_sendi = 0x2b341eb9340d<mca_btl_openib_sendi>,
>>>>>>>>>>>>>>>>>>> btl_put = 0x2b341eb94660<mca_btl_openib_put>,
>>>>>>>>>>>>>>>>>>> btl_get = 0x2b341eb94c4e<mca_btl_openib_get>,
>>>>>>>>>>>>>>>>>>> btl_dump = 0x2b341acd45cb<mca_btl_base_dump>,
>>>>>>>>>>>>>>>>>>> btl_mpool = 0xf3f4110,
>>>>>>>>>>>>>>>>>>> btl_register_error = 0x2b341eb90565<mca_btl_openib_register_error_cb>,
>>>>>>>>>>>>>>>>>>> btl_ft_event = 0x2b341eb952e7<mca_btl_openib_ft_event>}
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (gdb) print hdr->tag
>>>>>>>>>>>>>>>>>>> $3 = 0 '\0'
>>>>>>>>>>>>>>>>>>> (gdb) print des
>>>>>>>>>>>>>>>>>>> $4 = (mca_btl_base_descriptor_t *) 0xf4a6700
>>>>>>>>>>>>>>>>>>> (gdb) print reg->cbfunc
>>>>>>>>>>>>>>>>>>> $5 = (mca_btl_base_module_recv_cb_fn_t) 0
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tuesday 10 August 2010 16:04:08 Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Here is the output of a core file generated during a segmentation
>>>>>>>>>>>>>>>>>>>> fault observed during a collective call (using openib):
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> #0 0x0000000000000000 in ?? ()
>>>>>>>>>>>>>>>>>>>> (gdb) where
>>>>>>>>>>>>>>>>>>>> #0 0x0000000000000000 in ?? ()
>>>>>>>>>>>>>>>>>>>> #1 0x00002aedbc4e05f4 in btl_openib_handle_incoming (openib_btl=0x1902f9b0, ep=0x1908a1c0, frag=0x190d9700, byte_len=18) at btl_openib_component.c:2881
>>>>>>>>>>>>>>>>>>>> #2 0x00002aedbc4e25e2 in handle_wc (device=0x19024ac0, cq=0, wc=0x7ffff279ce90) at btl_openib_component.c:3178
>>>>>>>>>>>>>>>>>>>> #3 0x00002aedbc4e2e9d in poll_device (device=0x19024ac0, count=2) at btl_openib_component.c:3318
>>>>>>>>>>>>>>>>>>>> #4 0x00002aedbc4e34b8 in progress_one_device (device=0x19024ac0) at btl_openib_component.c:3426
>>>>>>>>>>>>>>>>>>>> #5 0x00002aedbc4e3561 in btl_openib_component_progress () at btl_openib_component.c:3451
>>>>>>>>>>>>>>>>>>>> #6 0x00002aedb8b22ab8 in opal_progress () at runtime/opal_progress.c:207
>>>>>>>>>>>>>>>>>>>> #7 0x00002aedb859f497 in opal_condition_wait (c=0x2aedb888ccc0, m=0x2aedb888cd20) at ../opal/threads/condition.h:99
>>>>>>>>>>>>>>>>>>>> #8 0x00002aedb859fa31 in ompi_request_default_wait_all (count=2, requests=0x7ffff279d0e0, statuses=0x0) at request/req_wait.c:262
>>>>>>>>>>>>>>>>>>>> #9 0x00002aedbd7559ad in ompi_coll_tuned_allreduce_intra_recursivedoubling (sbuf=0x7ffff279d444, rbuf=0x7ffff279d440, count=1, dtype=0x6788220, op=0x6787a20, comm=0x19d81ff0, module=0x19d82b20) at coll_tuned_allreduce.c:223
>>>>>>>>>>>>>>>>>>>> #10 0x00002aedbd7514f7 in ompi_coll_tuned_allreduce_intra_dec_fixed (sbuf=0x7ffff279d444, rbuf=0x7ffff279d440, count=1, dtype=0x6788220, op=0x6787a20, comm=0x19d81ff0, module=0x19d82b20) at coll_tuned_decision_fixed.c:63
>>>>>>>>>>>>>>>>>>>> #11 0x00002aedb85c7792 in PMPI_Allreduce (sendbuf=0x7ffff279d444, recvbuf=0x7ffff279d440, count=1, datatype=0x6788220, op=0x6787a20, comm=0x19d81ff0) at pallreduce.c:102
>>>>>>>>>>>>>>>>>>>> #12 0x0000000004387dbf in FEMTown::MPI::Allreduce (sendbuf=0x7ffff279d444, recvbuf=0x7ffff279d440, count=1, datatype=0x6788220, op=0x6787a20, comm=0x19d81ff0) at stubs.cpp:626
>>>>>>>>>>>>>>>>>>>> #13 0x0000000004058be8 in FEMTown::Domain::align (itf={<FEMTown::Boost::shared_base_ptr<FEMTown::Domain::Interface>> = {_vptr.shared_base_ptr = 0x7ffff279d620, ptr_ = {px = 0x199942a4, pn = {pi_ = 0x6}}}, <No data fields>}) at interface.cpp:371
>>>>>>>>>>>>>>>>>>>> #14 0x00000000040cb858 in FEMTown::Field::detail::align_itfs_and_neighbhors (dim=2, set={px = 0x7ffff279d780, pn = {pi_ = 0x2f279d640}}, check_info=@0x7ffff279d7f0) at check.cpp:63
>>>>>>>>>>>>>>>>>>>> #15 0x00000000040cbfa8 in FEMTown::Field::align_elements (set={px = 0x7ffff279d950, pn = {pi_ = 0x66e08d0}}, check_info=@0x7ffff279d7f0) at check.cpp:159
>>>>>>>>>>>>>>>>>>>> #16 0x00000000039acdd4 in PyField_align_elements (self=0x0, args=0x2aaab0765050, kwds=0x19d2e950) at check.cpp:31
>>>>>>>>>>>>>>>>>>>> #17 0x0000000001fbf76d in FEMTown::Main::ExErrCatch<_object* (*)(_object*, _object*, _object*)>::exec<_object> (this=0x7ffff279dc20, s=0x0, po1=0x2aaab0765050, po2=0x19d2e950) at /home/qa/svntop/femtown/modules/main/py/exception.hpp:463
>>>>>>>>>>>>>>>>>>>> #18 0x00000000039acc82 in PyField_align_elements_ewrap (self=0x0, args=0x2aaab0765050, kwds=0x19d2e950) at check.cpp:39
>>>>>>>>>>>>>>>>>>>> #19 0x00000000044093a0 in PyEval_EvalFrameEx (f=0x19b52e90, throwflag=<value optimized out>) at Python/ceval.c:3921
>>>>>>>>>>>>>>>>>>>> #20 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab754ad50, globals=<value optimized out>, locals=<value optimized out>, args=0x3, argcount=1, kws=0x19ace4a0, kwcount=2, defs=0x2aaab75e4800, defcount=2, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #21 0x0000000004408f58 in PyEval_EvalFrameEx (f=0x19ace2d0, throwflag=<value optimized out>) at Python/ceval.c:3802
>>>>>>>>>>>>>>>>>>>> #22 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab7550120, globals=<value optimized out>, locals=<value optimized out>, args=0x7, argcount=1, kws=0x19acc418, kwcount=3, defs=0x2aaab759e958, defcount=6, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #23 0x0000000004408f58 in PyEval_EvalFrameEx (f=0x19acc1c0, throwflag=<value optimized out>) at Python/ceval.c:3802
>>>>>>>>>>>>>>>>>>>> #24 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab8b5e738, globals=<value optimized out>, locals=<value optimized out>, args=0x6, argcount=1, kws=0x19abd328, kwcount=5, defs=0x2aaab891b7e8, defcount=3, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #25 0x0000000004408f58 in PyEval_EvalFrameEx (f=0x19abcea0, throwflag=<value optimized out>) at Python/ceval.c:3802
>>>>>>>>>>>>>>>>>>>> #26 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab3eb4198, globals=<value optimized out>, locals=<value optimized out>, args=0xb, argcount=1, kws=0x19a89df0, kwcount=10, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #27 0x0000000004408f58 in PyEval_EvalFrameEx (f=0x19a89c40, throwflag=<value optimized out>) at Python/ceval.c:3802
>>>>>>>>>>>>>>>>>>>> #28 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab3eb4288, globals=<value optimized out>, locals=<value optimized out>, args=0x1, argcount=0, kws=0x19a89330, kwcount=0, defs=0x2aaab8b66668, defcount=1, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #29 0x0000000004408f58 in PyEval_EvalFrameEx (f=0x19a891b0, throwflag=<value optimized out>) at Python/ceval.c:3802
>>>>>>>>>>>>>>>>>>>> #30 0x000000000440aae9 in PyEval_EvalCodeEx (co=0x2aaab8b6a738, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968
>>>>>>>>>>>>>>>>>>>> #31 0x000000000440ac02 in PyEval_EvalCode (co=0x1902f9b0, globals=0x0, locals=0x190d9700) at Python/ceval.c:522
>>>>>>>>>>>>>>>>>>>> #32 0x000000000442853c in PyRun_StringFlags (str=0x192fd3d8 "DIRECT.Actran.main()", start=<value optimized out>, globals=0x192213d0, locals=0x192213d0, flags=0x0) at Python/pythonrun.c:1335
>>>>>>>>>>>>>>>>>>>> #33 0x0000000004429690 in PyRun_SimpleStringFlags (command=0x192fd3d8 "DIRECT.Actran.main()", flags=0x0) at Python/pythonrun.c:957
>>>>>>>>>>>>>>>>>>>> #34 0x0000000001fa1cf9 in FEMTown::Python::FEMPy::run_application (this=0x7ffff279f650) at fempy.cpp:873
>>>>>>>>>>>>>>>>>>>> #35 0x000000000434ce99 in FEMTown::Main::Batch::run (this=0x7ffff279f650) at batch.cpp:374
>>>>>>>>>>>>>>>>>>>> #36 0x0000000001f9aa25 in main (argc=8, argv=0x7ffff279fa48) at main.cpp:10
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (gdb) f 1
>>>>>>>>>>>>>>>>>>>> #1 0x00002aedbc4e05f4 in btl_openib_handle_incoming (openib_btl=0x1902f9b0, ep=0x1908a1c0, frag=0x190d9700, byte_len=18) at btl_openib_component.c:2881
>>>>>>>>>>>>>>>>>>>> 2881            reg->cbfunc( &openib_btl->super, hdr->tag, des, reg->cbdata );
>>>>>>>>>>>>>>>>>>>> Current language: auto; currently c
>>>>>>>>>>>>>>>>>>>> (gdb)
>>>>>>>>>>>>>>>>>>>> #1 0x00002aedbc4e05f4 in btl_openib_handle_incoming (openib_btl=0x1902f9b0, ep=0x1908a1c0, frag=0x190d9700, byte_len=18) at btl_openib_component.c:2881
>>>>>>>>>>>>>>>>>>>> 2881            reg->cbfunc( &openib_btl->super, hdr->tag, des, reg->cbdata );
>>>>>>>>>>>>>>>>>>>> (gdb) l 2876
>>>>>>>>>>>>>>>>>>>> 2877        if(OPAL_LIKELY(!(is_credit_msg = is_credit_message(frag)))) {
>>>>>>>>>>>>>>>>>>>> 2878            /* call registered callback */
>>>>>>>>>>>>>>>>>>>> 2879            mca_btl_active_message_callback_t* reg;
>>>>>>>>>>>>>>>>>>>> 2880            reg = mca_btl_base_active_message_trigger + hdr->tag;
>>>>>>>>>>>>>>>>>>>> 2881            reg->cbfunc( &openib_btl->super, hdr->tag, des, reg->cbdata );
>>>>>>>>>>>>>>>>>>>> 2882            if(MCA_BTL_OPENIB_RDMA_FRAG(frag)) {
>>>>>>>>>>>>>>>>>>>> 2883                cqp = (hdr->credits >> 11) & 0x0f;
>>>>>>>>>>>>>>>>>>>> 2884                hdr->credits &= 0x87ff;
>>>>>>>>>>>>>>>>>>>> 2885            } else {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Friday 16 July 2010 16:01:02 Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Edgar,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The only difference I could observe was that the
>>>>>>>>>>>>>>>>>>>>> segmentation fault sometimes appeared later during the
>>>>>>>>>>>>>>>>>>>>> parallel computation.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm running out of ideas here. I wish I could use "--mca coll
>>>>>>>>>>>>>>>>>>>>> tuned" with "--mca self,sm,tcp" so that I could check that the
>>>>>>>>>>>>>>>>>>>>> issue is not somehow limited to the tuned collective routines.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thursday 15 July 2010 17:24:24 Edgar Gabriel wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 7/15/2010 10:18 AM, Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> hi edgar,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> thanks for the tips, I'm gonna try this option as well. The
>>>>>>>>>>>>>>>>>>>>>>> segmentation fault I'm observing always happened during a
>>>>>>>>>>>>>>>>>>>>>>> collective communication indeed... it basically switches all
>>>>>>>>>>>>>>>>>>>>>>> collective communication to basic mode, right?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> sorry for my ignorance, but what's a NCA ?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> sorry, I meant to type HCA (InfiniBand networking card)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>> Edgar
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>>>>> éloi
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thursday 15 July 2010 16:20:54 Edgar Gabriel wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> you could try first to use the algorithms in the basic
>>>>>>>>>>>>>>>>>>>>>>>> module, e.g.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> mpirun -np x --mca coll basic ./mytest
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> and see whether this makes a difference. I sometimes used to
>>>>>>>>>>>>>>>>>>>>>>>> observe a (similar ?) problem in the openib btl triggered from
>>>>>>>>>>>>>>>>>>>>>>>> the tuned collective component, in cases where the ofed
>>>>>>>>>>>>>>>>>>>>>>>> libraries were installed but no NCA was found on a node. It
>>>>>>>>>>>>>>>>>>>>>>>> used to work however with the basic component.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>> Edgar
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 7/15/2010 3:08 AM, Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> hi Rolf,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> unfortunately, I couldn't get rid of that annoying
>>>>>>>>>>>>>>>>>>>>>>>>> segmentation fault when selecting another bcast algorithm.
>>>>>>>>>>>>>>>>>>>>>>>>> I'm now going to replace MPI_Bcast with a naive
>>>>>>>>>>>>>>>>>>>>>>>>> implementation (using MPI_Send and MPI_Recv) and see if
>>>>>>>>>>>>>>>>>>>>>>>>> that helps.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>>>>>>>>> éloi
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wednesday 14 July 2010 10:59:53 Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Rolf,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for your input. You're right, I missed the
>>>>>>>>>>>>>>>>>>>>>>>>>> coll_tuned_use_dynamic_rules option.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I'll check if the segmentation fault disappears when using
>>>>>>>>>>>>>>>>>>>>>>>>>> the basic bcast linear algorithm using the proper command
>>>>>>>>>>>>>>>>>>>>>>>>>> line you provided.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Tuesday 13 July 2010 20:39:59 Rolf vandeVaart wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Eloi:
>>>>>>>>>>>>>>>>>>>>>>>>>>> To select the different bcast algorithms, you need
>>>>>>>>>>>>>>>>>>>>>>>>>>> to add an extra mca parameter that tells the
>>>>>>>>>>>>>>>>>>>>>>>>>>> library to use dynamic selection. --mca
>>>>>>>>>>>>>>>>>>>>>>>>>>> coll_tuned_use_dynamic_rules 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> One way to make sure you are typing this in correctly is
>>>>>>>>>>>>>>>>>>>>>>>>>>> to use it with ompi_info. Do the following:
>>>>>>>>>>>>>>>>>>>>>>>>>>> ompi_info -mca coll_tuned_use_dynamic_rules 1 --param coll
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> You should see lots of output with all the
>>>>>>>>>>>>>>>>>>>>>>>>>>> different algorithms that can be selected for the
>>>>>>>>>>>>>>>>>>>>>>>>>>> various collectives. Therefore, you need this:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> --mca coll_tuned_use_dynamic_rules 1 --mca
>>>>>>>>>>>>>>>>>>>>>>>>>>> coll_tuned_bcast_algorithm 1
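>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> For example (application is a placeholder):
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> mpirun -np <nprocs> --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 ./your_app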
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Rolf
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 07/13/10 11:28, Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've found that "--mca coll_tuned_bcast_algorithm 1"
>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed me to switch to the basic linear algorithm.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Anyway, whatever the algorithm used, the segmentation
>>>>>>>>>>>>>>>>>>>>>>>>>>>> fault remains.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could anyone give some advice on ways to diagnose the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> issue I'm facing?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Monday 12 July 2010 10:53:58 Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm focusing on the MPI_Bcast routine that seems to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> randomly segfault when using the openib btl. I'd like to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know if there is any way to make OpenMPI switch to a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different algorithm than the default one being selected
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for MPI_Bcast.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for your help,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Friday 02 July 2010 11:06:52 Eloi Gaudry wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm observing a random segmentation fault during an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> internode parallel computation involving the openib btl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and OpenMPI-1.4.2 (the same issue can be observed with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> OpenMPI-1.3.3).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mpirun (Open MPI) 1.4.2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Report bugs to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://www.open-mpi.org/community/help/
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] *** Process received signal ***
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] Signal: Segmentation fault (11)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] Signal code: Address not mapped (1)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] Failing at address: (nil)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] [ 0] /lib64/libpthread.so.0 [0x349540e4c0]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [pbn08:02624] *** End of error message ***
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sh: line 1: 2624 Segmentation fault
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> \/share\/hpc3\/actran_suite\/Actran_11\.0\.rc2\.41872\/RedHatEL\-5\/x86_64\/bin\/actranpy_mp
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '--apl=/share/hpc3/actran_suite/Actran_11.0.rc2.41872/RedHatEL-5/x86_64/Actran_11.0.rc2.41872'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '--inputfile=/work/st25652/LSF_130073_0_47696_0/Case1_3Dreal_m4_n2.dat'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '--scratch=/scratch/st25652/LSF_130073_0_47696_0/scratch'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '--mem=3200' '--threads=1' '--errorlevel=FATAL' '--t_max=0.1'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '--parallel=domain'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If I choose not to use the openib btl (by using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --mca btl self,sm,tcp on the command line, for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instance), I don't encounter any problem and the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parallel computation runs flawlessly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to get some help to be able:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - to diagnose the issue I'm facing with the openib btl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - understand why this issue is observed only when using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the openib btl and not when using self,sm,tcp
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Any help would be very much appreciated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The outputs of ompi_info and the configure scripts of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> OpenMPI are enclosed with this email, and some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> information on the infiniband drivers as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Here is the command line used when launching a parallel
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computation using infiniband:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> path_to_openmpi/bin/mpirun -np $NPROCESS
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --hostfile host.list --mca
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> btl openib,sm,self,tcp --display-map --verbose
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --version --mca mpi_warn_on_fork 0 --mca
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> btl_openib_want_fork_support 0 [...]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and the command line used if not using infiniband:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> path_to_openmpi/bin/mpirun -np $NPROCESS
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --hostfile host.list --mca
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> btl self,sm,tcp --display-map --verbose --version
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Eloi
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>
>>>
>

-- 
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden]


