Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] trunk compilation errors in jenkins
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-07-30 11:34:02

Thanks Ralph and Gilles! All is looking good for me now; I think all tests are passing. I will check results again tomorrow.

From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Ralph Castain
Sent: Wednesday, July 30, 2014 10:49 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] trunk compilation errors in jenkins

I just fixed this one. All that was required was an ampersand: the name was being passed into the function instead of a pointer to the name.
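For context, a minimal sketch of the pattern being described. The type and function below are simplified/hypothetical stand-ins (the real comparator in orte/util/name_fns.c also takes a fields mask, ignored here for brevity); the point is just that the comparator expects pointers, so the caller must take the address of a local name with `&`.

```c
/* Hedged sketch, not the real Open MPI code: simplified name type and
 * comparator. The bug pattern is a caller passing the struct value
 * where a pointer is expected; the fix is the missing ampersand. */
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} orte_process_name_t;

/* Compare two names; returns <0, 0, >0 like the real comparator. */
int compare_names_sketch(const orte_process_name_t *name1,
                         const orte_process_name_t *name2)
{
    if (name1->jobid < name2->jobid) return -1;  /* the line from frame #2 */
    if (name1->jobid > name2->jobid) return 1;
    if (name1->vpid  < name2->vpid)  return -1;
    if (name1->vpid  > name2->vpid)  return 1;
    return 0;
}
```

A caller would write `compare_names_sketch(&peer->name, &my_name)`; dropping the `&` hands the comparator the struct's bytes as a bogus pointer, which is the crash seen below.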


On Jul 30, 2014, at 7:43 AM, Gilles GOUAILLARDET <gilles.gouaillardet_at_[hidden]> wrote:


r32353 can be seen as a suspect...
Even if it is correct, it might have further exposed the bug discussed in #4815 (i.e., we now hit the bug 100% of the time since that change).

Does the attached patch to #4815 fix the problem?

If yes, and if you see this issue as a showstopper, feel free to commit it and drop a note to #4815.
(I am AFK until tomorrow.)



Rolf vandeVaart <rvandevaart_at_[hidden]> wrote:

Just an FYI that my trunk version (r32355) does not work at all anymore if I do not include "--mca coll ^ml". Here is a stack trace from the ibm/pt2pt/send test running on a single node.

(gdb) where

#0 0x00007f6c0d1321d0 in ?? ()

#1 <signal handler called>

#2 0x00007f6c183abd52 in orte_util_compare_name_fields (fields=15 '\017', name1=0x192350001, name2=0xbaf76c) at ../../orte/util/name_fns.c:522

#3 0x00007f6c0bea17be in bcol_basesmuma_smcm_allgather_connection (sm_bcol_module=0x7f6bf3b68040, module=0xb3d200, peer_list=0x7f6c0c0a6748, back_files=0x7f6bf3ffd6c8,

    comm=0x6037a0, input=..., base_fname=0x7f6c0bea2606 "sm_payload_mem_", map_all=false) at ../../../../../ompi/mca/bcol/basesmuma/bcol_basesmuma_smcm.c:237

#4 0x00007f6c0be98307 in bcol_basesmuma_bank_init_opti (payload_block=0xbc0f60, data_offset=64, bcol_module=0x7f6bf3b68040, reg_data=0xba28c0)

    at ../../../../../ompi/mca/bcol/basesmuma/bcol_basesmuma_buf_mgmt.c:302

#5 0x00007f6c0cced386 in mca_coll_ml_register_bcols (ml_module=0xba5c40) at ../../../../../ompi/mca/coll/ml/coll_ml_module.c:510

#6 0x00007f6c0cced68f in ml_module_memory_initialization (ml_module=0xba5c40) at ../../../../../ompi/mca/coll/ml/coll_ml_module.c:558

#7 0x00007f6c0ccf06b1 in ml_discover_hierarchy (ml_module=0xba5c40) at ../../../../../ompi/mca/coll/ml/coll_ml_module.c:1539

#8 0x00007f6c0ccf4e0b in mca_coll_ml_comm_query (comm=0x6037a0, priority=0x7fffe7991b58) at ../../../../../ompi/mca/coll/ml/coll_ml_module.c:2963

#9 0x00007f6c18cc5b09 in query_2_0_0 (component=0x7f6c0cf50940, comm=0x6037a0, priority=0x7fffe7991b58, module=0x7fffe7991b90)

    at ../../../../ompi/mca/coll/base/coll_base_comm_select.c:372

#10 0x00007f6c18cc5ac8 in query (component=0x7f6c0cf50940, comm=0x6037a0, priority=0x7fffe7991b58, module=0x7fffe7991b90)

    at ../../../../ompi/mca/coll/base/coll_base_comm_select.c:355

#11 0x00007f6c18cc59d2 in check_one_component (comm=0x6037a0, component=0x7f6c0cf50940, module=0x7fffe7991b90)

    at ../../../../ompi/mca/coll/base/coll_base_comm_select.c:317

#12 0x00007f6c18cc5818 in check_components (components=0x7f6c18f46ef0, comm=0x6037a0) at ../../../../ompi/mca/coll/base/coll_base_comm_select.c:281

#13 0x00007f6c18cbe3c9 in mca_coll_base_comm_select (comm=0x6037a0) at ../../../../ompi/mca/coll/base/coll_base_comm_select.c:117

#14 0x00007f6c18c52301 in ompi_mpi_init (argc=1, argv=0x7fffe79924c8, requested=0, provided=0x7fffe79922e8) at ../../ompi/runtime/ompi_mpi_init.c:918

#15 0x00007f6c18c86e92 in PMPI_Init (argc=0x7fffe799234c, argv=0x7fffe7992340) at pinit.c:84

#16 0x0000000000401056 in main (argc=1, argv=0x7fffe79924c8) at send.c:32

(gdb) up

#1 <signal handler called>

(gdb) up

#2 0x00007f6c183abd52 in orte_util_compare_name_fields (fields=15 '\017', name1=0x192350001, name2=0xbaf76c) at ../../orte/util/name_fns.c:522

522 if (name1->jobid < name2->jobid) {

(gdb) print name1

$1 = (const orte_process_name_t *) 0x192350001

(gdb) print *name1

Cannot access memory at address 0x192350001

(gdb) print name2

$2 = (const orte_process_name_t *) 0xbaf76c

(gdb) print *name2

$3 = {jobid = 2452946945, vpid = 1}
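As an aside, the bogus name1 pointer itself tells the story: on a little-endian 64-bit build, 0x192350001 is exactly the byte image of a name struct with jobid = 0x92350001 (2452946945, matching *name2) and vpid = 1. In other words, the struct's value landed where a pointer was expected, which is the missing-ampersand bug described above. A hedged sketch demonstrating the reinterpretation (simplified type names, assumes little-endian LP64 such as x86-64):

```c
/* Sketch, assuming little-endian LP64: the 8 bytes of the name struct
 * {jobid = 0x92350001, vpid = 1}, read back as a 64-bit integer, give
 * 0x192350001 -- the same bogus value gdb printed for name1. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} orte_process_name_t;

/* Reinterpret the struct's bytes as a pointer-sized integer, the way
 * the ABI would if the struct value were passed in a pointer argument. */
uint64_t name_bytes_as_pointer(orte_process_name_t n)
{
    uint64_t p;
    memcpy(&p, &n, sizeof p);
    return p;
}
```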


>-----Original Message-----

>From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Gilles


>Sent: Wednesday, July 30, 2014 2:16 AM

>To: Open MPI Developers

>Subject: Re: [OMPI devel] trunk compilation errors in jenkins




>#4815 is indirectly related to the move:


>in bcol/basesmuma, we used to compare two ompi_process_name_t, and now

>we (try to) compare an ompi_process_name_t and an opal_process_name_t

>(which causes a glorious SIGSEGV)


>I proposed a temporary patch which is both broken and inelegant; could you

>please advise a correct solution?






>On 2014/07/27 7:37, George Bosilca wrote:

>> If you have any issue with the move, I'll be happy to help and/or support

>you on your last move toward a completely generic BTL. To facilitate your

>work I exposed a minimalistic set of OMPI information at the OPAL level. Take

>a look at opal/util/proc.h for more info, but please try not to expose more.



