Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2011-12-15 17:49:48


What's odd is that TotalView, STAT, and GDB see the correct values despite the symbols being in the B section. What does padb do differently?

This is a dynamic, optimized build of 1.5.5rc1.

-Nathan Hjelm
HPC-3, LANL

On Thu, 15 Dec 2011, Ashley Pittman wrote:

>
> If I add a new symbol to orte/mca/debugger/base/debugger_base_open.c and declare it in orte/mca/debugger/base/base.h, the same way MPIR_proctable_size is defined, then it appears in the .so but not in the binary. If I then reference this variable in orte/tools/orterun/orterun.c, the symbol appears in orterun. It's definitely coming from that declaration; what isn't so clear is how it's getting into the binary. I can only assume that orte/mca/debugger/base/debugger_base_fns.c is linked into the binary directly and the symbol is optimised away in the case where it's defined but not used.
>
> Ashley.
>
> On 15 Dec 2011, at 22:09, Nathan Hjelm wrote:
>
>> orte/tools/orterun/debuggers.c does not exist anymore (it's not in the 1.5.5rc1 tarball). I don't know why the symbols are showing up in section B of orterun. Investigating now.
>>
>> -Nathan Hjelm
>> HPC-3, LANL
>>
>> On Thu, 15 Dec 2011, George Bosilca wrote:
>>
>>>
>>> On Dec 15, 2011, at 16:55 , Ashley Pittman wrote:
>>>
>>>> There is a problem with 1.5.5rc1 that prevents padb from loading the process table from the orterun process. What appears to be happening is that MPIR_proctable and MPIR_proctable_size are present both in orterun itself and in libopen-rte.so. The code is correctly setting them in libopen-rte.so; however, gdb picks the variables from orterun in preference, and hence padb reads NULL values.
>>>
>>> Indeed, there are two definitions, but a single declaration. This is true for both the trunk and the 1.5.
>>>
>>> ./orte/mca/debugger/base/base.h:61:ORTE_DECLSPEC extern struct MPIR_PROCDESC *MPIR_proctable;
>>> ./orte/mca/debugger/base/base.h:62:ORTE_DECLSPEC extern int MPIR_proctable_size;
>>>
>>> ./orte/mca/debugger/base/debugger_base_open.c:42:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>> ./orte/mca/debugger/base/debugger_base_open.c:43:int MPIR_proctable_size = 0;
>>>
>>> ./orte/tools/orterun/debuggers.c:142:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>> ./orte/tools/orterun/debuggers.c:143:int MPIR_proctable_size = 0;
>>>
>>> george.
>>>
>>>
>>>> Attached is a log showing the problem. The only change I made to the source is to add a call to orte_debugger_base_dump() before the return from orte_debugger_base_init_after_spawn(). It looks like this could also have been achieved via a debug setting, but I couldn't see how.
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>
>
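
The failure mode described in this thread (two copies of the same global, with the runtime writing one and the tool reading the other) can be illustrated with a small toy model in C. This is only a sketch under stated assumptions: it does not use a real linker or debugger, the `toy_*` names are hypothetical, and the "executable first" search order is an assumption made for illustration, not a statement of how gdb or padb actually resolve symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one entry per copy of a global variable in a loaded module. */
struct toy_symbol {
    const char *module; /* module that holds this copy */
    const char *name;   /* symbol name */
    int value;          /* stand-in for the variable's contents */
};

/* Assumed search order (illustrative): the executable, then the shared library. */
struct toy_symbol toy_symbols[] = {
    { "orterun",        "MPIR_proctable_size", 0 },
    { "libopen-rte.so", "MPIR_proctable_size", 0 },
};

#define TOY_NSYMS (sizeof(toy_symbols) / sizeof(toy_symbols[0]))

/* Return the first entry matching name; a non-NULL module must match as well. */
struct toy_symbol *toy_lookup(const char *module, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < TOY_NSYMS; i++) {
        if (strcmp(toy_symbols[i].name, name) != 0)
            continue;
        if (module == NULL || strcmp(toy_symbols[i].module, module) == 0)
            return &toy_symbols[i];
    }
    return NULL;
}

/* Models the runtime filling in the copy that lives in the shared library. */
void toy_runtime_set_size(int nprocs)
{
    struct toy_symbol *s = toy_lookup("libopen-rte.so", "MPIR_proctable_size");
    assert(s != NULL);
    s->value = nprocs;
}

/* Models a tool that takes the first match, which here is the executable's copy. */
int toy_read_first_match(const char *name)
{
    struct toy_symbol *s = toy_lookup(NULL, name);
    assert(s != NULL);
    return s->value;
}

/* Models a tool that qualifies the lookup with the module it expects. */
int toy_read_in_module(const char *module, const char *name)
{
    struct toy_symbol *s = toy_lookup(module, name);
    assert(s != NULL);
    return s->value;
}
```

In this model, after `toy_runtime_set_size(4)`, an unqualified `toy_read_first_match("MPIR_proctable_size")` still sees the untouched copy in the executable, while `toy_read_in_module("libopen-rte.so", "MPIR_proctable_size")` sees the value the runtime wrote. With a single definition there would be only one entry in the table and both reads would agree.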