Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: bug in mca framework?
From: Igor Ivanov (igor.ivanov_at_[hidden])
Date: 2013-12-23 05:49:31


Brian,

Could you look at the attached patch, which is based on your suggestion?
It resolves the issue with the MCA variable.

Igor

On 18.12.2013 01:48, Barrett, Brian W wrote:
> The proposed solution at the bottom is wrong. There aren't two different
> BMLs, there's one, and it lives in OMPI.
>
> The solution is to open the bml and btls in ompi_mpi_init and not in the
> pmls. I checked, and the bml will deal with add_procs being called
> multiple times on the same proc, so just moving the framework open / init
> is sufficient. This will also solve the MTL problem.
>
> Brian
>
> On 12/17/13 8:33 AM, "Joshua Ladd" <joshual_at_[hidden]> wrote:
>
>> I believe Devendar Bureddy nailed the root cause. I am providing his
>> excellent analysis below:
>>
>> From Devendar:
>> Out of curiosity I looked at this issue. Here are my 2 cents.
>> I think the issue is that the BTL components are opened and closed
>> twice (ompi_init, yoda), which leads to incorrect usage of var groups.
>> The following sequence of events creates the invalid memory:
>>
>> 1) all openib component parameters registered in ompi_mpi_init
>> main > start_pes> shmem_init -> oshmem_shmem_init -> ompi_mpi_init ->
>> mca_base_framework_open -> mca_pml_base_open ..... mca_bml_base_open...
>> -> btl_openib_component_register()
>>
>> * For all string variables it allocates a memory block
>> (var->mbv_storage = PTR)
>>
>> At this time a new var group id:114 (of parent group id: 112) is created
>> for all openib component variables.
>>
>> 2) This var group is de-registered in ompi_mpi_init. This marks all
>> variables as invalid, but the group and vars still exist.
>> main > start_pes> shmem_init -> oshmem_shmem_init -> mca_pml_base_select
>> -> mca_base_components_close -> ... -> mca_bml_base_close ->
>> mca_base_framework_close -> mca_base_var_group_deregister(groupid: 114)
>>
>> * All string variable memory is deallocated (set var->mbv_storage = NULL;)
>>
>> 3) Because of step 2), the btl_openib.so shared lib is dlclosed.
>>
>> 4) Now we are reopening openib in yoda and registering the openib
>> variables again.
>> main > start_pes> shmem_init > oshmem_shmem_init -> _shmem_init ->
>> mca_base_framework_open -> mca_spml_base_open>
>> mca_spml_yoda_component_open-> ..... mca_bml_base_open... ->
>> btl_openib_component_register -> register_variables()
>>
>> * In register_variables(), var_find() finds this variable (from the same
>> old group: 114) and resets the variables.
>> * For string variables, it allocates the buffers again
>> (var->mbv_storage = PTR)
>> * Note that group 114 does not belong to the yoda component.
>>
>> 5) In the yoda component close, it never finds the above group (114)
>> because the group does not belong to this component. So
>> mca_base_var_group_deregister() is not called again on the var group,
>> and the string var memory is not deallocated.
>> main > start_pes> shmem_init > oshmem_shmem_init -> _shmem_init ->
>> mca_spml_base_select ->..> mca_spml_yoda_component_close ->
>> mca_bml_base_close -> mca_base_var_group_find().
>>
>> 6) Because of step 5), btl_openib.so is dlclosed. This step
>> invalidates all of the openib string var memory (var->mbv_storage = PTR)
>> allocated in step 4).
>>
>> 7) ompi_mpi_finalize() loops through all vars, finalizes them, and
>> deallocates the string var memory (var->mbv_storage = PTR).
>> ompi_mpi_finalize >...> mca_base_var_finalize
>>
>> * var->mbv_storage = PTR is invalid at this stage, which causes the
>> SEGFAULT.
>>
>>
>> This also explains why Dinar's patch, kostul_fix.patch
>> (http://bgate.mellanox.com/redmine/attachments/1643/kostul_fix.patch),
>> resolves the issue. His patch prevents you from finding the invalid
>> already opened params.
>> So, I see that in a lot of these registration functions the signature
>> has an entry for the project name, but for now NULL is always passed. I
>> see a note by Nathan in
>>
>> ../opal/mca/base/mca_base_var.c +1311
>> {
>> /* XXX -- component_update -- We will stash the project name in the
>> component */
>> return mca_base_var_register (NULL, component->mca_type_name,
>>
>>
>> Seems knowing the project name, oshmem, would allow us to distinguish
>> between the different BMLs.
>>
>> Nathan, please advise.
>>
>> Josh
>>
>>
>> -----Original Message-----
>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Nathan Hjelm
>> Sent: Monday, December 16, 2013 12:44 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] bug in mca framework?
>>
>> On Mon, Dec 16, 2013 at 05:21:05PM +0000, Joshua Ladd wrote:
>>> After speaking with Igor Ivanov about this this morning, he summarized
>>> his findings as follows:
>>>
>>> 1. Valgrind comes up clean.
>> That's good to hear, but unfortunate, since this really seems like a
>> stomping-on-memory problem.
>>
>>> 2. The issue is not reproduced with a static build.
>> This is a red herring. The variable itself contains garbage. The
>> mbv_storage pointer looked like it was on the stack, the name was not
>> valid, etc. Not sure how we got an mca_base_var_t into that state, since
>> the only time we touch anything in them is in mca_base_var_finalize. That
>> function cleans up all of the state, so two calls to it should be
>> harmless.
>>
>>> 3. A bisection study reveals that problems first appear after commit:
>>> https://svn.open-mpi.org/trac/ompi/changeset/28800/trunk/opal/mca/base
>>> /mca_base_var.c
>> Possibly also a coincidence. That commit only 1) moves the group stuff
>> into its own file, and 2) adds the mca_base_pvar interface. It's possible
>> I messed something up in the rest of the code, but unlikely. I will take
>> another look though.
>>
>> -Nathan
>>
>>>
>>> Josh
>>>
>>> -----Original Message-----
>>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff
>>> Squyres (jsquyres)
>>> Sent: Monday, December 16, 2013 12:15 PM
>>> To: Open MPI Developers
>>> Subject: Re: [OMPI devel] bug in mca framework?
>>>
>>> It might be worthwhile to run this through valgrind and see if
>>> something is being freed incorrectly...?
>>>
>>>
>>> On Dec 16, 2013, at 12:11 PM, Nathan Hjelm <hjelmn_at_[hidden]> wrote:
>>>
>>>> I took a look at the stacktraces last week and could not identify
>>>> where the bug is. I will dig deeper this week and see if I can come
>>>> up with the correct fix.
>>>> -Nathan
>>>>
>>>> On Mon, Dec 09, 2013 at 03:17:36PM +0200, Mike Dubman wrote:
>>>>> Nathan,
>>>>> Could you please comment on Igor's observations?
>>>>> Thanks
>>>>>
>>>>> On Wed, Dec 4, 2013 at 4:44 PM, Igor Ivanov <igor.ivanov_at_[hidden]>
>>>>> wrote:
>>>>>
>>>>> On 04.12.2013 17:56, Jeff Squyres (jsquyres) wrote:
>>>>>
>>>>> On Dec 4, 2013, at 2:52 AM, Igor Ivanov <Igor.Ivanov_at_[hidden]>
>>>>> wrote:
>>>>>
>>>>> It is the first mca variable of string type from btl/openib,
>>>>> 'device_param_files'. Actually, you can disable it and get the
>>>>> failure on the second one.
>>>>>
>>>>> Description of the case we see:
>>>>> 1. openib mca variables are registered during startup, at the
>>>>> component select phase;
>>>>> 2. but the winner is the cm component, and the openib mca variables
>>>>> are deregistered as part of their mca group;
>>>>> 3. the mca variables are not removed from the global mca array, but
>>>>> they are marked as invalid and the memory for strings is freed;
>>>>> 4. shmem needs openib for yoda and does bml initialization;
>>>>> 5. openib mca variables are registered again using light mode, i.e.
>>>>> by finding themselves in the global array and refreshing their
>>>>> fields;
>>>>>
>>>>> Can you explain what you mean by step 5? I.e., what does "using
>>>>> light mode" mean? Is the openib component register function invoked
>>>>> again?
>>>>>
>>>>> That is correct, it is called twice. "Light mode" means that
>>>>> mca_base_var_register() does not allocate the mca variable object
>>>>> again; it looks the variable up in the global array and, on finding
>>>>> it, updates the fields in the mca_base_var_t structure (at least
>>>>> mbv_storage).
>>>>>
>>>>> 6. for an unknown reason, bml finalization does not clean these
>>>>> vars up as is done in step 2;
>>>>> 7. mca_btl_openib.so is unloaded;
>>>>> 8. opal_finalize() destroys the mca variables from the global array,
>>>>> reaches openib's variable, and tries to destroy it using an
>>>>> inaccessible address;
>>>>>
>>>>> So a code that is under discussion fixes step 6.
>>>>>
>>>>> Nathan: it sounds like an MCA var (and entire group) is registered,
>>>>> unregistered, and then registered again. Does the MCA var system
>>>>> get confused here when it tries to unregister the group a 2nd time?
>>>>>
>>>>> The issue probably relates to incorrect recognition of whether a
>>>>> variable is valid or invalid during the second call of
>>>>> mca_base_var_deregister().
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>
>
> --
> Brian W. Barrett
> Scalable System Software Group
> Sandia National Laboratories
>
>
>
>
>
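
For anyone following the thread, the register / deregister / re-register sequence described in Devendar's steps 1-7 can be replayed with a small toy model. This is only a sketch under stated assumptions: every type and function below (toy_var_t, toy_registry_t, toy_register, toy_group_deregister, toy_finalize, toy_run) is hypothetical and is not the real MCA var API, dlopen/dlclose is modeled with a flag and a load counter instead of real shared libraries, and group id 200 is a made-up id standing in for "a group that is not 114". It only counts the variables that finalize would touch after their owning component is gone; it does not reproduce the actual crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins only: none of these are real Open MPI types or functions. */
enum { TOY_MAX_VARS = 8 };

typedef struct {
    const char *name;   /* e.g. "device_param_files" */
    int group_id;       /* var group the variable was first registered in */
    bool valid;         /* cleared when the group is deregistered */
    bool has_storage;   /* models var->mbv_storage != NULL */
    int owner_gen;      /* component load generation that owns the storage */
} toy_var_t;

typedef struct {
    toy_var_t vars[TOY_MAX_VARS];
    int nvars;
    int gen;            /* bumped on every simulated dlopen */
    bool loaded;        /* false after a simulated dlclose */
} toy_registry_t;

static void toy_load(toy_registry_t *r)   { r->gen++; r->loaded = true; }
static void toy_unload(toy_registry_t *r) { r->loaded = false; }

/* Register a variable. If the name already exists, refresh it in place
 * ("light mode") and keep its old group id, as described in the thread. */
static void toy_register(toy_registry_t *r, const char *name, int group_id)
{
    toy_var_t *v = NULL;
    for (int i = 0; i < r->nvars; i++) {
        if (strcmp(r->vars[i].name, name) == 0) {
            v = &r->vars[i];
            break;
        }
    }
    if (v == NULL) {
        assert(r->nvars < TOY_MAX_VARS);
        v = &r->vars[r->nvars++];
        v->name = name;
        v->group_id = group_id;
    }
    v->valid = true;
    v->has_storage = true;
    v->owner_gen = r->gen;
}

/* Invalidate every variable in the group and release its storage. */
static void toy_group_deregister(toy_registry_t *r, int group_id)
{
    for (int i = 0; i < r->nvars; i++) {
        if (r->vars[i].group_id == group_id) {
            r->vars[i].valid = false;
            r->vars[i].has_storage = false;
        }
    }
}

/* Count variables whose storage would be touched after its owner is gone. */
static int toy_finalize(toy_registry_t *r)
{
    int dangling = 0;
    for (int i = 0; i < r->nvars; i++) {
        toy_var_t *v = &r->vars[i];
        if (v->has_storage && (!r->loaded || v->owner_gen != r->gen)) {
            dangling++;
        }
        v->has_storage = false;
    }
    return dangling;
}

/* reopen == true replays steps 1-7; false opens and closes exactly once. */
int toy_run(bool reopen)
{
    toy_registry_t r = {0};
    toy_load(&r);
    toy_register(&r, "device_param_files", 114);     /* step 1 */
    toy_group_deregister(&r, 114);                   /* step 2 */
    toy_unload(&r);                                  /* step 3 */
    if (reopen) {
        toy_load(&r);
        toy_register(&r, "device_param_files", 114); /* step 4 */
        toy_group_deregister(&r, 200); /* step 5: other group, no-op */
        toy_unload(&r);                /* step 6 */
    }
    return toy_finalize(&r);           /* step 7 */
}
```

In this model toy_run(true) returns 1 (one variable still claims storage owned by an unloaded component) and toy_run(false) returns 0. The second case is only a rough stand-in for opening the bml and btls a single time, as Brian suggests doing in ompi_mpi_init.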



  • text/plain attachment: stored