
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] [EXTERNAL] Re: bug in mca framework?
From: Igor Ivanov (igor.ivanov_at_[hidden])
Date: 2014-01-15 09:45:20


Brian,

Sorry for the slow reaction.
I am not sure I understand your concern. Could you please clarify it
and review the modified patch? (The issue I found in my previous patch
was that the BTL initialization OSHMEM needs was incomplete when the
selected PML component is something other than bfo or ob1.)

Igor

On 03.01.2014 00:04, Barrett, Brian W wrote:
> Igor -
>
> Sorry for the slow reply; I was on vacation for the last week and a half.
>
> The patch doesn't look quite right to me. If the cm PML is used, the spml
> (or something else in the OSHMEM layer) is going to have to call add_procs
> on the BML to initialize the procs arrays for the BTLs.
>
> Brian
>
> On 12/23/13 3:49 AM, "Igor Ivanov" <igor.ivanov_at_[hidden]> wrote:
>
>> Brian,
>>
>> Could you look at the patch based on your suggestion? It resolves the
>> issue with the mca variable.
>>
>> Igor
>>
>> On 18.12.2013 01:48, Barrett, Brian W wrote:
>>> The proposed solution at the bottom is wrong. There aren't two different
>>> BMLs, there's one, and it lives in OMPI.
>>>
>>> The solution is to open the bml and btls in ompi_mpi_init and not in the
>>> pmls. I checked, and the bml will deal with add_procs being called
>>> multiple times on the same proc, so just moving the framework open / init
>>> is sufficient. This will also solve the MTL problem.
>>>
>>> Brian
>>>
>>> On 12/17/13 8:33 AM, "Joshua Ladd" <joshual_at_[hidden]> wrote:
>>>
>>>> I believe Devendar Bureddy nailed the root cause. I am providing his
>>>> excellent analysis below:
>>>>
>>>> From Devendar:
>>>> Out of curiosity I looked at this issue. Here are my 2 cents.
>>>> I think the issue is that the BTL components are opened and closed
>>>> twice (ompi_init, yoda), which leads to incorrect usage of var groups.
>>>> The following sequence of events produces the invalid memory:
>>>>
>>>> 1) All openib component parameters are registered in ompi_mpi_init:
>>>> main -> start_pes -> shmem_init -> oshmem_shmem_init -> ompi_mpi_init ->
>>>> mca_base_framework_open -> mca_pml_base_open ..... mca_bml_base_open...
>>>> -> btl_openib_component_register()
>>>>
>>>> * for all string variables it allocates a memory block
>>>> (var->mbv_storage = PTR)
>>>>
>>>> At this time a new var group id: 114 (of parent group id: 112) is
>>>> created for all openib component variables.
>>>>
>>>> 2) This var group is deregistered in ompi_mpi_init. That marks all
>>>> variables as invalid, but the group and vars still exist:
>>>> main -> start_pes -> shmem_init -> oshmem_shmem_init ->
>>>> mca_pml_base_select -> mca_base_components_close -> ... ->
>>>> mca_bml_base_close -> mca_base_framework_close ->
>>>> mca_base_var_group_deregister(groupid: 114)
>>>>
>>>> * all string variables' memory is deallocated (var->mbv_storage is set
>>>> to NULL)
>>>>
>>>> 3) Because of step 2), the btl_openib.so shared lib is dlclosed.
>>>>
>>>> 4) Now we reopen openib in yoda and register the openib variables
>>>> again:
>>>> main -> start_pes -> shmem_init -> oshmem_shmem_init -> _shmem_init ->
>>>> mca_base_framework_open -> mca_spml_base_open ->
>>>> mca_spml_yoda_component_open -> ..... mca_bml_base_open... ->
>>>> btl_openib_component_register -> register_variables()
>>>>
>>>> * In register_variables(), var_find() finds each variable (from the
>>>> same old group: 114) and resets it.
>>>> * For string variables, it allocates the buffers again
>>>> (var->mbv_storage = PTR).
>>>> * Note that group 114 does not belong to the yoda component.
>>>>
>>>> 5) On yoda component close, it never finds the above group (114)
>>>> because the group does not belong to this component. So it does not
>>>> call mca_base_var_group_deregister() again on the var group, and the
>>>> string var memory is not deallocated:
>>>> main -> start_pes -> shmem_init -> oshmem_shmem_init -> _shmem_init ->
>>>> mca_spml_base_select -> .. -> mca_spml_yoda_component_close ->
>>>> mca_bml_base_close -> mca_base_var_group_find()
>>>>
>>>> 6) Because of step 5), btl_openib.so is dlclosed. This step invalidates
>>>> all the openib string var memory (var->mbv_storage = PTR) allocated in
>>>> step 4).
>>>>
>>>> 7) ompi_mpi_finalize() loops through all vars, finalizes them, and
>>>> deallocates the string var memory (var->mbv_storage = PTR):
>>>> ompi_mpi_finalize -> ... -> mca_base_var_finalize
>>>>
>>>> * var->mbv_storage = PTR is invalid at this stage, which causes the
>>>> SEGFAULT.
>>>>
>>>>
>>>> This also explains why Dinar's patch, kostul_fix.patch
>>>> (http://bgate.mellanox.com/redmine/attachments/1643/kostul_fix.patch),
>>>> resolves the issue. His patch prevents you from finding the invalid,
>>>> already opened params.
>>>> I also see that in a lot of these registration functions the signature
>>>> has an entry for the project name, but currently NULL is always passed.
>>>> I see a note by Nathan in
>>>>
>>>> ../opal/mca/base/mca_base_var.c +1311
>>>> {
>>>> /* XXX -- component_update -- We will stash the project name in the
>>>> component */
>>>> return mca_base_var_register (NULL, component->mca_type_name,
>>>>
>>>>
>>>> Seems knowing the project name, oshmem, would allow us to distinguish
>>>> between the different BMLs.
>>>>
>>>> Nathan, please advise.
>>>>
>>>> Josh
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Nathan
>>>> Hjelm
>>>> Sent: Monday, December 16, 2013 12:44 PM
>>>> To: Open MPI Developers
>>>> Subject: Re: [OMPI devel] bug in mca framework?
>>>>
>>>> On Mon, Dec 16, 2013 at 05:21:05PM +0000, Joshua Ladd wrote:
>>>>> After speaking with Igor Ivanov about this this morning, he summarized
>>>>> his findings as follows:
>>>>>
>>>>> 1. Valgrind comes up clean.
>>>> That's good to hear, but unfortunate, since this really seems like a
>>>> stomping-on-memory problem.
>>>>
>>>>> 2. The issue is not reproduced with a static build.
>>>> This is a red herring. The variable itself contains garbage. The
>>>> mbv_storage pointer looked like it was on the stack, the name was not
>>>> valid, etc. Not sure how we got an mca_base_var_t into that state since
>>>> the only time we touch anything in them is in mca_base_var_finalize.
>>>> That function cleans up all of the state, so two calls to it should be
>>>> harmless.
>>>>
>>>>> 3. A bisection study reveals that problems first appear after commit:
>>>>> https://svn.open-mpi.org/trac/ompi/changeset/28800/trunk/opal/mca/base/mca_base_var.c
>>>> Possibly also a coincidence. That commit only 1) moves the group stuff
>>>> into its own file, and 2) adds the mca_base_pvar interface. It's
>>>> possible I messed something up in the rest of the code, but unlikely. I
>>>> will take another look though.
>>>>
>>>> -Nathan
>>>>
>>>>> Josh
>>>>>
>>>>> -----Original Message-----
>>>>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff
>>>>> Squyres (jsquyres)
>>>>> Sent: Monday, December 16, 2013 12:15 PM
>>>>> To: Open MPI Developers
>>>>> Subject: Re: [OMPI devel] bug in mca framework?
>>>>>
>>>>> It might be worthwhile to run this through valgrind and see if
>>>>> something is being freed incorrectly...?
>>>>>
>>>>>
>>>>> On Dec 16, 2013, at 12:11 PM, Nathan Hjelm <hjelmn_at_[hidden]> wrote:
>>>>>
>>>>>> I took a look at the stacktraces last week and could not identify
>>>>>> where the bug is. I will dig deeper this week and see if I can come
>>>>>> up with the correct fix.
>>>>>> -Nathan
>>>>>>
>>>>>> On Mon, Dec 09, 2013 at 03:17:36PM +0200, Mike Dubman wrote:
>>>>>>> Nathan,
>>>>>>> Could you please comment on Igor's observations?
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Wed, Dec 4, 2013 at 4:44 PM, Igor Ivanov <igor.ivanov_at_[hidden]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> On 04.12.2013 17:56, Jeff Squyres (jsquyres) wrote:
>>>>>>>
>>>>>>> On Dec 4, 2013, at 2:52 AM, Igor Ivanov <Igor.Ivanov_at_[hidden]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> It is the first mca variable of type string from btl/openib,
>>>>>>> 'device_param_files'. Actually you can disable it and get a failure
>>>>>>> on the second.
>>>>>>>
>>>>>>> Description of the case we see:
>>>>>>> 1. openib mca variables are registered during startup, at the
>>>>>>> component selection phase;
>>>>>>> 2. but the winner is the cm component, and the openib mca variables
>>>>>>> are deregistered as part of their mca group;
>>>>>>> 3. the mca variables are not removed from the global mca array, but
>>>>>>> they are marked as invalid and the memory for strings is freed;
>>>>>>> 4. shmem needs openib for yoda and does the bml initialization;
>>>>>>> 5. openib mca variables are registered again using light mode,
>>>>>>> i.e. by finding themselves in the global array and refreshing their
>>>>>>> fields;
>>>>>>>
>>>>>>> Can you explain what you mean by step 5? I.e., what does "using
>>>>>>> light mode" mean? Is the openib component register function invoked
>>>>>>> again?
>>>>>>>
>>>>>>> That is correct, it is called twice. "Light mode" means that
>>>>>>> mca_base_var_register() does not allocate the mca variable object
>>>>>>> again; it looks the variable up in the global array and, on finding
>>>>>>> it, updates the fields in the mca_base_var_t structure (at least
>>>>>>> mbv_storage).
>>>>>>>
>>>>>>> 6. for an unknown reason bml finalization does not clean these vars
>>>>>>> as is done in step 2;
>>>>>>> 7. mca_btl_openib.so is unloaded;
>>>>>>> 8. opal_finalize() destroys the mca variables from the global array,
>>>>>>> sees the openib variable, and tries to destroy it using an
>>>>>>> inaccessible address;
>>>>>>>
>>>>>>> So the code under discussion fixes step 6.
>>>>>>>
>>>>>>> Nathan: it sounds like an MCA var (and entire group) is registered,
>>>>>>> unregistered, and then registered again. Does the MCA var system get
>>>>>>> confused here when it tries to unregister the group a 2nd time?
>>>>>>>
>>>>>>> Probably the issue relates to incorrect recognition of whether a
>>>>>>> variable is valid or invalid during the second call of
>>>>>>> mca_base_var_deregister().
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> devel_at_[hidden]
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>
>>>>
>>> --
>>> Brian W. Barrett
>>> Scalable System Software Group
>>> Sandia National Laboratories
>>>
>>>
>>
>
> --
> Brian W. Barrett
> Scalable System Software Group
> Sandia National Laboratories
>
>
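
For readers following the thread, the register / deregister / re-register
sequence from Devendar's analysis can be modelled with a small toy registry.
This is only a sketch under stated assumptions: every name in it (toy_var_t,
toy_register, toy_group_deregister, toy_dangling_at_finalize, group id 200)
is hypothetical and is not the Open MPI API. Unloading the component is
represented by a flag rather than a real dlclose(), so the code only counts
the entries that finalize would touch instead of actually faulting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration; NOT the Open MPI types or functions. */
typedef struct {
    const char *name;
    int group_id;    /* recorded only when the entry is first created */
    bool valid;
    char **storage;  /* points at a field owned by the component */
} toy_var_t;

#define TOY_MAX_VARS 8
static toy_var_t toy_vars[TOY_MAX_VARS];
static int toy_var_count;

/* Stands in for a string parameter living inside the component library. */
static char *component_param;
static bool component_loaded;

static char *toy_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) memcpy(p, s, n);
    return p;
}

/* "Light mode": an existing entry is found by name and refreshed in place. */
static int toy_register(const char *name, int group_id, char **storage)
{
    int i = 0;
    assert(storage != NULL);
    while (i < toy_var_count && strcmp(toy_vars[i].name, name) != 0) i++;
    if (i == toy_var_count) {
        if (toy_var_count == TOY_MAX_VARS) return -1;
        toy_vars[i].name = name;
        toy_vars[i].group_id = group_id;
        toy_var_count++;
    }
    toy_vars[i].storage = storage;
    toy_vars[i].valid = true;
    *storage = toy_copy("default");
    return i;
}

/* Frees string storage and marks entries invalid, but keeps them listed. */
static void toy_group_deregister(int group_id)
{
    for (int i = 0; i < toy_var_count; i++) {
        if (toy_vars[i].valid && toy_vars[i].group_id == group_id) {
            free(*toy_vars[i].storage);
            *toy_vars[i].storage = NULL;
            toy_vars[i].valid = false;
        }
    }
}

/* Returns how many entries finalize would still try to free after unload. */
int toy_dangling_at_finalize(bool reopened_by_second_owner)
{
    enum { GROUP_FIRST_OPEN = 114, GROUP_SECOND_OWNER = 200 };
    int dangling = 0;

    toy_var_count = 0;
    component_loaded = true;
    toy_register("device_param_files", GROUP_FIRST_OPEN, &component_param); /* step 1 */
    toy_group_deregister(GROUP_FIRST_OPEN);                                 /* step 2 */
    component_loaded = false;                                               /* step 3 */

    if (reopened_by_second_owner) {
        component_loaded = true;
        /* step 4: the old entry (still in group 114) is found and refreshed */
        toy_register("device_param_files", GROUP_SECOND_OWNER, &component_param);
        toy_group_deregister(GROUP_SECOND_OWNER); /* step 5: matches nothing */
        component_loaded = false;                 /* step 6 */
    }

    /* step 7: a still-valid entry now refers to an unloaded component */
    for (int i = 0; i < toy_var_count; i++) {
        if (toy_vars[i].valid && !component_loaded) dangling++;
    }
    free(component_param); /* the toy "component" is ordinary memory */
    component_param = NULL;
    return dangling;
}
```

Calling toy_dangling_at_finalize(false) models a single open and close and
leaves nothing behind; toy_dangling_at_finalize(true) models the second open
by a different owner and leaves one entry that is still marked valid after
its component is gone, which corresponds to the pointer that finalize trips
over in the analysis above.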



  • text/plain attachment: stored
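
Brian's suggestion in the quoted thread (open the bml once from the init
path, and rely on add_procs tolerating repeated calls for the same proc) can
be illustrated in the same toy style. Again this is a sketch, not the real
BML: toy_bml_t, toy_bml_open, toy_bml_add_procs and the fixed-size proc
table are assumptions made up for the example, and it does not claim to show
how the actual bml tracks procs.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PROCS 16

/* Hypothetical stand-in for the single shared BML; not the Open MPI type. */
typedef struct {
    bool opened;
    int open_calls;            /* how many times the real open work ran */
    int procs[TOY_MAX_PROCS];
    size_t nprocs;
} toy_bml_t;

/* The open work runs once; later callers find the framework already open. */
static void toy_bml_open(toy_bml_t *bml)
{
    if (bml->opened) return;
    bml->opened = true;
    bml->open_calls++;
    bml->nprocs = 0;
}

/* Adding a proc that is already known is a no-op.
 * Returns 0 on success, -1 if the BML is not open or the table is full. */
static int toy_bml_add_procs(toy_bml_t *bml, const int *procs, size_t n)
{
    if (!bml->opened) return -1;
    for (size_t i = 0; i < n; i++) {
        bool known = false;
        for (size_t j = 0; j < bml->nprocs; j++) {
            if (bml->procs[j] == procs[i]) { known = true; break; }
        }
        if (known) continue;
        if (bml->nprocs == TOY_MAX_PROCS) return -1;
        bml->procs[bml->nprocs++] = procs[i];
    }
    return 0;
}

/* Init path opens and adds procs; a second layer repeats both calls.
 * Returns the number of procs known afterwards, or 0 if open ran twice. */
size_t toy_init_then_second_layer(void)
{
    toy_bml_t bml = {0};
    const int world[] = {0, 1, 2, 3};

    toy_bml_open(&bml);                 /* done once, in the init path */
    toy_bml_add_procs(&bml, world, 4);  /* e.g. by whichever layer is selected */
    toy_bml_open(&bml);                 /* second layer: nothing is reopened */
    toy_bml_add_procs(&bml, world, 4);  /* second layer: repeats are tolerated */

    return bml.open_calls == 1 ? bml.nprocs : 0;
}
```

The point of the design is that no layer closes and reopens the components,
so variables are registered exactly once. As Brian notes for the cm PML
case, some layer still has to make the add_procs call; the sketch only shows
that making it more than once is harmless.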