Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] bug in mca framework?
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-12-04 08:56:17


On Dec 4, 2013, at 2:52 AM, Igor Ivanov <Igor.Ivanov_at_[hidden]> wrote:

> It is the first string-typed MCA variable from btl/openib, 'device_param_files'. If you disable it, you get the failure on the second one.
>
> Description of the case we see:
> 1. openib MCA variables are registered during startup, as a stage of the component selection phase;
> 2. but the winner is the cm component, and the openib MCA variables are deregistered as part of their MCA group;
> 3. the MCA variables are not removed from the global MCA array, but they are marked as invalid and the memory for the string is freed;
> 4. shmem needs openib for yoda and does bml initialization;
> 5. openib MCA variables are registered again using light mode, by searching for themselves in the global array and refreshing their fields;

Can you explain what you mean by step 5? I.e., what does "using light mode" mean? Is the openib component register function invoked again?

> 6. for an unknown reason, bml finalization does not clean up these vars as is done in step 2;
> 7. mca_btl_openib.so is unloaded;
> 8. opal_finalize() destroys the MCA variables from the global array, observes openib's variable, and tries to destroy it using an inaccessible address;
>
> So the code under discussion fixes step 6.

Nathan: it sounds like an MCA var (and entire group) is registered, unregistered, and then registered again. Does the MCA var system get confused here when it tries to unregister the group a 2nd time?
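
To make the register / deregister / re-register sequence quoted above easier to reason about, here is a minimal toy model in C. It is only a sketch under stated assumptions and is not the actual MCA var code: every name in it (toy_var_t, toy_register, toy_deregister, toy_finalize) is hypothetical, and it assumes that a deregistered variable keeps its slot in the global table and that re-registration reuses that slot by name. In this model toy_finalize() merely counts the variables that are still valid at shutdown; in the real case those are the ones whose storage would already be gone once the component has been unloaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model of a global variable table; NOT the real MCA code. */
#define TOY_MAX_VARS 8

typedef struct {
    char name[64];
    char *value;   /* heap-owned string storage */
    bool valid;    /* false after deregistration; the slot itself stays */
} toy_var_t;

toy_var_t toy_table[TOY_MAX_VARS];
int toy_count = 0;

/* Portable replacement for strdup(), which is not in ISO C before C23. */
static char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (NULL != copy) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Return the slot index for a name, valid or not, or -1 if absent. */
static int toy_find(const char *name)
{
    for (int i = 0; i < toy_count; ++i) {
        if (0 == strcmp(toy_table[i].name, name)) {
            return i;
        }
    }
    return -1;
}

/* Steps 1 and 5: register a string variable. An existing slot with the
 * same name is reused and its fields are refreshed. */
int toy_register(const char *name, const char *value)
{
    int i = toy_find(name);
    if (i < 0) {
        if (TOY_MAX_VARS == toy_count) {
            return -1;
        }
        i = toy_count++;
        snprintf(toy_table[i].name, sizeof(toy_table[i].name), "%s", name);
        toy_table[i].value = NULL;
    }
    free(toy_table[i].value);   /* NULL for a new or deregistered slot */
    toy_table[i].value = toy_strdup(value);
    toy_table[i].valid = (NULL != toy_table[i].value);
    return toy_table[i].valid ? i : -1;
}

/* Steps 2 and 3: mark the variable invalid and free its string, but keep
 * the slot in the table. */
void toy_deregister(const char *name)
{
    int i = toy_find(name);
    if (i < 0 || !toy_table[i].valid) {
        return;
    }
    free(toy_table[i].value);
    toy_table[i].value = NULL;
    toy_table[i].valid = false;
}

/* Step 8: destroy everything. Returns how many variables were still valid,
 * i.e. how many were never deregistered by their owner (step 6). */
int toy_finalize(void)
{
    int still_valid = 0;
    for (int i = 0; i < toy_count; ++i) {
        if (toy_table[i].valid) {
            ++still_valid;
            free(toy_table[i].value);
        }
    }
    memset(toy_table, 0, sizeof(toy_table));
    toy_count = 0;
    return still_valid;
}
```

Calling toy_register(), toy_deregister(), toy_register() and then toy_finalize() on the same name leaves one variable still valid at finalize time, which corresponds to the missing cleanup in step 6. Adding a second toy_deregister() before toy_finalize() brings that count back to zero.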

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/