Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] opal_process_info.job_session_dir: "not yet defined"
From: George Bosilca (bosilca_at_[hidden])
Date: 2014-07-28 14:23:06


Ignore my previous email, I see what is going on. Basically, six pieces of
data are made available to the BTL: nodename, job_session_dir,
proc_session_dir, num_local_peers, my_local_rank and, if available, cpuset.
Some of this information is available early in the startup, while the rest is
only available after the modex exchange.
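
To make the list concrete, here is a minimal sketch of a record holding those
six items together with a bitmask saying which of them have been filled in so
far. All of the sketch_* names are hypothetical and this is not the actual
layout in opal/util/proc.h; it only illustrates how a consumer could check
that a field is defined before using it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; NOT the real opal_process_info layout. */
enum sketch_field {
    SKETCH_NODENAME         = 1u << 0,
    SKETCH_JOB_SESSION_DIR  = 1u << 1,
    SKETCH_PROC_SESSION_DIR = 1u << 2,
    SKETCH_NUM_LOCAL_PEERS  = 1u << 3,
    SKETCH_MY_LOCAL_RANK    = 1u << 4,
    SKETCH_CPUSET           = 1u << 5   /* optional: may never be set */
};

typedef struct {
    const char *nodename;
    const char *job_session_dir;
    const char *proc_session_dir;
    uint32_t    num_local_peers;
    uint32_t    my_local_rank;
    const char *cpuset;
    uint32_t    valid;                  /* OR of sketch_field bits filled so far */
} sketch_proc_info_t;

/* True only if every field named in `fields` has already been filled in. */
static bool sketch_info_has(const sketch_proc_info_t *info, uint32_t fields)
{
    return (info->valid & fields) == fields;
}

/* Record the job session directory and mark it as defined. */
static void sketch_set_job_session_dir(sketch_proc_info_t *info, const char *dir)
{
    assert(NULL != dir);
    info->job_session_dir = dir;
    info->valid |= SKETCH_JOB_SESSION_DIR;
}
```

A BTL written against something like this would test
sketch_info_has(&info, SKETCH_JOB_SESSION_DIR) instead of comparing against a
placeholder string.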

Right now all of this information is initialized after the modex
exchange. We can certainly move some of it earlier, maybe right after the
RTE initialization. As Ralph said, I asked for his help on this, as he is in
the best position to know when the RTE can provide such information.
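
A rough sketch of what such a two-stage fill could look like follows. The
split of fields between the two stages is purely an assumption made for
illustration (which field the RTE can provide at which point is exactly the
open question), and every sketch_* name is hypothetical rather than an
existing OPAL or RTE symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stages of the startup sequence. */
typedef enum {
    SKETCH_STAGE_NONE = 0,      /* nothing filled in yet */
    SKETCH_STAGE_RTE_INIT,      /* filled right after RTE initialization */
    SKETCH_STAGE_MODEX          /* filled after the modex exchange */
} sketch_stage_t;

typedef struct {
    sketch_stage_t stage;
    const char *nodename;
    const char *job_session_dir;
    const char *proc_session_dir;
    int         num_local_peers;
    int         my_local_rank;
    const char *cpuset;         /* optional: may stay NULL */
} sketch_info_t;

/* ASSUMPTION: these three fields are known once the RTE is initialized. */
static void sketch_fill_after_rte_init(sketch_info_t *info, const char *nodename,
                                       const char *job_dir, const char *proc_dir)
{
    assert(SKETCH_STAGE_NONE == info->stage);
    info->nodename = nodename;
    info->job_session_dir = job_dir;
    info->proc_session_dir = proc_dir;
    info->stage = SKETCH_STAGE_RTE_INIT;
}

/* ASSUMPTION: the remaining fields are only known after the modex. */
static void sketch_fill_after_modex(sketch_info_t *info, int num_local_peers,
                                    int my_local_rank, const char *cpuset)
{
    assert(SKETCH_STAGE_RTE_INIT == info->stage);
    info->num_local_peers = num_local_peers;
    info->my_local_rank = my_local_rank;
    info->cpuset = cpuset;
    info->stage = SKETCH_STAGE_MODEX;
}

/* Returns NULL rather than a stale or placeholder value when called too early. */
static const char *sketch_job_session_dir(const sketch_info_t *info)
{
    return (info->stage >= SKETCH_STAGE_RTE_INIT) ? info->job_session_dir : NULL;
}
```

The accessor makes a too-early call visible to the caller, so a BTL that asks
before the first stage has run gets NULL back and can fail cleanly.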

Patience ...

  George.

On Mon, Jul 28, 2014 at 1:38 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> Well, I'm slightly confused, as the BTLs are initialized outside opal_init.
> There must be a specific call to mca_base_framework_open for the BTL, and
> currently this call is made in the BML. As the BML is only initialized once
> the RTE is up, I don't understand how you get the "not initialized".
>
> George.
>
>
>
> On Mon, Jul 28, 2014 at 1:29 PM, Jeff Squyres (jsquyres) <
> jsquyres_at_[hidden]> wrote:
>
>> I'd be ok with that.
>>
>> George?
>>
>>
>> On Jul 28, 2014, at 1:20 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> > I think we should not have opal_init set up the BTLs at all. Let's leave
>> > that for the RTE setup to do, as it can control the sequencing to ensure all
>> > the data is available and ready.
>> >
>> > On Jul 28, 2014, at 10:21 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>> >
>> >> Well, this is a pickle.
>> >>
>> >> I'm setting up component-wide resources in the BTL component init. I
>> >> am doing this because the creation of the modules that I return from BTL
>> >> component init (currently) *assumes* that all of these component resources
>> >> are already set up.
>> >>
>> >> If I have to defer setting up component-wide resources until later,
>> >> this means I have to put a conditional in some critical paths, right? I.e.,
>> >>
>> >>     if (component_stuff_not_setup_yet)
>> >>         do_component_setup();
>> >>
>> >> Yuck.
>> >>
>> >> Is there a better way?
>> >>
>> >> Crazy idea: should we add more hooks during the init / setup sequence?
>> >> E.g., a BTL component_init_after_rte_has_been_initialized() that is
>> >> guaranteed to be called before any module functions are invoked?
>> >>
>> >>
>> >>
>> >> On Jul 28, 2014, at 1:10 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>> >>
>> >>> This means you are trying to initialize things too early. Most of the
>> >>> information made available in opal/util/proc.h is only available once the
>> >>> RTE has been set up, i.e. only after the call to rte_init. Thus, the BTL can
>> >>> only use it after the init call...
>> >>>
>> >>> George.
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Jul 28, 2014 at 1:01 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> >>>
>> >>> On Jul 28, 2014, at 9:57 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>> >>>
>> >>>> I'm getting a value of "not yet defined" for
>> >>>> opal_process_info.job_session_dir in the usnic BTL (is this also what is
>> >>>> happening for
>> >>>> http://www.open-mpi.org/community/lists/devel/2014/07/15276.php?).
>> >>>>
>> >>>> Can the job_session_dir be defined/set up before the BTLs are set up?
>> >>>
>> >>> Yes, but the BTL setup can't be done in opal_init - it'll have to be
>> >>> the responsibility of the RTE layer to first set things up, and then init
>> >>> the BTLs. George asked me to look into this, and I will - just slammed
>> >>> today and so can't get to it until later this afternoon.
>> >>>
>> >>>>
>> >>>> --
>> >>>> Jeff Squyres
>> >>>> jsquyres_at_[hidden]
>> >>>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >>>>
>> >>>> _______________________________________________
>> >>>> devel mailing list
>> >>>> devel_at_[hidden]
>> >>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> >>>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/07/15277.php
>> >>>
>> >>
>> >
>>
>>
>
>