Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-12-12 08:09:41


On Dec 12, 2007, at 5:13 AM, Pavel Shamis (Pasha) wrote:

>> Hmm. I don't think that we want to put knowledge of XRC in the OOB
>> CPC (and vice versa). That seems like an abstraction violation.
>>
>> I didn't like that XRC knowledge was put in the connect base either,
>> but I was too busy to argue with it. :-)
>>
>> Isn't there a better way somehow? Perhaps we should have "select"
>> call *all* the functions and accept back a priority. The one with
>> the
>> highest priority then wins. This is quite similar to much of the
>> other selection logic in OMPI.
>>
>> Sidenote: Keep in mind that there are some changes coming to select
>> CPCs on a per-endpoint basis (I can't look up the trac ticket right
>> now...). This makes things a little complicated -- do we need
>> btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to
>> include/exclude CPCs (because you might need more than one CPC in a
>> single job)? That wouldn't be hard to do.
>>
>> But then what to do about if someone sets to use some XRC QPs and
>> selects to use OOB or RDMA CM?
> Error message will be reported , that for using XRC you _must_
> select xoob.

I understand that that is what it does today; I was asking my somewhat-
rhetorical question with the above text in mind (that we remove the
abstraction violations -- remove knowledge of XRC from the OOB CPC,
etc.).

>> How do we catch this and print an
>> error? It doesn't seem right to put the "if num_xrc_qps>0" check in
>> every CPC. What happens if you try to make an XRC QP when not using
>> xoob?
>
>> Where is the error detected and what kind of error message do
>> we print?
>>
> I would like to remind 2 things:
> 1. XRC little bit change QP logic. We creates one XRC qp for send and
> one for recv. As result
> it require absolutely different oob mechanism.
> 2. Current implementation doesn't allow to run with XRC + RC (or srq)
> and I don't think that we need such mixed configuration
> support at all.
>
> So as results the the XRC may work only with XOOB. If you will try to
> run it with oob error message will be reported.
> As well if you will try to run !(XRC) with XOOB error message will be
> reported too.
>
> The check is located in ompi_btl_openib_connect_base_open.

I understand all of that. I think the question is if there is a way
to de-centralize these checks such that the XOOB CPC can be the one
that figures this stuff out (for example) rather than having to put
this in the base.

> The original code in the function used oob as default connection
> method.
> I changed it to check
> in which mode we are running (xrc enabled/disabled) and make xoob
> default connection manager for xrc mode
> and oob make default for not xrc mode.

Right -- this is problematic for adding IBCM and RDMA CM; that's Jon's
point.

> I not sure that oob cpc is the best place for the logic.
> also I don't think that the "select + priority" solution will resolve
> the dependences problem:
> XRC enabled -> xoob
> XRC disabled -> oob , cm.
>
> We may push the logic outside of cpc and pass to
> ompi_btl_openib_connect_base_open()
> witch connection manger we want to use. I guess that the change also
> will be usefull for future "CPCs on a per-endpoint basis" changes.

 From an abstraction point of view, it would be nice to get all this
CPC-specific information out of the base and into the CPCs that they
belong to.

>> Also, I'm not sure why the #if/#else is there for xoob (i.e., having
>> empty/printf functions there when XRC support is compiled out) -- if
>> xoob was disabled during compilation, then it simply should not be
>> compiled and therefore not be there at all at run-time. If a user
>> selects the xoob CPC, then we should print a message from the base
>> that that CPC doesn't exist in the installation. Correspondingly, we
>> can make an info MCA param in the btl openib that shows which CPCs
>> are
>> available (we already have this information -- it's easy enough to
>> put
>> this in an info MCA param).
>>
> Sounds reasonable for me.
>
> Pasha.
>>
>> On Dec 11, 2007, at 6:59 PM, Jon Mason wrote:
>>
>>
>>> Currently, alternate CMs cannot be called because
>>> ompi_btl_openib_connect_base_open forces a choice of either oob or
>>> xoob
>>> (and goes into an erroneous error path if you pick something else).
>>> This patch reorganizes ompi_btl_openib_connect_base_open so that new
>>> functions can easily be added. New Open functions were added to oob
>>> and xoob for the error handling.
>>>
>>> I tested calling oob, xoob, and rdma_cm. oob happily allows
>>> connections
>>> to be established and throws no errors. xoob fails because ompi
>>> does
>>> not have it compiled in (and I have no connectx cards). rdma_cm
>>> calls
>>> the empty hooks and exits without connecting (thus throwing
>>> non-connection errors). All expected behavior.
>>>
>>> Since this patch fixes the existing behavior, and is not necessarily
>>> tied to my implementing of rdma_cm, I think it is acceptable to go
>>> in
>>> now.
>>>
>>> Thanks,
>>> Jon
>>>
>>> Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c
>>> ===================================================================
>>> --- ompi/mca/btl/openib/connect/btl_openib_connect_base.c (revision
>>> 16937)
>>> +++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c (working
>>> copy)
>>> @@ -50,8 +50,8 @@
>>> */
>>> int ompi_btl_openib_connect_base_open(void)
>>> {
>>> - int i;
>>> - char **temp, *a, *b;
>>> + char **temp, *a, *b, *defval;
>>> + int i, ret = OMPI_ERROR;
>>>
>>> /* Make an MCA parameter to select which connect module to use */
>>> temp = NULL;
>>> @@ -66,40 +66,23 @@
>>>
>>> /* For XRC qps we must to use XOOB connection manager */
>>> if (mca_btl_openib_component.num_xrc_qps > 0) {
>>> -
>>> mca_base_param_reg_string
>>> (&mca_btl_openib_component.super.btl_version,
>>> - "connect",
>>> - b, false, false,
>>> - "xoob", &param);
>>> - if (0 != strcmp("xoob", param)) {
>>> - opal_show_help("help-mpi-btl-openib.txt",
>>> - "XRC with wrong OOB", true,
>>> - orte_system_info.nodename,
>>> - mca_btl_openib_component.num_xrc_qps);
>>> - return OMPI_ERROR;
>>> - }
>>> + defval = "xoob";
>>> } else { /* For all others we should use OOB */
>>> -
>>> mca_base_param_reg_string
>>> (&mca_btl_openib_component.super.btl_version,
>>> - "connect",
>>> - b, false, false,
>>> - "oob", &param);
>>> - if (0 != strcmp("oob", param)) {
>>> - opal_show_help("help-mpi-btl-openib.txt",
>>> - "SRQ or PP with wrong OOB", true,
>>> - orte_system_info.nodename,
>>> - mca_btl_openib_component.num_srq_qps,
>>> - mca_btl_openib_component.num_pp_qps);
>>> - return OMPI_ERROR;
>>> - }
>>> + defval = "oob";
>>> }
>>>
>>> +
>>> mca_base_param_reg_string
>>> (&mca_btl_openib_component.super.btl_version,
>>> + "connect", b, false, false, defval, &param);
>>> +
>>> /* Call the open function on all the connect modules */
>>> for (i = 0; NULL != all[i]; ++i) {
>>> - if (NULL != all[i]->bcf_open) {
>>> - all[i]->bcf_open();
>>> + if (0 == strcmp(all[i]->bcf_name, param)) {
>>> + ret = all[i]->bcf_open();
>>> + break;
>>> }
>>> }
>>>
>>> - return OMPI_SUCCESS;
>>> + return ret;
>>> }
>>>
>>>
>>> Index: ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c
>>> ===================================================================
>>> --- ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c (revision
>>> 16937)
>>> +++ ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c (working
>>> copy)
>>> @@ -28,11 +28,7 @@
>>>
>>> static int ibcm_open(void)
>>> {
>>> -
>>> mca_base_param_reg_int(&mca_btl_openib_component.super.btl_version,
>>> - "btl_openib_connect_ibcm_foo",
>>> - "A dummy help message", false, false,
>>> - 17, NULL);
>>> -
>>> + printf("ibcm open\n");
>>> return OMPI_SUCCESS;
>>> }
>>>
>>> Index: ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
>>> ===================================================================
>>> --- ompi/mca/btl/openib/connect/btl_openib_connect_oob.c (revision
>>> 16937)
>>> +++ ompi/mca/btl/openib/connect/btl_openib_connect_oob.c (working
>>> copy)
>>> @@ -22,6 +22,8 @@
>>>
>>> #include "ompi_config.h"
>>>
>>> +#include "opal/util/show_help.h"
>>> +
>>> #include "orte/mca/ns/base/base.h"
>>> #include "orte/mca/oob/base/base.h"
>>> #include "orte/mca/rml/rml.h"
>>> @@ -39,6 +41,7 @@
>>> ENDPOINT_CONNECT_ACK
>>> } connect_message_type_t;
>>>
>>> +static int oob_open(void);
>>> static int oob_init(void);
>>> static int oob_start_connect(mca_btl_base_endpoint_t *e);
>>> static int oob_finalize(void);
>>> @@ -67,8 +70,8 @@
>>> */
>>> ompi_btl_openib_connect_base_funcs_t ompi_btl_openib_connect_oob = {
>>> "oob",
>>> - /* No need for "open */
>>> - NULL,
>>> + /* Open */
>>> + oob_open,
>>> /* Init */
>>> oob_init,
>>> /* Connect */
>>> @@ -78,6 +81,23 @@
>>> };
>>>
>>> /*
>>> + * Open function.
>>> + */
>>> +static int oob_open(void)
>>> +{
>>> + if (mca_btl_openib_component.num_xrc_qps > 0) {
>>> + opal_show_help("help-mpi-btl-openib.txt",
>>> + "SRQ or PP with wrong OOB", true,
>>> + orte_system_info.nodename,
>>> + mca_btl_openib_component.num_srq_qps,
>>> + mca_btl_openib_component.num_pp_qps);
>>> + return OMPI_ERROR;
>>> + }
>>> +
>>> + return OMPI_SUCCESS;
>>> +}
>>> +
>>> +/*
>>> * Init function. Post non-blocking RML receive to accept incoming
>>> * connection requests.
>>> */
>>> Index: ompi/mca/btl/openib/connect/btl_openib_connect_rdma_cm.c
>>> ===================================================================
>>> --- ompi/mca/btl/openib/connect/btl_openib_connect_rdma_cm.c
>>> (revision 16937)
>>> +++ ompi/mca/btl/openib/connect/btl_openib_connect_rdma_cm.c
>>> (working copy)
>>> @@ -28,11 +28,7 @@
>>>
>>> static int rdma_cm_open(void)
>>> {
>>> -
>>> mca_base_param_reg_int(&mca_btl_openib_component.super.btl_version,
>>> - "btl_openib_connect_rdma_cm_foo",
>>> - "A dummy help message", false, false,
>>> - 17, NULL);
>>> -
>>> + printf("rdma cm open\n");
>>> return OMPI_SUCCESS;
>>> }
>>>
>>> Index: ompi/mca/btl/openib/connect/btl_openib_connect_xoob.c
>>> ===================================================================
>>> --- ompi/mca/btl/openib/connect/btl_openib_connect_xoob.c (revision
>>> 16937)
>>> +++ ompi/mca/btl/openib/connect/btl_openib_connect_xoob.c (working
>>> copy)
>>> @@ -10,6 +10,8 @@
>>>
>>> #include "ompi_config.h"
>>>
>>> +#include "opal/util/show_help.h"
>>> +
>>> #include "orte/mca/ns/base/base.h"
>>> #include "orte/mca/oob/base/base.h"
>>> #include "orte/mca/rml/rml.h"
>>> @@ -22,6 +24,7 @@
>>> #include "btl_openib_xrc.h"
>>> #include "connect/connect.h"
>>>
>>> +static int xoob_open(void);
>>> static int xoob_init(void);
>>> static int xoob_start_connect(mca_btl_base_endpoint_t *e);
>>> static int xoob_finalize(void);
>>> @@ -32,8 +35,8 @@
>>> */
>>> ompi_btl_openib_connect_base_funcs_t ompi_btl_openib_connect_xoob
>>> = {
>>> "xoob",
>>> - /* No need for "open */
>>> - NULL,
>>> + /* Open */
>>> + xoob_open,
>>> /* Init */
>>> xoob_init,
>>> /* Connect */
>>> @@ -99,7 +102,24 @@
>>>
>>> static int init_rem_info(mca_btl_openib_rem_info_t *rem_info);
>>> static void free_rem_info(mca_btl_openib_rem_info_t *rem_info);
>>> +
>>> /*
>>> + * Open function.
>>> + */
>>> +static int xoob_open(void)
>>> +{
>>> + if (mca_btl_openib_component.num_xrc_qps <= 0) {
>>> + opal_show_help("help-mpi-btl-openib.txt",
>>> + "XRC with wrong OOB", true,
>>> + orte_system_info.nodename,
>>> + mca_btl_openib_component.num_xrc_qps);
>>> + return OMPI_ERROR;
>>> + }
>>> +
>>> + return OMPI_SUCCESS;
>>> +}
>>> +
>>> +/*
>>> * Init function. Post non-blocking RML receive to accept incoming
>>> * connection requests.
>>> */
>>> @@ -834,6 +854,12 @@
>>>
>>> #else
>>> /* In case if the XRC was disabled during compilation we will print
>>> message and return error */
>>> +static int xoob_open(void)
>>> +{
>>> + printf("xoob open\n");
>>> + return OMPI_ERR_NOT_IMPLEMENTED;
>>> +}
>>> +
>>> static int xoob_init(void)
>>> {
>>> printf("xoob init\n");
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>>
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
Cisco Systems