Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r17398
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-02-09 08:12:01


Terry, George, and I had a long conversation about this on the phone
yesterday. I put up all the notes here:

     https://svn.open-mpi.org/trac/ompi/ticket/1207#comment:9

Short version:

1. We can add a (char*) to the base endpoint struct for the BTL/MTL to
fill in with basic interface-name information (e.g., "eth0",
"mthca0:1", etc.); a sketch of the idea follows this list. It should
require no PML/BTL/MTL interface changes, but will require a little
new logic in each PML/BTL/MTL (they are free to leave the value as
NULL if the information is unavailable/unsupported).

2. A new "preconnect all" functionality is also planned that will be
much better than the current one.

3. How accurate the information in the "print the map" functionality is
depends on whether preconnect has been invoked. In many (most?) cases
(e.g., a homogeneous cluster with one high-speed network), the "best
guess" information that is available before the preconnect is likely
good enough.
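
A minimal sketch of the idea in point 1 (the struct and field names
below are hypothetical illustrations, not the actual Open MPI
declarations):

{{{
/* Hypothetical example only -- not the real endpoint struct. */
struct mca_btl_base_endpoint_example_t {
    /* ... existing per-peer state owned by the BTL/MTL ... */

    /* Human-readable interface name, e.g. "eth0" or "mthca0:1".
       The BTL/MTL fills this in if it can; it may legitimately be
       left NULL when the information is unavailable or unsupported. */
    char *endpoint_ifname;
};
}}}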

See the ticket for more details.

On Feb 7, 2008, at 4:46 PM, George Bosilca wrote:

> Unfortunately, without the preconnect flag set, the output is
> inaccurate. Because the connections between peers are made lazily, at
> best you will see the BTL that is supposed to be used, not the one
> that will actually be used.
>
> In order to give an accurate view of the connections, you will have
> to force all the connections. This is not currently possible: there
> is nothing that guarantees that all connections have been
> established. The most likely way to achieve this is to send a small
> and then a large message between all peers.
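
A rough sketch of that "small and then a large message between all
peers" idea in plain MPI (this is not the actual Open MPI preconnect
code, and the 1 MB "large" size is only an assumption meant to exceed
a typical eager limit):

{{{
#include <mpi.h>
#include <stdlib.h>

/* Exchange a tiny and then a large message with every peer so that the
   lazily-created connections (and the rendezvous path) get forced open. */
static void preconnect_all(MPI_Comm comm)
{
    int rank, size, peer;
    char small = 0;
    char *large = calloc(1 << 20, 1);   /* assumed "large enough" size */

    if (NULL == large) return;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    for (peer = 0; peer < size; ++peer) {
        if (peer == rank) continue;
        /* Pairwise replace-exchange avoids send/send deadlock. */
        MPI_Sendrecv_replace(&small, 1, MPI_CHAR, peer, 0, peer, 0,
                             comm, MPI_STATUS_IGNORE);
        MPI_Sendrecv_replace(large, 1 << 20, MPI_CHAR, peer, 1, peer, 1,
                             comm, MPI_STATUS_IGNORE);
    }
    free(large);
}
}}}

Called right after MPI_Init, something like this would make a later
"print the map" pass reflect connections that were actually
established rather than the best guess.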
>
> The main problem we have with showing the connection table is that
> there is no requirement in the add_proc to connect to the remote
> proc. What you get from add_proc is a bitmap of possible connections.
> The real connection is established only when a message is sent to the
> peer.
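
To illustrate the distinction (the types and names below are
hypothetical, not the actual Open MPI structures): the add_proc-time
negotiation only records which BTLs *could* reach a peer, while the
real connection state changes only when the first message goes out.

{{{
#include <stdint.h>

typedef enum {
    PEER_CLOSED,      /* typical state right after MPI_Init, no preconnect */
    PEER_CONNECTING,
    PEER_CONNECTED    /* reached only after a message was actually sent */
} peer_state_t;

typedef struct {
    uint32_t reachable_btls;   /* bitmap from add_proc-style negotiation */
    peer_state_t state;        /* what is really established right now */
} peer_view_t;

/* A "print the map" pass that runs before any traffic can only report
   reachable_btls; it cannot tell which of those BTLs will carry data. */
}}}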
>
> In one of your examples with openib and tcp, where only openib was
> shown (because of the exclusivity), the PML will automatically reroute
> the traffic over TCP if the first send over openib fails. openib being
> available does not necessarily mean it will be used.
>
> Thanks,
> george.
>
> jjhursey_at_[hidden] wrote:
>> Author: jjhursey
>> Date: 2008-02-07 10:28:28 EST (Thu, 07 Feb 2008)
>> New Revision: 17398
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/17398
>>
>> Log:
>> A quick try at ticket refs #1207.
>>
>> Here we are processing the BML structure attached to ompi_proc_t
>> well after add_procs has been called.
>>
>> Currently only Rank 0 displays data, and makes no attempt to gather
>> information from other ranks. I still need to add the MCA parameters
>> to enable/disable this feature along with a bunch of other stuff.
>>
>> Examples from this commit on 2 nodes of IU's Odin Machine:
>>
>> {{{
>> shell$ mpirun -np 6 -mca btl tcp,sm,self hello
>> [odin001.cs.indiana.edu:28548] Connected to Process 0 on odin001 via: self
>> [odin001.cs.indiana.edu:28548] Connected to Process 1 on odin001 via: sm
>> [odin001.cs.indiana.edu:28548] Connected to Process 2 on odin001 via: sm
>> [odin001.cs.indiana.edu:28548] Connected to Process 3 on odin001 via: sm
>> [odin001.cs.indiana.edu:28548] Connected to Process 4 on odin002 via: tcp
>> [odin001.cs.indiana.edu:28548] Connected to Process 4 on odin002 via: tcp
>> [odin001.cs.indiana.edu:28548] Connected to Process 5 on odin002 via: tcp
>> [odin001.cs.indiana.edu:28548] Connected to Process 5 on odin002 via: tcp
>> [odin001.cs.indiana.edu:28548] Unique connection types: self,sm,tcp
>> (Hello World) I am 0 of 6 running on odin001.cs.indiana.edu (PID 28548)
>> (Hello World) I am 1 of 6 running on odin001.cs.indiana.edu (PID 28549)
>> (Hello World) I am 2 of 6 running on odin001.cs.indiana.edu (PID 28550)
>> (Hello World) I am 3 of 6 running on odin001.cs.indiana.edu (PID 28551)
>> (Hello World) I am 4 of 6 running on odin002.cs.indiana.edu (PID 7809)
>> (Hello World) I am 5 of 6 running on odin002.cs.indiana.edu (PID 7810)
>> }}}
>>
>> In this example you can see that we have 2 tcp connections to
>> odin002 for each process, since Odin has 2 tcp interfaces to each
>> machine.
>>
>> {{{
>> shell$ mpirun -np 6 -mca btl tcp,sm,openib,self hello
>> [odin001.cs.indiana.edu:28566] Connected to Process 0 on odin001 via: self
>> [odin001.cs.indiana.edu:28566] Connected to Process 1 on odin001 via: sm
>> [odin001.cs.indiana.edu:28566] Connected to Process 2 on odin001 via: sm
>> [odin001.cs.indiana.edu:28566] Connected to Process 3 on odin001 via: sm
>> [odin001.cs.indiana.edu:28566] Connected to Process 4 on odin002 via: openib
>> [odin001.cs.indiana.edu:28566] Connected to Process 5 on odin002 via: openib
>> [odin001.cs.indiana.edu:28566] Unique connection types: self,sm,openib
>> (Hello World) I am 0 of 6 running on odin001.cs.indiana.edu (PID 28566)
>> (Hello World) I am 1 of 6 running on odin001.cs.indiana.edu (PID 28567)
>> (Hello World) I am 2 of 6 running on odin001.cs.indiana.edu (PID 28568)
>> (Hello World) I am 3 of 6 running on odin001.cs.indiana.edu (PID 28569)
>> (Hello World) I am 4 of 6 running on odin002.cs.indiana.edu (PID 7820)
>> (Hello World) I am 5 of 6 running on odin002.cs.indiana.edu (PID 7821)
>> }}}
>>
>> The above also occurs when passing no MCA arguments. But here you
>> can see that tcp is not being used due to exclusivity rules in Open
>> MPI. So even though we specified {{{-mca btl tcp,sm,openib,self}}},
>> only {{{self,sm,openib}}} are being used.
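
For readers unfamiliar with the exclusivity mechanism, here is a
deliberately simplified model of the idea (not the actual BML
selection code, and the exclusivity numbers are made up): each BTL
advertises an exclusivity level, and when a higher-exclusivity BTL
such as openib can reach a peer, lower-exclusivity BTLs such as tcp
are dropped from that peer's send list.

{{{
#include <stdio.h>

typedef struct {
    const char *name;
    int exclusivity;   /* higher value wins when several BTLs reach a peer */
    int reachable;     /* can this BTL reach the peer at all? */
} btl_candidate_t;

/* Keep only the reachable BTLs at the highest exclusivity level. */
static void pick_btls_for_peer(const btl_candidate_t *btls, int n)
{
    int i, best = -1;

    for (i = 0; i < n; ++i) {
        if (btls[i].reachable && btls[i].exclusivity > best) {
            best = btls[i].exclusivity;
        }
    }
    for (i = 0; i < n; ++i) {
        if (btls[i].reachable && btls[i].exclusivity == best) {
            printf("using %s\n", btls[i].name);
        }
    }
}

int main(void)
{
    btl_candidate_t btls[] = {
        { "openib", 1024, 1 },
        { "tcp",     100, 1 },
        { "sm",     4096, 0 },   /* not reachable for a remote peer */
    };
    pick_btls_for_peer(btls, 3);  /* prints: using openib */
    return 0;
}
}}}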
>>
>>
>>
>> Text files modified:
>> tmp-public/connect-map/ompi/runtime/ompi_mpi_init.c     |  6 +++
>> tmp-public/connect-map/ompi/runtime/ompi_mpi_params.c   | 65 ++++++++++++++++++++++++++++++++++++++++
>> tmp-public/connect-map/ompi/runtime/params.h            |  8 ++++
>> 3 files changed, 79 insertions(+), 0 deletions(-)
>>
>> Modified: tmp-public/connect-map/ompi/runtime/ompi_mpi_init.c
>> ==============================================================================
>> --- tmp-public/connect-map/ompi/runtime/ompi_mpi_init.c (original)
>> +++ tmp-public/connect-map/ompi/runtime/ompi_mpi_init.c 2008-02-07 10:28:28 EST (Thu, 07 Feb 2008)
>> @@ -749,6 +749,12 @@
>>          opal_progress_set_event_poll_rate(value);
>>      }
>>
>> +
>> +    /*
>> +     * Display connectivity information
>> +     */
>> +    ompi_show_connectivity_info();
>> +
>>      /* At this point, we are fully configured and in MPI mode. Any
>>         communication calls here will work exactly like they would in
>>         the user's code. Setup the connections between procs and warm
>>
>> Modified: tmp-public/connect-map/ompi/runtime/ompi_mpi_params.c
>> ==============================================================================
>> --- tmp-public/connect-map/ompi/runtime/ompi_mpi_params.c (original)
>> +++ tmp-public/connect-map/ompi/runtime/ompi_mpi_params.c 2008-02-07 10:28:28 EST (Thu, 07 Feb 2008)
>> @@ -33,6 +33,10 @@
>> #include "opal/util/show_help.h"
>> #include "opal/mca/base/mca_base_param.h"
>>
>> +#include "orte/util/proc_info.h"
>> +#include "ompi/proc/proc.h"
>> +#include "ompi/mca/bml/bml.h"
>> +
>> /*
>> * Global variables
>> *
>> @@ -335,3 +339,64 @@
>>
>> return OMPI_SUCCESS;
>> }
>> +
>> +
>> +int ompi_show_connectivity_info(void)
>> +{
>> +    int exit_status = OMPI_SUCCESS;
>> +    ompi_proc_t** procs = NULL;
>> +    size_t nprocs, p, nbtls, b;
>> +    char *unique_set = NULL;
>> +    size_t new_size;
>> +
>> +    /* JJH: Add MCA parameter here */
>> +
>> +    if( 0 == ORTE_PROC_MY_NAME->vpid ) {
>> +        /* Get all ompi_proc_t's */
>> +        if (NULL == (procs = ompi_proc_world(&nprocs))) {
>> +            opal_output(0, "ompi_proc_world() failed\n");
>> +            goto cleanup;
>> +        }
>> +
>> +        for(p = 0; p < nprocs; ++p ) {
>> +            if( NULL != procs[p]->proc_bml ) {
>> +                mca_bml_base_btl_t *bml_btl = NULL;
>> +                nbtls = mca_bml_base_btl_array_get_size(&(procs[p]->proc_bml->btl_send));
>> +
>> +                /* For each btl */
>> +                for(b = 0; b < nbtls; ++b) {
>> +                    char *component_name = NULL;
>> +                    bml_btl = mca_bml_base_btl_array_get_index(&(procs[p]->proc_bml->btl_send), b);
>> +                    component_name = strdup(bml_btl->btl->btl_component->btl_version.mca_component_name);
>> +
>> +                    opal_output(0, "Connected to Process %ld on %s via: %s\n",
>> +                                (long)procs[p]->proc_name.vpid,
>> +                                procs[p]->proc_hostname,
>> +                                component_name);
>> +
>> +                    if( NULL == unique_set ) {
>> +                        unique_set = strdup(component_name);
>> +                    }
>> +                    else {
>> +                        /* Add this component if it is not already included */
>> +                        if( NULL == strstr(unique_set, component_name) ) {
>> +                            new_size = sizeof(char)*(strlen(unique_set) + strlen(component_name) + 1);
>> +                            unique_set = (char *)realloc(unique_set, new_size);
>> +
>> +                            strncat(unique_set, ",", strlen(","));
>> +                            strncat(unique_set,
>> +                                    component_name,
>> +                                    strlen(component_name));
>> +                        }
>> +                    }
>> +                }
>> +            }
>> +        }
>> +        opal_output(0, "Unique connection types: %s\n",
>> +                    unique_set);
>> +        free(unique_set);
>> +    }
>> +
>> + cleanup:
>> +    return exit_status;
>> +}
>>
>> Modified: tmp-public/connect-map/ompi/runtime/params.h
>> ==============================================================================
>> --- tmp-public/connect-map/ompi/runtime/params.h (original)
>> +++ tmp-public/connect-map/ompi/runtime/params.h 2008-02-07 10:28:28 EST (Thu, 07 Feb 2008)
>> @@ -170,6 +170,14 @@
>> */
>> int ompi_show_all_mca_params(int32_t, int, char *);
>>
>> +/**
>> + * Display Connectivity information
>> + *
>> + * @returns OMPI_SUCCESS
>> + *
>> + */
>> +int ompi_show_connectivity_info(void);
>> +
>> END_C_DECLS
>>
>> #endif /* OMPI_RUNTIME_PARAMS_H */
>> _______________________________________________
>> svn-full mailing list
>> svn-full_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
Cisco Systems