Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI process dies with a route error when using dynamic process calls to connect more than 2 clients to a server with InfiniBand
From: Philippe (philmpi_at_[hidden])
Date: 2010-08-19 14:12:58


Ralph,

somewhere in ./orte/mca/oob/tcp/oob_tcp.c, there is this comment:

                orte_node_rank_t nrank;
                /* do I know my node_local_rank yet? */
                if (ORTE_NODE_RANK_INVALID != (nrank = orte_ess.get_node_rank(ORTE_PROC_MY_NAME)) &&
                    (nrank+1) < opal_argv_count(mca_oob_tcp_component.tcp4_static_ports)) {
                    /* any daemon takes the first entry, so we start with the second */

which seems consistent with process #0 listening on 10001. the question
would be why process #1 attempts to connect to port 10000 then? or
maybe it's totally unrelated :-)
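
To make the indexing in that fragment concrete, here is a minimal sketch. It assumes the static range is expanded into a contiguous list of ports and that entry 0 is reserved for the daemon, as the comment says. pick_static_port is a hypothetical helper written for illustration, not an Open MPI function, and it only mirrors the (nrank+1) guard quoted above.

```c
#include <assert.h>

/* Hypothetical helper, not part of Open MPI.
 * first_port : lowest port of the static range (assumed contiguous)
 * num_ports  : number of ports in the range
 * node_rank  : node-local rank of the process, or a negative value if
 *              it is not known yet
 * Returns the port the process would pick when entry 0 is left to the
 * daemon, or -1 when no entry is available for this rank. */
int pick_static_port(int first_port, int num_ports, int node_rank)
{
    assert(num_ports >= 0);
    if (node_rank < 0) {
        return -1;                      /* node rank not known yet */
    }
    if (node_rank + 1 >= num_ports) {
        return -1;                      /* range too small for this rank */
    }
    return first_port + node_rank + 1;  /* skip the daemon's entry */
}
```

With a range of 10000-12000 (2001 ports), this gives 10001 for node rank 0 and 10002 for node rank 1, which matches rank #0 listening on 10001. It does not explain why a peer would try 10000.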

btw, if I trick process #1 into opening the connection to 10001 by shifting
the range, I now get this error and the process terminates immediately:

[c0301b10e1:03919] [[0,9999],1]-[[0,0],0]
mca_oob_tcp_peer_recv_connect_ack: received unexpected process
identifier [[0,9999],0]

good luck with the surgery and wishing you a prompt recovery!

p.

On Thu, Aug 19, 2010 at 2:02 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> Something doesn't look right - here is what the algo attempts to do:
> given a port range of 10000-12000, the lowest rank'd process on the node
> should open port 10000. The next lowest rank on the node will open 10001,
> etc.
> So it looks to me like there is some confusion in the local rank algo. I'll
> have to look at the generic module - must be a bug in it somewhere.
> This might take a couple of days as I have surgery tomorrow morning, so
> please forgive the delay.
>
> On Thu, Aug 19, 2010 at 11:13 AM, Philippe <philmpi_at_[hidden]> wrote:
>>
>> Ralph,
>>
>> I'm able to use the generic module when the processes are on different
>> machines.
>>
>> what would the values of the envars be when two processes are on the same
>> machine (hopefully talking over SHM)?
>>
>> i've played with combinations of nodelist and ppn but no luck. I get errors
>> like:
>>
>>
>>
>> [c0301b10e1:03172] [[0,9999],1] -> [[0,0],0] (node: c0301b10e1)
>> oob-tcp: Number of attempts to create TCP connection has been
>> exceeded.  Can not communicate with peer
>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>> grpcomm_hier_module.c at line 303
>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>> base/grpcomm_base_modex.c at line 470
>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>> grpcomm_hier_module.c at line 484
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.  This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>>  orte_grpcomm_modex failed
>>  --> Returned "Unreachable" (-12) instead of "Success" (0)
>> --------------------------------------------------------------------------
>> *** The MPI_Init() function was called before MPI_INIT was invoked.
>> *** This is disallowed by the MPI standard.
>> *** Your MPI job will now abort.
>> [c0301b10e1:3172] Abort before MPI_INIT completed successfully; not
>> able to guarantee that all other processes were killed!
>>
>>
>> maybe a related question is how to assign the TCP port range and how
>> is it used? when the processes are on different machines, I use the
>> same range and that's ok as long as the range is free. but when the
>> processes are on the same node, what value should the range be for
>> each process? My range is 10000-12000 (for both processes) and I see
>> that the process with rank #0 listens on port 10001 while the process with
>> rank #1 tries to establish a connection to port 10000.
>>
>> Thanks so much!
>> p. still here... still trying... ;-)
>>
>> On Tue, Jul 27, 2010 at 12:58 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>> > Use what hostname returns - don't worry about IP addresses as we'll
>> > discover them.
>> >
>> > On Jul 26, 2010, at 10:45 PM, Philippe wrote:
>> >
>> >> Thanks a lot!
>> >>
>> >> now, for the ev "OMPI_MCA_orte_nodes", what do I put exactly? our
>> >> nodes have a short/long name (it's rhel 5.x, so the command hostname
>> >> returns the long name) and at least 2 IP addresses.
>> >>
>> >> p.
>> >>
>> >> On Tue, Jul 27, 2010 at 12:06 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>> >>> Okay, fixed in r23499. Thanks again...
>> >>>
>> >>>
>> >>> On Jul 26, 2010, at 9:47 PM, Ralph Castain wrote:
>> >>>
>> >>>> Doh - yes it should! I'll fix it right now.
>> >>>>
>> >>>> Thanks!
>> >>>>
>> >>>> On Jul 26, 2010, at 9:28 PM, Philippe wrote:
>> >>>>
>> >>>>> Ralph,
>> >>>>>
>> >>>>> i was able to test the generic module and it seems to be working.
>> >>>>>
>> >>>>> one question tho, the function orte_ess_generic_component_query in
>> >>>>> "orte/mca/ess/generic/ess_generic_component.c" calls getenv with the
>> >>>>> argument "OMPI_MCA_env", which seems to cause the module to fail to
>> >>>>> load. shouldn't it be "OMPI_MCA_ess"?
>> >>>>>
>> >>>>> .....
>> >>>>>
>> >>>>>   /* only pick us if directed to do so */
>> >>>>>   if (NULL != (pick = getenv("OMPI_MCA_env")) &&
>> >>>>>                0 == strcmp(pick, "generic")) {
>> >>>>>       *priority = 1000;
>> >>>>>       *module = (mca_base_module_t *)&orte_ess_generic_module;
>> >>>>>
>> >>>>> ...
>> >>>>>
>> >>>>> p.
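
For reference, the selection test with the variable name proposed above can be sketched as follows. This is a minimal stand-alone version, not the component's code: generic_selected and generic_selected_from_env are hypothetical names, the module pointer is left out, and the priority of 0 on the non-selected path is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the selection test in the component query.
 * Returns 1 and sets *priority to 1000 when the value is "generic";
 * otherwise returns 0 and sets *priority to 0 (the 0 is an assumption). */
int generic_selected(const char *pick, int *priority)
{
    assert(priority != NULL);
    if (NULL != pick && 0 == strcmp(pick, "generic")) {
        *priority = 1000;
        return 1;
    }
    *priority = 0;
    return 0;
}

/* Same test, reading the variable name proposed in the message above. */
int generic_selected_from_env(int *priority)
{
    return generic_selected(getenv("OMPI_MCA_ess"), priority);
}
```

The point is only that the name passed to getenv has to match the variable the user sets (OMPI_MCA_ess=generic); with any other name, getenv returns NULL and the component is never picked.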
>> >>>>>
>> >>>>> On Thu, Jul 22, 2010 at 5:53 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> >>>>>> Dev trunk looks okay right now - I think you'll be fine using it.
>> >>>>>> My new component -might- work with 1.5, but probably not with 1.4. I haven't
>> >>>>>> checked either of them.
>> >>>>>>
>> >>>>>> Anything at r23478 or above will have the new module. Let me know
>> >>>>>> how it works for you. I haven't tested it myself, but am pretty sure it
>> >>>>>> should work.
>> >>>>>>
>> >>>>>>
>> >>>>>> On Jul 22, 2010, at 3:22 PM, Philippe wrote:
>> >>>>>>
>> >>>>>>> Ralph,
>> >>>>>>>
>> >>>>>>> Thank you so much!!
>> >>>>>>>
>> >>>>>>> I'll give it a try and let you know.
>> >>>>>>>
>> >>>>>>> I know it's a tough question, but how stable is the dev trunk? Can I
>> >>>>>>> just grab the latest and run, or am I better off taking your changes
>> >>>>>>> and copy them back in a stable release? (if so, which one? 1.4? 1.5?)
>> >>>>>>>
>> >>>>>>> p.
>> >>>>>>>
>> >>>>>>> On Thu, Jul 22, 2010 at 3:50 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> >>>>>>>> It was easier for me to just construct this module than to
>> >>>>>>>> explain how to do so :-)
>> >>>>>>>>
>> >>>>>>>> I will commit it this evening (couple of hours from now) as that
>> >>>>>>>> is our standard practice. You'll need to use the developer's trunk, though,
>> >>>>>>>> to use it.
>> >>>>>>>>
>> >>>>>>>> Here are the envars you'll need to provide:
>> >>>>>>>>
>> >>>>>>>> Each process needs to get the same following values:
>> >>>>>>>>
>> >>>>>>>> * OMPI_MCA_ess=generic
>> >>>>>>>> * OMPI_MCA_orte_num_procs=<number of MPI procs>
>> >>>>>>>> * OMPI_MCA_orte_nodes=<a comma-separated list of nodenames where
>> >>>>>>>> MPI procs reside>
>> >>>>>>>> * OMPI_MCA_orte_ppn=<number of procs/node>
>> >>>>>>>>
>> >>>>>>>> Note that I have assumed this last value is a constant for
>> >>>>>>>> simplicity. If that isn't the case, let me know - you could instead provide
>> >>>>>>>> it as a comma-separated list of values with an entry for each node.
>> >>>>>>>>
>> >>>>>>>> In addition, you need to provide the following value that will be
>> >>>>>>>> unique to each process:
>> >>>>>>>>
>> >>>>>>>> * OMPI_MCA_orte_rank=<MPI rank>
>> >>>>>>>>
>> >>>>>>>> Finally, you have to provide a range of static TCP ports for use
>> >>>>>>>> by the processes. Pick any range that you know will be available across all
>> >>>>>>>> the nodes. You then need to ensure that each process sees the following
>> >>>>>>>> envar:
>> >>>>>>>>
>> >>>>>>>> * OMPI_MCA_oob_tcp_static_ports=6000-6010  <== obviously, replace
>> >>>>>>>> this with your range
>> >>>>>>>>
>> >>>>>>>> You will need a port range that is at least equal to the ppn for
>> >>>>>>>> the job (each proc on a node will take one of the provided ports).
>> >>>>>>>>
>> >>>>>>>> That should do it. I compute everything else I need from those
>> >>>>>>>> values.
>> >>>>>>>>
>> >>>>>>>> Does that work for you?
>> >>>>>>>> Ralph
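
The sizing rule in this message (the port range must hold at least ppn ports) can be checked before launching anything. The sketch below is only an illustration under stated assumptions: static_port_count and range_covers_ppn are hypothetical helpers, and the only range syntax handled is the single "lo-hi" form shown in the example above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse a range of the form "lo-hi" and return the
 * number of ports it contains, or -1 if the string is malformed. */
int static_port_count(const char *range)
{
    int lo, hi;
    char extra;

    assert(range != NULL);
    /* exactly two integers separated by '-', with nothing after them */
    if (sscanf(range, "%d-%d%c", &lo, &hi, &extra) != 2) {
        return -1;
    }
    if (lo < 1 || hi > 65535 || hi < lo) {
        return -1;
    }
    return hi - lo + 1;
}

/* Returns 1 when the range holds at least ppn ports, else 0. */
int range_covers_ppn(const char *range, int ppn)
{
    int n = static_port_count(range);
    return n >= 0 && ppn > 0 && n >= ppn;
}
```

For example, "6000-6010" holds 11 ports, so it covers a ppn of 8 but "6000-6001" does not cover a ppn of 4. The oob_tcp.c fragment quoted at the top of this thread suggests the first entry may go to a daemon, in which case one extra port would be needed; that is not checked here.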
>> >>>>>>>>
>> >>>>>>>>
>> >>
>> >> _______________________________________________
>> >> users mailing list
>> >> users_at_[hidden]
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users