
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] MPI process dies with a route error when using dynamic process calls to connect more than 2 clients to a server with InfiniBand
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-08-19 14:02:55


Something doesn't look right - here is what the algo attempts to do:

given a port range of 10000-12000, the lowest-ranked process on the node
should open port 10000, the next-lowest rank on the node will open 10001,
and so on.

So it looks to me like there is some confusion in the local rank algo. I'll
have to look at the generic module - must be a bug in it somewhere.

This might take a couple of days as I have surgery tomorrow morning, so
please forgive the delay.

On Thu, Aug 19, 2010 at 11:13 AM, Philippe <philmpi_at_[hidden]> wrote:

> Ralph,
>
> I'm able to use the generic module when the processes are on different
> machines.
>
> What would be the values of the envars when two processes are on the same
> machine (hopefully talking over shared memory)?
>
> I've played with combinations of nodelist and ppn but no luck. I get errors
> like:
>
>
>
> [c0301b10e1:03172] [[0,9999],1] -> [[0,0],0] (node: c0301b10e1)
> oob-tcp: Number of attempts to create TCP connection has been
> exceeded. Can not communicate with peer
> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
> grpcomm_hier_module.c at line 303
> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
> base/grpcomm_base_modex.c at line 470
> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
> grpcomm_hier_module.c at line 484
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> orte_grpcomm_modex failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> [c0301b10e1:3172] Abort before MPI_INIT completed successfully; not
> able to guarantee that all other processes were killed!
>
>
> Maybe a related question: how should the TCP port range be assigned, and
> how is it used? When the processes are on different machines, I use the
> same range and that's OK as long as the range is free. But when the
> processes are on the same node, what value should the range be for
> each process? My range is 10000-12000 (for both processes), and I see
> that the process with rank #0 listens on port 10001 while the process
> with rank #1 tries to connect to port 10000.
>
> Thanks so much!
> p. still here... still trying... ;-)
>
> On Tue, Jul 27, 2010 at 12:58 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> > Use what hostname returns - don't worry about IP addresses as we'll
> discover them.
> >
> > On Jul 26, 2010, at 10:45 PM, Philippe wrote:
> >
> >> Thanks a lot!
> >>
> >> Now, for the envar "OMPI_MCA_orte_nodes", what do I put exactly? Our
> >> nodes have a short and a long name (it's RHEL 5.x, so the hostname
> >> command returns the long name) and at least 2 IP addresses.
> >>
> >> p.
> >>
> >> On Tue, Jul 27, 2010 at 12:06 AM, Ralph Castain <rhc_at_[hidden]>
> wrote:
> >>> Okay, fixed in r23499. Thanks again...
> >>>
> >>>
> >>> On Jul 26, 2010, at 9:47 PM, Ralph Castain wrote:
> >>>
> >>>> Doh - yes it should! I'll fix it right now.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> On Jul 26, 2010, at 9:28 PM, Philippe wrote:
> >>>>
> >>>>> Ralph,
> >>>>>
> >>>>> I was able to test the generic module and it seems to be working.
> >>>>>
> >>>>> One question, though: the function orte_ess_generic_component_query in
> >>>>> "orte/mca/ess/generic/ess_generic_component.c" calls getenv with the
> >>>>> argument "OMPI_MCA_env", which seems to cause the module to fail to
> >>>>> load. Shouldn't it be "OMPI_MCA_ess"?
> >>>>>
> >>>>> .....
> >>>>>
> >>>>> /* only pick us if directed to do so */
> >>>>> if (NULL != (pick = getenv("OMPI_MCA_env")) &&
> >>>>>     0 == strcmp(pick, "generic")) {
> >>>>>     *priority = 1000;
> >>>>>     *module = (mca_base_module_t *)&orte_ess_generic_module;
> >>>>>
> >>>>> ...
> >>>>>
> >>>>> p.
> >>>>>
> >>>>> On Thu, Jul 22, 2010 at 5:53 PM, Ralph Castain <rhc_at_[hidden]>
> wrote:
> >>>>>> Dev trunk looks okay right now - I think you'll be fine using it. My
> new component -might- work with 1.5, but probably not with 1.4. I haven't
> checked either of them.
> >>>>>>
> >>>>>> Anything at r23478 or above will have the new module. Let me know
> how it works for you. I haven't tested it myself, but am pretty sure it
> should work.
> >>>>>>
> >>>>>>
> >>>>>> On Jul 22, 2010, at 3:22 PM, Philippe wrote:
> >>>>>>
> >>>>>>> Ralph,
> >>>>>>>
> >>>>>>> Thank you so much!!
> >>>>>>>
> >>>>>>> I'll give it a try and let you know.
> >>>>>>>
> >>>>>>> I know it's a tough question, but how stable is the dev trunk? Can
> I
> >>>>>>> just grab the latest and run, or am I better off taking your
> changes
> >>>>>>> and copy them back in a stable release? (if so, which one? 1.4?
> 1.5?)
> >>>>>>>
> >>>>>>> p.
> >>>>>>>
> >>>>>>> On Thu, Jul 22, 2010 at 3:50 PM, Ralph Castain <rhc_at_[hidden]>
> wrote:
> >>>>>>>> It was easier for me to just construct this module than to explain
> how to do so :-)
> >>>>>>>>
> >>>>>>>> I will commit it this evening (couple of hours from now) as that
> is our standard practice. You'll need to use the developer's trunk, though,
> to use it.
> >>>>>>>>
> >>>>>>>> Here are the envars you'll need to provide:
> >>>>>>>>
> >>>>>>>> Each process needs to get the same following values:
> >>>>>>>>
> >>>>>>>> * OMPI_MCA_ess=generic
> >>>>>>>> * OMPI_MCA_orte_num_procs=<number of MPI procs>
> >>>>>>>> * OMPI_MCA_orte_nodes=<a comma-separated list of nodenames where
> MPI procs reside>
> >>>>>>>> * OMPI_MCA_orte_ppn=<number of procs/node>
> >>>>>>>>
> >>>>>>>> Note that I have assumed this last value is a constant for
> simplicity. If that isn't the case, let me know - you could instead provide
> it as a comma-separated list of values with an entry for each node.
> >>>>>>>>
> >>>>>>>> In addition, you need to provide the following value that will be
> unique to each process:
> >>>>>>>>
> >>>>>>>> * OMPI_MCA_orte_rank=<MPI rank>
> >>>>>>>>
> >>>>>>>> Finally, you have to provide a range of static TCP ports for use
> by the processes. Pick any range that you know will be available across all
> the nodes. You then need to ensure that each process sees the following
> envar:
> >>>>>>>>
> >>>>>>>> * OMPI_MCA_oob_tcp_static_ports=6000-6010 <== obviously, replace
> this with your range
> >>>>>>>>
> >>>>>>>> You will need a port range that is at least equal to the ppn for
> the job (each proc on a node will take one of the provided ports).
> >>>>>>>>
> >>>>>>>> That should do it. I compute everything else I need from those
> values.
> >>>>>>>>
> >>>>>>>> Does that work for you?
> >>>>>>>> Ralph
> >>>>>>>>
> >>>>>>>>
> >>
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users