Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r21548
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-07-01 16:06:33


Hmmm...I'll take a look. It seems to be working for me under Torque and
SLURM, though I cannot vouch for the tree launch. The problem with letting
the index start at 0 is that it breaks other things, so I'll have to see about
fixing the routing schemes, or find some compromise.

Thanks for the heads up.
Ralph

On Wed, Jul 1, 2009 at 1:49 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> Ralph,
>
> This commit breaks several components in OMPI, mainly the routing schemes
> and the tree launch. The problem is in the second part of the commit, which
> reduces the number of declared daemons by changing the lower bound of the
> for loop from 0 to 1. As a result the number of daemons decreases by one
> (I guess in order to exclude the HNP), which is not something that the
> routing implementations tolerate.
>
> Setting the loop boundary back to 0 seems to fix all problems. Please
> reconsider your patch.
>
> george.
>
> On Fri, 26 Jun 2009, rhc_at_[hidden] wrote:
>
>> Author: rhc
>> Date: 2009-06-26 18:07:25 EDT (Fri, 26 Jun 2009)
>> New Revision: 21548
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/21548
>>
>> Log:
>> Cleanup some indexing bugs so that shared memory can function
>>
>> Text files modified:
>> trunk/orte/util/nidmap.c | 12 +++++++-----
>> 1 files changed, 7 insertions(+), 5 deletions(-)
>>
>> Modified: trunk/orte/util/nidmap.c
>>
>> ==============================================================================
>> --- trunk/orte/util/nidmap.c (original)
>> +++ trunk/orte/util/nidmap.c 2009-06-26 18:07:25 EDT (Fri, 26 Jun 2009)
>> @@ -341,10 +341,10 @@
>>
>> /* pack every nodename individually */
>> for (i=1; i < orte_node_pool->size; i++) {
>> + if (NULL == (node = (orte_node_t*)opal_pointer_array_get_item(orte_node_pool, i))) {
>> + continue;
>> + }
>> if (!orte_keep_fqdn_hostnames) {
>> - if (NULL == (node = (orte_node_t*)opal_pointer_array_get_item(orte_node_pool, i))) {
>> - continue;
>> - }
>> nodename = strdup(node->name);
>> if (NULL != (ptr = strchr(nodename, '.'))) {
>> *ptr = '\0';
>> @@ -553,6 +553,8 @@
>> ORTE_ERROR_LOG(rc);
>> return rc;
>> }
>> + /* set the daemon to 0 */
>> + node->daemon = 0;
>>
>> /* loop over nodes and unpack the raw nodename */
>> for (i=1; i < num_nodes; i++) {
>> @@ -570,7 +572,7 @@
>> }
>> }
>>
>> - /* unpack the daemon names */
>> + /* unpack the daemon vpids */
>> vpids = (orte_vpid_t*)malloc(num_nodes * sizeof(orte_vpid_t));
>> n=num_nodes;
>> if (ORTE_SUCCESS != (rc = opal_dss.unpack(&buf, vpids, &n, ORTE_VPID))) {
>> @@ -581,7 +583,7 @@
>> * daemons in the system
>> */
>> num_daemons = 0;
>> - for (i=0; i < num_nodes; i++) {
>> + for (i=1; i < num_nodes; i++) {
>> if (NULL != (ndptr = (orte_nid_t*)opal_pointer_array_get_item(&orte_nidmap, i))) {
>> ndptr->daemon = vpids[i];
>> if (ORTE_VPID_INVALID != vpids[i]) {
>> _______________________________________________
>> svn mailing list
>> svn_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/svn
>>
>>
> "We must accept finite disappointment, but we must never lose infinite
> hope."
> Martin Luther King
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
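
The off-by-one George describes can be illustrated with a small standalone sketch. It is not the ORTE code. It assumes that the branch cut off at the end of the quoted diff increments num_daemons for every valid vpid, and it uses a plain int array with a hypothetical VPID_INVALID constant in place of orte_vpid_t and ORTE_VPID_INVALID. The helper count_daemons and its start parameter are likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ORTE_VPID_INVALID; not the real ORTE value. */
#define VPID_INVALID (-1)

/*
 * Count the entries of vpids[start .. num_nodes-1] that hold a valid vpid.
 * start == 0 mirrors the loop bound before r21548; start == 1 mirrors the
 * bound after it, which skips index 0 (assumed here to be the HNP's node).
 */
static int count_daemons(const int *vpids, size_t num_nodes, size_t start)
{
    int num_daemons = 0;

    assert(vpids != NULL);
    for (size_t i = start; i < num_nodes; i++) {
        if (VPID_INVALID != vpids[i]) {
            num_daemons++;
        }
    }
    return num_daemons;
}
```

For a node list whose vpids are { 0, 1, 2, VPID_INVALID, 3 }, count_daemons(vpids, 5, 0) returns 4 while count_daemons(vpids, 5, 1) returns 3. Any code that sizes its routing tree from that count would then see one daemon fewer than actually exists, which is consistent with the failures reported above.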