
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Pointers for understanding failure messages on NetBSD
From: Kevin.Buckley_at_[hidden]
Date: 2009-12-02 18:46:24


> I have actually already taken the IPv6 block and simply tried to
> replace any IPv6 stuff with IPv4 "equivalents", eg:

At the risk of showing a lot of ignorance, here's the block I cobbled
together based on the IPv6 block.

I have tried to keep it looking as close to the original IPv6
block as possible.

Note the little printf near the end.

#if defined(__NetBSD__)
/* || defined(__OpenBSD__) || defined(__FreeBSD__) || \
             defined(__386BSD__) || defined(__bsdi__) || defined(__APPLE__) */
/* || defined(__linux__) */

    {
        struct ifaddrs **ifadd_list;
        struct ifaddrs *cur_ifaddrs;
        struct sockaddr_in* sin_addr;

        /*
         * the manpage claims that getifaddrs() allocates the memory,
         * and freeifaddrs() is later used to release the allocated memory.
         * however, without this malloc the call to getifaddrs() segfaults
         */
        ifadd_list = (struct ifaddrs **) malloc(sizeof(struct ifaddrs*));

        /* create the linked list of ifaddrs structs */
        if(getifaddrs(ifadd_list) < 0) {
            opal_output(0, "opal_ifinit: getifaddrs() failed with error=%d\n",
                    errno);
            return OPAL_ERROR;
        }

        for(cur_ifaddrs = *ifadd_list; NULL != cur_ifaddrs;
                cur_ifaddrs = cur_ifaddrs->ifa_next) {

            opal_if_t intf;
            opal_if_t *intf_ptr;
            struct in_addr a4;

            /* skip non- af_inet interface addresses */
            if(AF_INET != cur_ifaddrs->ifa_addr->sa_family) {
#if 0
                printf("skipping non- af_inet interface %s.\n",
                       cur_ifaddrs->ifa_name);
#endif
                continue;
            }

            /* skip interface if it is down (IFF_UP not set) */
            if(0 == (cur_ifaddrs->ifa_flags & IFF_UP)) {
#if 0
                printf("skipping non-up interface %s.\n",
                       cur_ifaddrs->ifa_name);
#endif
                continue;
            }

            /* skip interface if it is a loopback device (IFF_LOOPBACK set) */
            /* or if it is a point-to-point interface */
            /* TODO: do we really skip p2p? */
            if(0 != (cur_ifaddrs->ifa_flags & IFF_LOOPBACK)
                    || 0!= (cur_ifaddrs->ifa_flags & IFF_POINTOPOINT)) {
#if 0
                printf("skipping loopback interface %s.\n",
                       cur_ifaddrs->ifa_name);
#endif
                continue;
            }

            sin_addr = (struct sockaddr_in *) cur_ifaddrs->ifa_addr;

            /* There shouldn't be any IPv6 address starting with fe80: to skip */

            memset(&intf, 0, sizeof(intf));
            OBJ_CONSTRUCT(&intf, opal_list_item_t);
#if 0
            char *addr_name = (char *) malloc(48*sizeof(char));
            /* sin_addr is the sockaddr_in pointer set above; in_addr is a type name */
            inet_ntop(AF_INET, &sin_addr->sin_addr, addr_name, 48*sizeof(char));
            opal_output(0, "inet capable interface %s discovered, address %s.\n",
                    cur_ifaddrs->ifa_name, addr_name);
            free(addr_name);
#endif

            /* fill values into the opal_if_t */
            memcpy(&a4, &(sin_addr->sin_addr), sizeof(struct in_addr));

            strncpy(intf.if_name, cur_ifaddrs->ifa_name, IF_NAMESIZE);
            intf.if_index = opal_list_get_size(&opal_if_list) + 1;
            ((struct sockaddr_in*) &intf.if_addr)->sin_addr = a4;
            ((struct sockaddr_in*) &intf.if_addr)->sin_family = AF_INET;

            /* since every scope != 0 is ignored, we just set the scope to 0 */
            /* There's no scope_id in the non-ipv6 stuff
            ((struct sockaddr_in6*) &intf.if_addr)->sin6_scope_id = 0;
            */

            /*
             * hardcoded netmask, adrian says that's ok
             */
            intf.if_mask = 64;
            intf.if_flags = cur_ifaddrs->ifa_flags;

            /*
             * FIXME: figure out how to gain access to the kernel index
             * (or create our own), getifaddrs() does not contain such
             * data
             */

            intf.if_kernel_index =
                (uint16_t) if_nametoindex(cur_ifaddrs->ifa_name);

            intf_ptr = (opal_if_t*) calloc(1, sizeof(opal_if_t));
            if(NULL == intf_ptr) {
                opal_output(0, "opal_ifinit: unable to allocate %lu bytes\n",
                            sizeof(opal_if_t));
                OBJ_DESTRUCT(&intf);
                return OPAL_ERR_OUT_OF_RESOURCE;
            }
            memcpy(intf_ptr, &intf, sizeof(intf));

            printf("About to append interface %s.\n", cur_ifaddrs->ifa_name);

            opal_list_append(&opal_if_list, (opal_list_item_t*) intf_ptr);
            OBJ_DESTRUCT(&intf);
        } /* of for loop over ifaddrs list */

    }
#endif /* netbsd */
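
For comparison, here is a minimal standalone sketch of the usual way to call
getifaddrs(): pass the address of a plain "struct ifaddrs *" and let the call
allocate the list, rather than malloc'ing a pointer-to-pointer first. It also
guards against a NULL ifa_addr, and derives an IPv4 prefix length from
ifa_netmask instead of the hardcoded 64 (an IPv4 prefix is at most 32 bits;
this assumes if_mask is meant to hold a prefix length). The names
ipv4_prefix_len() and list_ipv4_interfaces() are made up for this
illustration and are not part of Open MPI. It is not offered as a diagnosis
of the segfault reported below.

```c
#define _DEFAULT_SOURCE   /* glibc only: exposes IFF_* under strict -std modes */
#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <ifaddrs.h>

/* Count the leading one-bits of an IPv4 netmask (network byte order). */
int ipv4_prefix_len(struct in_addr mask)
{
    uint32_t m = ntohl(mask.s_addr);
    int len = 0;

    while (m & 0x80000000u) {
        len++;
        m <<= 1;
    }
    return len;
}

/* Walk the getifaddrs() list and print the usable IPv4 interfaces.
 * Returns the number found, or -1 if getifaddrs() fails. */
int list_ipv4_interfaces(void)
{
    struct ifaddrs *list = NULL;   /* getifaddrs() allocates the list itself */
    struct ifaddrs *cur;
    int count = 0;

    if (getifaddrs(&list) < 0) {
        perror("getifaddrs");
        return -1;
    }

    for (cur = list; cur != NULL; cur = cur->ifa_next) {
        char buf[INET_ADDRSTRLEN];
        struct sockaddr_in *sin;
        int prefix = -1;           /* -1 means no netmask was reported */

        /* ifa_addr can be NULL, so check it before reading sa_family */
        if (cur->ifa_addr == NULL || cur->ifa_addr->sa_family != AF_INET)
            continue;
        /* same filters as the block above: up, not loopback, not p2p */
        if (!(cur->ifa_flags & IFF_UP) ||
            (cur->ifa_flags & (IFF_LOOPBACK | IFF_POINTOPOINT)))
            continue;

        sin = (struct sockaddr_in *) cur->ifa_addr;
        if (cur->ifa_netmask != NULL) {
            prefix = ipv4_prefix_len(
                ((struct sockaddr_in *) cur->ifa_netmask)->sin_addr);
        }
        if (inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf)) != NULL) {
            printf("%s: %s/%d\n", cur->ifa_name, buf, prefix);
        }
        count++;
    }

    freeifaddrs(list);             /* release what getifaddrs() allocated */
    return count;
}
```

Calling list_ipv4_interfaces() from a small main() prints one line per usable
interface in name: address/prefix form. What it prints depends entirely on
the host.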

What I get when I try to do an mpirun -n 4 hello_f77 is now:

About to append interface wm0.
[europa:27981] *** Process received signal ***
[europa:27981] Signal: Segmentation fault (11)
[europa:27981] Signal code: Address not mapped (1)
[europa:27981] Failing at address: 0x8
[europa:27981] *** End of error message ***
Segmentation fault

So I am not tripping up on the ioctls anymore!

Maybe that will shed some light for someone else.
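
One general observation on the log: a failing address as small as 0x8 is
typical of reading a struct member through a NULL pointer, since the faulting
address is then just the member's offset. The sketch below illustrates that
with offsetof(). The struct is hypothetical and is not Open MPI's actual
layout, and the log alone does not say which pointer was NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node, for illustration only (not Open MPI's layout). */
struct demo_item {
    struct demo_item *next;   /* offset 0 */
    struct demo_item *prev;   /* offset sizeof(void *) */
};

/* Address a read of item->prev would touch if item were NULL:
 * the NULL base (0) plus the member's offset within the struct. */
uintptr_t fault_address_for_null_prev(void)
{
    return (uintptr_t) offsetof(struct demo_item, prev);
}
```

On a platform with 8-byte pointers this offset is 8, which would match a
fault at 0x8. With 4-byte pointers the same address would correspond to a
member further into the struct.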

-- 
Kevin M. Buckley                                  Room:  CO327
School of Engineering and                         Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand