
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Segfault in odls_fork_local_procs() for some values of npersocket
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-11-08 06:51:38


Looks fine to me - CMR filed. Thanks!

On Nov 8, 2011, at 1:01 AM, nadia.derbey wrote:

> Hi,
>
> In v1.5, when mpirun is called with both the "-bind-to-core" and
> "-npersocket" options, and the npersocket value leads to fewer procs
> than sockets allocated on one node, we get a crash.
>
> Testing environment:
> openmpi v1.5
> 2 nodes with 4 8-core sockets each
> mpirun -n 10 -bind-to-core -npersocket 2
>
> I was expecting to get:
> . ranks 0-1 : node 0 - socket 0
> . ranks 2-3 : node 0 - socket 1
> . ranks 4-5 : node 0 - socket 2
> . ranks 6-7 : node 0 - socket 3
> . ranks 8-9 : node 1 - socket 0
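> (i.e., with 8 slots per node, rank r should land on node r / 8,
> socket (r / 2) mod 4)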
>
> Instead, everything worked fine on node 0, but the launch crashed on
> node 1 (the signal is actually an integer divide-by-zero FPE, not a
> true segfault) with the following stack:
>
> [derbeyn_at_berlin18 ~]$ mpirun --host berlin18,berlin26 -n 10 -bind-to-core -npersocket 2 sleep 900
> [berlin26:21531] *** Process received signal ***
> [berlin26:21531] Signal: Floating point exception (8)
> [berlin26:21531] Signal code: Integer divide-by-zero (1)
> [berlin26:21531] Failing at address: 0x7fed13731d63
> [berlin26:21531] [ 0] /lib64/libpthread.so.0(+0xf490) [0x7fed15327490]
> [berlin26:21531] [ 1] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/openmpi/mca_odls_default.so(+0x2d63) [0x7fed13731d63]
> [berlin26:21531] [ 2] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/libopen-rte.so.3(orte_odls_base_default_launch_local+0xaf3) [0x7fed15e1fe73]
> [berlin26:21531] [ 3] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/openmpi/mca_odls_default.so(+0x1d10) [0x7fed13730d10]
> [berlin26:21531] [ 4] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/libopen-rte.so.3(+0x3804d) [0x7fed15e1004d]
> [berlin26:21531] [ 5] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/libopen-rte.so.3(orte_daemon_cmd_processor+0x4aa) [0x7fed15e1209a]
> [berlin26:21531] [ 6] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/libopen-rte.so.3(+0x74ee8) [0x7fed15e4cee8]
> [berlin26:21531] [ 7] /home_nfs/derbeyn/DISTS/openmpi-v1.5/lib/libopen-rte.so.3(orte_daemon+0x8d8) [0x7fed15e0f268]
> [berlin26:21531] [ 8] /home_nfs/derbeyn/DISTS/openmpi-v1.5/bin/orted() [0x4008c6]
> [berlin26:21531] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x7fed14fa7c9d]
> [berlin26:21531] [10] /home_nfs/derbeyn/DISTS/openmpi-v1.5/bin/orted() [0x400799]
> [berlin26:21531] *** End of error message ***
>
> The reason for this issue is that the npersocket value is taken into
> account during the very first phase of mpirun (rmaps/load_balance) to
> claim the slots on each node:
> npersocket() (in rmaps/load_balance/rmaps_lb.c) claims
> . 8 slots on node 0 (4 sockets * 2 persocket)
> . 2 slots on node 1 (10 total ranks - 8 already claimed)
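>
> In other words, the claiming logic boils down to something like the
> following simplified model (not the actual rmaps code, just the
> arithmetic; names are illustrative):
>
>   /* toy model of the load_balance slot claiming; the numbers
>      match the test case above */
>   #include <stdio.h>
>
>   int main(void)
>   {
>       int num_procs   = 10;  /* mpirun -n 10 */
>       int num_nodes   = 2;
>       int num_sockets = 4;   /* per node */
>       int npersocket  = 2;   /* -npersocket 2 */
>       int remaining   = num_procs;
>
>       for (int node = 0; node < num_nodes && remaining > 0; node++) {
>           int claimed = num_sockets * npersocket;  /* 4 * 2 = 8 */
>           if (claimed > remaining)
>               claimed = remaining;                 /* node 1: 10 - 8 = 2 */
>           remaining -= claimed;
>           printf("node %d: %d slots\n", node, claimed);
>       }
>       return 0;
>   }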
>
> But when we come to odls_default_fork_local_proc() (in
> odls/default/odls_default_module.c), npersocket is actually
> recomputed. Everything works fine on node 0, but on node 1 we have:
> . jobdat->policy has both ORTE_BIND_TO_CORE and ORTE_MAPPING_NPERXXX
> . npersocket is recomputed the following way:
> npersocket = jobdat->num_local_procs/orte_odls_globals.num_sockets
> = 2 / 4 = 0
> . later on, when the starting point is computed:
> logical_cpu = (lrank % npersocket) * jobdat->cpus_per_rank;
> we get the divide-by-zero exception.
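>
> (On x86, an integer division or modulo by zero is delivered as
> SIGFPE, which is why the trace above reports "Floating point
> exception" for a purely integer operation. A standalone snippet
> reproduces it:)
>
>   /* integer modulo by zero raises SIGFPE on x86, exactly the
>      npersocket == 0 situation above */
>   #include <stdio.h>
>
>   int main(void)
>   {
>       volatile int npersocket = 2 / 4;     /* integer division: 0 */
>       volatile int lrank = 0;
>       printf("%d\n", lrank % npersocket);  /* SIGFPE raised here */
>       return 0;
>   }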
>
> In my view, the problem comes from the fact that we recompute
> npersocket on the local nodes instead of storing it in the jobdat
> structure (as is done today for the policy, the cpus_per_rank, the
> stride, ...).
> Recomputing this value leads either to the crash I got, or to plain
> wrong mappings: if 4 slots had been claimed on node 1, the result
> would have been 1 rank per socket (since the nodes have 4 sockets)
> instead of 2 ranks on the first 2 sockets.
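>
> Schematically, the change amounts to something like the toy model
> below: the HNP computes npersocket once and ships it in the job
> data, and the local daemon reads it back instead of recomputing it.
> (Field and function names here are illustrative, not the real ORTE
> API; see the attached patch for the actual change.)
>
>   #include <stdio.h>
>
>   typedef struct {
>       int num_local_procs;  /* 2 on node 1 */
>       int cpus_per_rank;
>       int npersocket;       /* proposed: carried in the jobdat */
>   } jobdat_t;
>
>   static int starting_cpu(const jobdat_t *jobdat, int lrank)
>   {
>       /* old way recomputed this locally:
>          npersocket = num_local_procs / num_sockets = 2 / 4 = 0 */
>       int npersocket = jobdat->npersocket;  /* new way: just read it */
>       return (lrank % npersocket) * jobdat->cpus_per_rank;
>   }
>
>   int main(void)
>   {
>       jobdat_t jobdat = { 2, 1, 2 };  /* npersocket set once by the HNP */
>       for (int lrank = 0; lrank < 2; lrank++)
>           printf("lrank %d -> cpu %d\n", lrank,
>                  starting_cpu(&jobdat, lrank));
>       return 0;
>   }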
>
> The attached patch is a proposed fix implementing this suggestion of
> storing npersocket in the jobdat.
>
> The patch applies to v1.5. I'm waiting for your comments...
>
> Regards,
> Nadia
>
> --
> Nadia Derbey
> <001_dont_recompute_npersocket_on_local_nodes.patch>
_______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel