Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646
From: tmishima_at_[hidden]
Date: 2013-11-13 19:43:41


Yes, node08 has 8 slots, but I am also running only 8 processes.

 #PBS -l nodes=node08:ppn=8

Therefore, I think it should allow this allocation. Is that right?

My question is why script1 works and script2 does not. They are
almost the same.

#PBS -l nodes=node08:ppn=8
export OMP_NUM_THREADS=1
cd $PBS_O_WORKDIR
cp $PBS_NODEFILE pbs_hosts
NPROCS=`wc -l < pbs_hosts`

#SCRIPT1
mpirun -report-bindings -bind-to core Myprog

#SCRIPT2
mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core Myprog
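
As a sanity check on what script2 hands to mpirun, the copied nodefile can be summarized per host. This is only a sketch: it assumes the Torque nodefile lists one line per allocated slot, and the function names slots_per_host and fits_allocation are made up for illustration.

```shell
# Print "<host> <slots>" for each distinct host in a nodefile,
# assuming one line per allocated slot.
slots_per_host() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Succeed only if the requested process count ($2) does not exceed
# the number of non-empty lines (slots) in the nodefile ($1).
fits_allocation() {
    nodefile=$1
    nprocs=$2
    total=$(awk 'NF { n++ } END { print n + 0 }' "$nodefile")
    [ "$nprocs" -le "$total" ]
}
```

Under that assumption, running `slots_per_host pbs_hosts` inside the job above should report 8 slots for node08, and `fits_allocation pbs_hosts "$NPROCS"` should succeed, since NPROCS is derived from the same file.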

tmishima

> I guess here's my confusion. If you are using only one node, and that node
> has 8 allocated slots, then we will not allow you to run more than 8
> processes on that node unless you specifically provide the --oversubscribe
> flag. This is because you are operating in a managed environment (in this
> case, under Torque), and so we treat the allocation as "mandatory" by
> default.
>
> I suspect that is the issue here, in which case the system is behaving as
> it should.
>
> Is the above accurate?
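
The rule described above can be written down as a tiny decision function. This is a sketch of the policy as stated in this message, not Open MPI's actual mapper code; may_map and its arguments are hypothetical names.

```shell
# Hypothetical helper modelling the stated policy for a managed allocation.
#   $1 = allocated slots on the node
#   $2 = number of processes requested
#   $3 = "yes" if the user asked for oversubscription (default "no")
may_map() {
    slots=$1
    nprocs=$2
    oversubscribe=${3:-no}
    if [ "$nprocs" -le "$slots" ]; then
        return 0                      # fits within the allocation
    fi
    [ "$oversubscribe" = "yes" ]      # over the limit: only on request
}
```

With 8 slots and 8 processes, `may_map 8 8` succeeds, so the policy alone would not reject script2. The verbose output quoted further down reports "slots 0" and "slots 1" for node08, which suggests the slot count itself was not what the allocation provided.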
>
>
> On Nov 13, 2013, at 4:11 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>
> > It has nothing to do with LAMA as you aren't using that mapper.
> >
> > How many nodes are in this allocation?
> >
> > On Nov 13, 2013, at 4:06 PM, tmishima_at_[hidden] wrote:
> >
> >>
> >>
> >> Hi Ralph, here is some additional information.
> >>
> >> Here is the main part of the output after adding
> >> "-mca rmaps_base_verbose 50".
> >>
> >> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm
> >> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm creating map
> >> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm only HNP in allocation
> >> [node08.cluster:26952] mca:rmaps: mapping job [56581,1]
> >> [node08.cluster:26952] mca:rmaps: creating new map for job [56581,1]
> >> [node08.cluster:26952] mca:rmaps:ppr: job [56581,1] not using ppr mapper
> >> [node08.cluster:26952] [[56581,0],0] rmaps:seq mapping job [56581,1]
> >> [node08.cluster:26952] mca:rmaps:seq: job [56581,1] not using seq mapper
> >> [node08.cluster:26952] mca:rmaps:resilient: cannot perform initial map of job [56581,1] - no fault groups
> >> [node08.cluster:26952] mca:rmaps:mindist: job [56581,1] not using mindist mapper
> >> [node08.cluster:26952] mca:rmaps:rr: mapping job [56581,1]
> >> [node08.cluster:26952] [[56581,0],0] Starting with 1 nodes in list
> >> [node08.cluster:26952] [[56581,0],0] Filtering thru apps
> >> [node08.cluster:26952] [[56581,0],0] Retained 1 nodes in list
> >> [node08.cluster:26952] [[56581,0],0] Removing node node08 slots 0 inuse 0
> >>
> >> From this result, I guessed it was related to oversubscription.
> >> So I added "-oversubscribe" and reran; it then worked, as shown below:
> >>
> >> [node08.cluster:27019] [[56774,0],0] Starting with 1 nodes in list
> >> [node08.cluster:27019] [[56774,0],0] Filtering thru apps
> >> [node08.cluster:27019] [[56774,0],0] Retained 1 nodes in list
> >> [node08.cluster:27019] AVAILABLE NODES FOR MAPPING:
> >> [node08.cluster:27019] node: node08 daemon: 0
> >> [node08.cluster:27019] [[56774,0],0] Starting bookmark at node node08
> >> [node08.cluster:27019] [[56774,0],0] Starting at node node08
> >> [node08.cluster:27019] mca:rmaps:rr: mapping by slot for job [56774,1] slots 1 num_procs 8
> >> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
> >> [node08.cluster:27019] mca:rmaps:rr:slot node node08 is full - skipping
> >> [node08.cluster:27019] mca:rmaps:rr:slot job [56774,1] is oversubscribed - performing second pass
> >> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
> >> [node08.cluster:27019] mca:rmaps:rr:slot adding up to 8 procs to node node08
> >> [node08.cluster:27019] mca:rmaps:base: computing vpids by slot for job [56774,1]
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 0 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 1 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 2 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 3 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 4 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 5 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 6 to node node08
> >> [node08.cluster:27019] mca:rmaps:base: assigning rank 7 to node node08
> >>
> >> I think something is wrong with the treatment of oversubscription, which
> >> might be related to "#3893: LAMA mapper has problems".
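
To compare the slot counts reported in verbose runs like the two above, a small filter can pull them out of the mapper output. This is a sketch that assumes the line formats shown in this thread and that each log record sits on a single line (mail wrapping would need to be undone first); removed_nodes and mapping_summary are illustrative names.

```shell
# Print "<node> <slots> <inuse>" for each "Removing node" record.
# Reads the named files, or standard input if none are given.
removed_nodes() {
    sed -n 's/.*Removing node \([^ ]*\) slots \([0-9]*\) inuse \([0-9]*\).*/\1 \2 \3/p' "$@"
}

# Print "<slots> <num_procs>" for each round-robin "mapping by slot" record.
mapping_summary() {
    sed -n 's/.*mapping by slot for job [^ ]* slots \([0-9]*\) num_procs \([0-9]*\).*/\1 \2/p' "$@"
}
```

For example, `mpirun -mca rmaps_base_verbose 50 ... 2>&1 | removed_nodes` would list every node the mapper dropped together with the slot count it saw for that node.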
> >>
> >> tmishima
> >>
> >>> Hmmm...looks like we aren't getting your allocation. Can you rerun and
> >>> add -mca ras_base_verbose 50?
> >>>
> >>> On Nov 12, 2013, at 11:30 PM, tmishima_at_[hidden] wrote:
> >>>
> >>>>
> >>>>
> >>>> Hi Ralph,
> >>>>
> >>>> Here is the output of "-mca plm_base_verbose 5".
> >>>>
> >>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [rsh]
> >>>> [node08.cluster:23573] [[INVALID],INVALID] plm:rsh_lookup on agent /usr/bin/rsh path NULL
> >>>> [node08.cluster:23573] mca:base:select:( plm) Query of component [rsh] set priority to 10
> >>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [slurm]
> >>>> [node08.cluster:23573] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
> >>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [tm]
> >>>> [node08.cluster:23573] mca:base:select:( plm) Query of component [tm] set priority to 75
> >>>> [node08.cluster:23573] mca:base:select:( plm) Selected component [tm]
> >>>> [node08.cluster:23573] plm:base:set_hnp_name: initial bias 23573 nodename hash 85176670
> >>>> [node08.cluster:23573] plm:base:set_hnp_name: final jobfam 59480
> >>>> [node08.cluster:23573] [[59480,0],0] plm:base:receive start comm
> >>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_job
> >>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm
> >>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm creating map
> >>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm only HNP in allocation
> >>>>
> >>>> --------------------------------------------------------------------------
> >>>> All nodes which are allocated for this job are already filled.
> >>>> --------------------------------------------------------------------------
> >>>>
> >>>> Here, openmpi's configuration is as follows:
> >>>>
> >>>> ./configure \
> >>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7.4a1-pgi13.10 \
> >>>> --with-tm \
> >>>> --with-verbs \
> >>>> --disable-ipv6 \
> >>>> --disable-vt \
> >>>> --enable-debug \
> >>>> CC=pgcc CFLAGS="-tp k8-64e" \
> >>>> CXX=pgCC CXXFLAGS="-tp k8-64e" \
> >>>> F77=pgfortran FFLAGS="-tp k8-64e" \
> >>>> FC=pgfortran FCFLAGS="-tp k8-64e"
> >>>>
> >>>>> Hi Ralph,
> >>>>>
> >>>>> Okay, I can help you. Please give me some time to report the output.
> >>>>>
> >>>>> Tetsuya Mishima
> >>>>>
> >>>>>> I can try, but I have no way of testing Torque any more - so all I can
> >>>>>> do is a code review. If you can build --enable-debug and add -mca
> >>>>>> plm_base_verbose 5 to your cmd line, I'd appreciate seeing the output.
> >>>>>>
> >>>>>>
> >>>>>> On Nov 12, 2013, at 9:58 PM, tmishima_at_[hidden] wrote:
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Hi Ralph,
> >>>>>>>
> >>>>>>> Thank you for your quick response.
> >>>>>>>
> >>>>>>> I'd like to report one more regression in the Torque support of
> >>>>>>> openmpi-1.7.4a1r29646, which might be related to "#3893: LAMA mapper
> >>>>>>> has problems", which I reported a few days ago.
> >>>>>>>
> >>>>>>> The script below does not work with openmpi-1.7.4a1r29646,
> >>>>>>> although it worked with openmpi-1.7.3 as I told you before.
> >>>>>>>
> >>>>>>> #!/bin/sh
> >>>>>>> #PBS -l nodes=node08:ppn=8
> >>>>>>> export OMP_NUM_THREADS=1
> >>>>>>> cd $PBS_O_WORKDIR
> >>>>>>> cp $PBS_NODEFILE pbs_hosts
> >>>>>>> NPROCS=`wc -l < pbs_hosts`
> >>>>>>> mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core Myprog
> >>>>>>>
> >>>>>>> If I drop "-machinefile pbs_hosts -np ${NPROCS} ", then it works fine.
> >>>>>>> Since this happens without requesting lama, I guess the problem is not
> >>>>>>> in lama itself. Anyway, please look into this issue as well.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Tetsuya Mishima
> >>>>>>>
> >>>>>>>> Done - thanks!
> >>>>>>>>
> >>>>>>>> On Nov 12, 2013, at 7:35 PM, tmishima_at_[hidden] wrote:
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Dear openmpi developers,
> >>>>>>>>>
> >>>>>>>>> I got a segmentation fault in a trial run of openmpi-1.7.4a1r29646
> >>>>>>>>> built with PGI13.10, as shown below:
> >>>>>>>>>
> >>>>>>>>> [mishima_at_manage testbed-openmpi-1.7.3]$ mpirun -np 4 -cpus-per-proc 2 -report-bindings mPre
> >>>>>>>>> [manage.cluster:23082] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
> >>>>>>>>> [manage.cluster:23082] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
> >>>>>>>>> [manage.cluster:23082] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
> >>>>>>>>> [manage.cluster:23082] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
> >>>>>>>>> [manage:23082] *** Process received signal ***
> >>>>>>>>> [manage:23082] Signal: Segmentation fault (11)
> >>>>>>>>> [manage:23082] Signal code: Address not mapped (1)
> >>>>>>>>> [manage:23082] Failing at address: 0x34
> >>>>>>>>> [manage:23082] *** End of error message ***
> >>>>>>>>> Segmentation fault (core dumped)
> >>>>>>>>>
> >>>>>>>>> [mishima_at_manage testbed-openmpi-1.7.3]$ gdb mpirun core.23082
> >>>>>>>>> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
> >>>>>>>>> Copyright (C) 2009 Free Software Foundation, Inc.
> >>>>>>>>> ...
> >>>>>>>>> Core was generated by `mpirun -np 4 -cpus-per-proc 2 -report-bindings mPre'.
> >>>>>>>>> Program terminated with signal 11, Segmentation fault.
> >>>>>>>>> #0 0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
> >>>>>>>>> 631 peer = OBJ_NEW(mca_oob_tcp_peer_t);
> >>>>>>>>> (gdb) where
> >>>>>>>>> #0 0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
> >>>>>>>>> #1 0x00002b5f861ca20b in recv_handler (sd=1778385023, flags=32767, cbdata=0x8eb06a00007fff25) at ./oob_tcp.c:760
> >>>>>>>>> #2 0x00002b5f848eb06a in event_process_active_single_queue (base=0x5f848eb27000007f, activeq=0x848eb27000007fff) at ./event.c:1366
> >>>>>>>>> #3 0x00002b5f848eb270 in event_process_active (base=0x5f848eb84900007f) at ./event.c:1435
> >>>>>>>>> #4 0x00002b5f848eb849 in opal_libevent2021_event_base_loop (base=0x4077a000007f, flags=32767) at ./event.c:1645
> >>>>>>>>> #5 0x00000000004077a0 in orterun (argc=7, argv=0x7fff25bbd4a8) at ./orterun.c:1030
> >>>>>>>>> #6 0x00000000004067fb in main (argc=7, argv=0x7fff25bbd4a8) at ./main.c:13
> >>>>>>>>> (gdb) quit
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Line 627 in orte/mca/oob/tcp/oob_tcp.c, which takes the address of
> >>>>>>>>> peer->name while peer is still NULL, is apparently unnecessary and
> >>>>>>>>> causes the segfault.
> >>>>>>>>>
> >>>>>>>>> 624 /* lookup the corresponding process */
> >>>>>>>>> 625 peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
> >>>>>>>>> 626 if (NULL == peer) {
> >>>>>>>>> 627 ui64 = (uint64_t*)(&peer->name);
> >>>>>>>>> 628 opal_output_verbose(OOB_TCP_DEBUG_CONNECT, orte_oob_base_framework.framework_output,
> >>>>>>>>> 629 "%s mca_oob_tcp_recv_connect: connection from new peer",
> >>>>>>>>> 630 ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
> >>>>>>>>> 631 peer = OBJ_NEW(mca_oob_tcp_peer_t);
> >>>>>>>>> 632 peer->mod = mod;
> >>>>>>>>> 633 peer->name = hdr->origin;
> >>>>>>>>> 634 peer->state = MCA_OOB_TCP_ACCEPTING;
> >>>>>>>>> 635 ui64 = (uint64_t*)(&peer->name);
> >>>>>>>>> 636 if (OPAL_SUCCESS != opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
> >>>>>>>>> 637 OBJ_RELEASE(peer);
> >>>>>>>>> 638 return;
> >>>>>>>>> 639 }
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Please fix this mistake in the next release.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Tetsuya Mishima
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> users mailing list
> >>>>>>>>> users_at_[hidden]
> >>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users