
Open MPI User's Mailing List Archives


From: Dennis McRitchie (dmcr_at_[hidden])
Date: 2007-02-03 14:41:45


Sorry. I just realized that you must mean the log file created by the
configure script. I was looking for an Open MPI runtime log file. Is
there such a thing?

Dennis

> -----Original Message-----
> From: users-bounces_at_[hidden]
> [mailto:users-bounces_at_[hidden]] On Behalf Of Dennis McRitchie
> Sent: Friday, February 02, 2007 6:12 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Can't run simple job with openmpi
> using the Intel compiler
>
> Also, I see mention in your FAQ about config.log. My openmpi does not
> appear to be generating it, at least not anywhere in the install tree.
> How can I enable the creation of the log file?
>
> Thanks,
> Dennis
>
> -----Original Message-----
> From: Dennis McRitchie
> Sent: Friday, February 02, 2007 6:08 PM
> To: 'Open MPI Users'
> Subject: Can't run simple job with openmpi using the Intel compiler
>
> When I submit a simple job (described below) using PBS, I always get
> one of the following two errors:
>
> 1) [adroit-28:03945] [0,0,1]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104
>
> 2) [adroit-30:03770] [0,0,3]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=3770)
>
> The program does a uname and prints out results to standard out. The
> only MPI calls it makes are MPI_Init, MPI_Comm_size, MPI_Comm_rank,
> and MPI_Finalize. I have tried it with both openmpi v 1.1.2 and 1.1.4,
> built with Intel C compiler 9.1.045, and get the same results. But if
> I build the same versions of openmpi using gcc, the test program
> always works fine. The app itself is built with mpicc.
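[A minimal sketch of the kind of test program being described here, for anyone trying to reproduce this: this is a reconstruction under the stated assumptions (uname plus the four MPI calls listed), not the poster's actual source. Compile with mpicc and launch with mpiexec.

```c
/* Hypothetical reconstruction of the failing test program:
   prints uname(2) output tagged with the MPI rank and size. */
#include <mpi.h>
#include <stdio.h>
#include <sys/utsname.h>

int main(int argc, char **argv) {
    int rank, size;
    struct utsname u;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    uname(&u);
    printf("rank %d of %d: %s %s %s\n",
           rank, size, u.sysname, u.nodename, u.release);

    MPI_Finalize();
    return 0;
}
```
]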
>
> It runs successfully if run from the command line with "mpiexec -n X
> <test-program-name>", where X is 1 to 8, but if I wrap it in the
> following qsub command file:
> ---------------------------------------------------
> #PBS -l pmem=512mb,nodes=1:ppn=1,walltime=0:10:00
> #PBS -m abe
> # #PBS -o /home0/dmcr/my_mpi/curt/uname_test.gcc.stdout
> # #PBS -e /home0/dmcr/my_mpi/curt/uname_test.gcc.stderr
>
> cd /home/dmcr/my_mpi/openmpi
> echo "About to call mpiexec"
> module list
> mpiexec -n 1 uname_test.intel
> echo "After call to mpiexec"
> ----------------------------------------------------
>
> it fails on any number of processors from 1 to 8, and the application
> segfaults.
>
> The complete standard error of an 8-processor job follows (note that
> mpiexec ran on adroit-31, but usually there is no info about adroit-31
> in standard error):
> -------------------------
> Currently Loaded Modulefiles:
> 1) intel/9.1/32/C/9.1.045 4) intel/9.1/32/default
> 2) intel/9.1/32/Fortran/9.1.040 5) openmpi/intel/1.1.2/32
> 3) intel/9.1/32/Iidb/9.1.045
> Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
> Failing at addr:0x5
> [0] func:/usr/local/openmpi/1.1.4/intel/i386/lib/libopal.so.0
> [0xb72c5b]
> *** End of error message ***
> [adroit-29:03934] [0,0,2]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104
> [adroit-28:03945] [0,0,1]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104
> [adroit-30:03770] [0,0,3]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=3770)
> --------------------------
>
> The complete standard error of a 1-processor job follows:
> --------------------------
> Currently Loaded Modulefiles:
> 1) intel/9.1/32/C/9.1.045 4) intel/9.1/32/default
> 2) intel/9.1/32/Fortran/9.1.040 5) openmpi/intel/1.1.2/32
> 3) intel/9.1/32/Iidb/9.1.045
> Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
> Failing at addr:0x2
> [0] func:/usr/local/openmpi/1.1.2/intel/i386/lib/libopal.so.0
> [0x27d847]
> *** End of error message ***
> [adroit-31:08840] [0,0,1]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=8840)
> ---------------------------
>
> Any thoughts as to why this might be failing?
>
> Thanks,
> Dennis
>
> Dennis McRitchie
> Computational Science and Engineering Support (CSES)
> Academic Services Department
> Office of Information Technology
> Princeton University
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>