
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] mpi_init waits 64 seconds if vpn is connected
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-03-22 00:25:45

The process is hanging trying to open a TCP connection back to mpirun. I would have thought that excluding the vpn interface would help, but it could be that there is still some interference from the vpn software itself - as you probably know, vpn generally tries to restrict connections.

I don't recall seeing this behavior with my laptop (which also runs with a Cisco vpn), but I'll check it again in the morning and let you know.

On Mar 21, 2013, at 6:52 PM, David A. Boger <dab143_at_[hidden]> wrote:

> I am having a problem on my linux desktop where mpi_init hangs for approximately 64 seconds if I have my vpn client connected but runs immediately if I disconnect the vpn. I've picked through the FAQ and Google but have failed to come up with a solution.
> Some potentially relevant information: I am using Open MPI 1.4.3 under ubuntu 12.04.1 and Cisco AnyConnect VPN Client. (I have also downloaded openmpi 1.6.4 and built it from source but believe it behaves the same way.)
> Some potentially irrelevant information: I believe SSH tunneling is disabled by the vpn. While the vpn is connected, ifconfig shows an extra interface (cscotun0 with inet addr: that shows up in the contact.txt file:
> wt217:~/wrk/mpi> cat /tmp/openmpi-sessions-dab143_at_wt217_0/29142/contact.txt
> 1909850112.0;tcp://;tcp://
> 22001
> The code is simply
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char** argv)
> {
>     MPI_Init(&argc, &argv);
>     MPI_Finalize();
>     return 0;
> }
> I compile it using "mpicc -g mpi_hello.c -o mpi_hello" and execute it using "mpirun -d -v ./mpi_hello". (The problem occurs whether or not I ask for more than one processor.) With verbosity on, I get the following output:
> wt217:~/wrk/mpi> mpirun -d -v ./mpi_hello
> [wt217:22015] procdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/0/0
> [wt217:22015] jobdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/0
> [wt217:22015] top: openmpi-sessions-dab143_at_wt217_0
> [wt217:22015] tmp: /tmp
> [wt217:22015] [[29144,0],0] node[0].name wt217 daemon 0 arch ffc91200
> [wt217:22015] Info: Setting up debugger process table for applications
> MPIR_being_debugged = 0
> MPIR_debug_state = 1
> MPIR_partial_attach_ok = 1
> MPIR_i_am_starter = 0
> MPIR_proctable_size = 1
> MPIR_proctable:
> (i, host, exe, pid) = (0, wt217, /home/dab143/wrk/mpi/./mpi_hello, 22016)
> [wt217:22016] procdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/1/0
> [wt217:22016] jobdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/1
> [wt217:22016] top: openmpi-sessions-dab143_at_wt217_0
> [wt217:22016] tmp: /tmp
> <hangs for approximately 64 seconds>
> [wt217:22016] [[29144,1],0] node[0].name wt217 daemon 0 arch ffc91200
> [wt217:22016] sess_dir_finalize: proc session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: proc session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: job session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: proc session dir not empty - leaving
> orterun: exiting with status 0
> The code hangs for approximately 64 seconds after the line that reads "tmp: /tmp".
> If I attach gdb to the process during this time, the stack trace (attached) shows that the pause is in __GI___poll in /sysdeps/unix/sysv/linux/poll.c:83.
> If I add "-mca oob_tcp_if_exclude cscotun0", then the corresponding address for that vpn interface no longer shows up in contact.txt, but the problem remains. I also tried adding "-mca btl ^cscotun0 -mca btl_tcp_if_exclude cscotun0", with no effect.
> Any idea what is hanging this up or how I can get more information as to what is going on during the pause? I assume connecting the vpn has caused mpi_init to look for something that isn't available and that eventually times out, but I don't know what.
> Output from ompi_info and the gdb stack trace is attached.
> Thanks,
> David
> <stack.txt.bz2><ompi_info.txt.bz2>
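To see which IPv4 interfaces the machine actually exposes while the vpn is connected, and which would remain after excluding cscotun0 by name, something like the following can be used in place of reading ifconfig output by eye. It is a sketch under stated assumptions: `name_in_list` and `list_ipv4_interfaces` are hypothetical helpers built on the standard getifaddrs() call, and the name matching here is not a description of how Open MPI implements oob_tcp_if_exclude.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: return 1 if name exactly matches one entry of the
 * comma-separated list csv, else 0. A NULL or empty list matches nothing. */
int name_in_list(const char *name, const char *csv)
{
    size_t n = strlen(name);
    const char *p = csv;

    while (p && *p) {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);
        if (len == n && strncmp(p, name, n) == 0)
            return 1;
        p = comma ? comma + 1 : NULL;
    }
    return 0;
}

/* Hypothetical helper: print every IPv4 interface and mark those named in
 * exclude_csv. Returns the number of interfaces left after exclusion,
 * or -1 if getifaddrs() fails. */
int list_ipv4_interfaces(const char *exclude_csv)
{
    struct ifaddrs *ifs, *it;
    int kept = 0;

    if (getifaddrs(&ifs) != 0)
        return -1;

    for (it = ifs; it != NULL; it = it->ifa_next) {
        char buf[INET_ADDRSTRLEN];

        /* Skip entries without an address and non-IPv4 entries. */
        if (it->ifa_addr == NULL || it->ifa_addr->sa_family != AF_INET)
            continue;

        struct sockaddr_in *sin = (struct sockaddr_in *)it->ifa_addr;
        inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf));

        int skip = name_in_list(it->ifa_name, exclude_csv);
        printf("%-10s %-15s %s\n", it->ifa_name, buf, skip ? "excluded" : "kept");
        if (!skip)
            kept++;
    }
    freeifaddrs(ifs);
    return kept;
}
```

Calling list_ipv4_interfaces("cscotun0") from a small main() with the vpn up and again with it down gives a quick side-by-side of what changes; if the only difference is the cscotun0 line and excluding it still leaves the 64-second pause, that points at the vpn software's filtering or routing rather than at the extra interface itself.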