Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] mpi_init waits 64 seconds if vpn is connected
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-03-22 00:25:45


The process is hanging trying to open a TCP connection back to mpirun. I would have thought that excluding the vpn interface would help, but it could be that there is still some interference from the vpn software itself - as you probably know, vpn generally tries to restrict connections.
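
One rough way to see which address is stalling is to time a TCP connect to each endpoint that mpirun advertises. The sketch below is only an illustration, not anything Open MPI ships: it assumes POSIX sockets and the "id;tcp://ip:port;..." layout seen in the contact.txt quoted below, and parse_contact, timed_connect and struct endpoint are hypothetical helper names.

```c
/* Sketch: time a TCP connect to each address in an mpirun contact URI.
 * Assumes POSIX sockets; all helper names here are hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

struct endpoint {
    char ip[INET_ADDRSTRLEN]; /* dotted-quad IPv4 address */
    int port;
};

/* Extract up to max "tcp://ip:port" entries from uri; returns the count. */
static int parse_contact(const char *uri, struct endpoint *out, int max)
{
    int n = 0;
    const char *p = uri;

    while (n < max && (p = strstr(p, "tcp://")) != NULL) {
        p += 6; /* skip past "tcp://" */
        if (sscanf(p, "%15[0-9.]:%d", out[n].ip, &out[n].port) == 2)
            n++;
    }
    return n;
}

/* Non-blocking connect with a timeout. Returns 0 if the connection was
 * established, -1 on error or timeout. *elapsed_ms gets the time spent. */
static int timed_connect(const struct endpoint *ep, int timeout_ms,
                         long *elapsed_ms)
{
    struct sockaddr_in sa;
    struct timespec t0, t1;
    int rc = -1;

    *elapsed_ms = 0;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons((unsigned short)ep->port);
    if (inet_pton(AF_INET, ep->ip, &sa.sin_addr) != 1)
        return -1; /* not a valid IPv4 address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (connect(fd, (struct sockaddr *)&sa, sizeof sa) == 0) {
        rc = 0; /* connected immediately */
    } else if (errno == EINPROGRESS) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        if (poll(&pfd, 1, timeout_ms) == 1) {
            int err = 0;
            socklen_t len = sizeof err;
            /* SO_ERROR reports whether the pending connect succeeded */
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
                err == 0)
                rc = 0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    *elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1000L +
                  (t1.tv_nsec - t0.tv_nsec) / 1000000L;
    return rc;
}
```

Feeding it the first line of contact.txt while mpirun is still running, and calling timed_connect on each entry with a timeout of a few seconds, should show whether one address connects right away while another stalls or fails. It only probes reachability; it does not speak the OOB protocol.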

I don't recall seeing this behavior with my laptop (which also runs with a Cisco vpn), but I'll check it again in the morning and let you know.

On Mar 21, 2013, at 6:52 PM, David A. Boger <dab143_at_[hidden]> wrote:

> I am having a problem on my linux desktop where mpi_init hangs for approximately 64 seconds if I have my vpn client connected but runs immediately if I disconnect the vpn. I've picked through the FAQ and Google but have failed to come up with a solution.
>
> Some potentially relevant information: I am using Open MPI 1.4.3 under ubuntu 12.04.1 and Cisco AnyConnect VPN Client. (I have also downloaded openmpi 1.6.4 and built it from source but believe it behaves the same way.)
>
> Some potentially irrelevant information: I believe SSH tunneling is disabled by the vpn. While the vpn is connected, ifconfig shows an extra interface (cscotun0 with inet addr:10.248.17.27) that shows up in the contact.txt file:
>
> wt217:~/wrk/mpi> cat /tmp/openmpi-sessions-dab143_at_wt217_0/29142/contact.txt
> 1909850112.0;tcp://192.168.1.3:48237;tcp://10.248.17.27:48237
> 22001
>
> The code is simply
>
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char** argv)
> {
>     MPI_Init(&argc, &argv);
>     MPI_Finalize();
>     return 0;
> }
>
> I compile it using "mpicc -g mpi_hello.c -o mpi_hello" and execute it using "mpirun -d -v ./mpi_hello". (The problem occurs whether or not I ask for more than one processor.) With verbosity on, I get the following output:
>
> wt217:~/wrk/mpi> mpirun -d -v ./mpi_hello
> [wt217:22015] procdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/0/0
> [wt217:22015] jobdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/0
> [wt217:22015] top: openmpi-sessions-dab143_at_wt217_0
> [wt217:22015] tmp: /tmp
> [wt217:22015] [[29144,0],0] node[0].name wt217 daemon 0 arch ffc91200
> [wt217:22015] Info: Setting up debugger process table for applications
> MPIR_being_debugged = 0
> MPIR_debug_state = 1
> MPIR_partial_attach_ok = 1
> MPIR_i_am_starter = 0
> MPIR_proctable_size = 1
> MPIR_proctable:
> (i, host, exe, pid) = (0, wt217, /home/dab143/wrk/mpi/./mpi_hello, 22016)
> [wt217:22016] procdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/1/0
> [wt217:22016] jobdir: /tmp/openmpi-sessions-dab143_at_wt217_0/29144/1
> [wt217:22016] top: openmpi-sessions-dab143_at_wt217_0
> [wt217:22016] tmp: /tmp
> <hangs for approximately 64 seconds>
> [wt217:22016] [[29144,1],0] node[0].name wt217 daemon 0 arch ffc91200
> [wt217:22016] sess_dir_finalize: proc session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: proc session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: job session dir not empty - leaving
> [wt217:22015] sess_dir_finalize: proc session dir not empty - leaving
> orterun: exiting with status 0
>
> The code hangs for approximately 64 seconds after the line that reads "tmp: /tmp".
>
> If I attach gdb to the process during this time, the stack trace (attached) shows that the pause is in __GI___poll in /sysdeps/unix/sysv/linux/poll.c:83.
>
> If I add "-mca oob_tcp_if_exclude cscotun0", then the corresponding address for that vpn interface no longer shows up in contact.txt, but the problem remains. I also added "-mca btl ^cscotun0 -mca btl_tcp_if_exclude cscotun0" with no effect.
>
> Any idea what is hanging this up or how I can get more information as to what is going on during the pause? I assume connecting the vpn has caused mpi_init to look for something that isn't available and that eventually times out, but I don't know what.
>
> Output from ompi_info and the gdb stack trace is attached.
>
> Thanks,
> David
>
> <stack.txt.bz2> <ompi_info.txt.bz2>