Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problems with the -xterm option
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-04-28 11:04:56


No immediate suggestions - I won't get a chance to test this until later as I don't normally run an x11 server on my box, and don't have another way to test it.

On Apr 28, 2011, at 8:38 AM, jody wrote:

> Hi
>
> Unfortunately this does not solve my problem.
> While i can do
> ssh -Y squid_0 xterm
> and this will open an xterm on my machine (chefli),
> i run into problems with the -xterm option of openmpi:
>
> jody_at_chefli ~/share/neander $ mpirun -np 4 -mca plm_rsh_agent "ssh -Y" -host squid_0 --xterm 1 hostname
> squid_0
> [squid_0:28046] [[35219,0],1]->[[35219,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
> [squid_0:28046] [[35219,0],1] routed:binomial: Connection to lifeline [[35219,0],0] lost
> [squid_0:28046] [[35219,0],1]->[[35219,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
> [squid_0:28046] [[35219,0],1] routed:binomial: Connection to lifeline [[35219,0],0] lost
> /usr/bin/xterm Xt error: Can't open display: localhost:11.0
>
> By the way when i look at the DISPLAY variable in the xterm window
> opened via squid_0,
> i also have the display variable "localhost:11.0"
>
> Actually, the difference with using the "-mca plm_rsh_agent" is that
> the lines with the warnings about "xauth" and "untrusted X" do not
> appear:
>
> jody_at_chefli ~/share/neander $ mpirun -np 4 -host squid_0 -xterm 1 hostname
> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
> Warning: No xauth data; using fake authentication data for X11 forwarding.
> squid_0
> [squid_0:28337] [[34926,0],1]->[[34926,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
> [squid_0:28337] [[34926,0],1] routed:binomial: Connection to lifeline [[34926,0],0] lost
> [squid_0:28337] [[34926,0],1]->[[34926,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
> [squid_0:28337] [[34926,0],1] routed:binomial: Connection to lifeline [[34926,0],0] lost
> /usr/bin/xterm Xt error: Can't open display: localhost:11.0
>
>
> I have doubts that the "-Y" is passed correctly:
> jody_at_triops ~/share/neander $ mpirun -np -mca plm_rsh_agent "ssh -Y" -host squid_0 xterm
> xterm Xt error: Can't open display:
> xterm: DISPLAY is not set
> xterm Xt error: Can't open display:
> xterm: DISPLAY is not set
>
>
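One way to test the doubt above about whether "-Y" actually reaches ssh might be to point plm_rsh_agent at a small wrapper script that records the arguments it is handed before it calls ssh itself. The sketch below shows only the logging part. The function name log_agent_args and the temporary log file are hypothetical, not anything Open MPI provides, and it assumes the launch agent may be any executable.

```shell
#!/bin/sh
# Hypothetical helper: append the argument vector it was given to a log
# file, one bracketed argument per line, so word splitting is visible.
log_agent_args() {
    log_file=$1
    shift
    printf 'agent argc=%d\n' "$#" >> "$log_file"
    for arg in "$@"; do
        printf '  [%s]\n' "$arg" >> "$log_file"
    done
}

# Illustrative call with made-up arguments (no ssh is started here).
demo_log=$(mktemp)
log_agent_args "$demo_log" squid_0 "remote command with spaces"
cat "$demo_log"
```

In a real wrapper the function would be called with "$@" and followed by something like `exec ssh -Y "$@"`; the log then shows exactly which options and host name mpirun passed along.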
> ---> as a matter of fact i noticed that the xterm option doesn't work locally:
> mpirun -np 4 -xterm 1 /usr/bin/printenv
> prints everything onto the console.
>
> Do you have any other suggestions i could try?
>
> Thank You
> Jody
>
> On Thu, Apr 28, 2011 at 3:06 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> Should be able to just set
>>
>> -mca plm_rsh_agent "ssh -Y"
>>
>> on your cmd line, I believe
>>
>> On Apr 28, 2011, at 12:53 AM, jody wrote:
>>
>>> Hi Ralph
>>>
>>> Is there an easy way i could modify the OpenMPI code so that it would use
>>> the -Y option for ssh when connecting to remote machines?
>>>
>>> Thank You
>>> Jody
>>>
>>> On Thu, Apr 7, 2011 at 4:01 PM, jody <jody.xha_at_[hidden]> wrote:
>>>> Hi Ralph
>>>> thank you for your suggestions. After some fiddling, i found that after my
>>>> last update (gentoo) my sshd_config had been overwritten
>>>> (X11Forwarding was set to 'no').
>>>>
>>>> After correcting that, i can now open remote terminals with 'ssh -Y'
>>>> and with 'ssh -X'
>>>> (but with '-X' i still get those xauth warnings)
>>>>
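Since a system update silently reset X11Forwarding here, a quick check of that setting may be worth repeating after upgrades. The sketch below assumes the usual sshd_config conventions (keywords are case-insensitive, the first value found is the one used, and the default is "no") and whitespace-separated "keyword value" lines. It ignores Match blocks and Include files, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the X11Forwarding value found in an
# sshd_config-style file, or "no" (the OpenSSH default) if none is set.
x11_forwarding_setting() {
    awk '
        tolower($1) == "x11forwarding" { print tolower($2); found = 1; exit }
        END { if (!found) print "no" }
    ' "$1"
}

# Illustrative run against a throwaway file, not the real configuration.
sample=$(mktemp)
printf '%s\n' '#X11Forwarding yes' 'X11Forwarding no' > "$sample"
x11_forwarding_setting "$sample"
rm -f "$sample"
```

On a real machine the argument would typically be /etc/ssh/sshd_config on the host that runs sshd; where available, `sshd -T` reports the effective configuration and is the more authoritative check.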
>>>> But the xterm option still doesn't work:
>>>> jody_at_chefli ~/share/neander $ mpirun -np 4 -host squid_0 -xterm 1,2 printenv | grep WORLD_RANK
>>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>>> /usr/bin/xterm Xt error: Can't open display: localhost:11.0
>>>> /usr/bin/xterm Xt error: Can't open display: localhost:11.0
>>>> OMPI_COMM_WORLD_RANK=0
>>>> [aim-squid_0:09856] [[54132,0],1]->[[54132,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
>>>> [aim-squid_0:09856] [[54132,0],1] routed:binomial: Connection to lifeline [[54132,0],0] lost
>>>>
>>>> So it looks like the two processes from squid_0 can't open the display this way,
>>>> but one of them writes the output to the console...
>>>> Surprisingly, they are trying 'localhost:11.0' whereas when i use 'ssh -Y' the
>>>> DISPLAY variable is set to 'localhost:10.0'
>>>>
>>>> So in what way would OMPI have to be adapted, so -xterm would work?
>>>>
>>>> Thank You
>>>> Jody
>>>>
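On the 'localhost:10.0' versus 'localhost:11.0' observation above: with OpenSSH, each forwarded session normally gets its own proxy display on the remote side, numbered upward from X11DisplayOffset (10 by default), so a second concurrent session would plausibly see ':11'. Whether that is what happens under mpirun here is an assumption, not something the thread confirms. By convention, display number N corresponds to TCP port 6000+N. The helper below, whose name display_to_port is made up for illustration, only does that arithmetic, so one can then check on the remote host whether anything is listening on the resulting port.

```shell
#!/bin/sh
# Hypothetical helper: map a DISPLAY value such as "localhost:11.0" to the
# conventional X11 TCP port (6000 + display number).
display_to_port() {
    disp=${1##*:}      # drop the host part up to the last colon -> "11.0"
    disp=${disp%%.*}   # drop the optional ".screen" suffix      -> "11"
    echo $((6000 + disp))
}

display_to_port "localhost:10.0"
display_to_port "localhost:11.0"
```

For the two values from the message this prints 6010 and 6011. If nothing listens on the second port on squid_0, the xterm error would be consistent with the forwarding channel for that display having already gone away.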
>>>> On Wed, Apr 6, 2011 at 8:32 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>> Here's a little more info - it's for Cygwin, but I don't see anything
>>>>> Cygwin-specific in the answers:
>>>>> http://x.cygwin.com/docs/faq/cygwin-x-faq.html#q-ssh-no-x11forwarding
>>>>>
>>>>> On Apr 6, 2011, at 12:30 PM, Ralph Castain wrote:
>>>>>
>>>>> Sorry Jody - I should have read your note more carefully to see that you
>>>>> already tried -Y. :-(
>>>>> Not sure what to suggest...
>>>>>
>>>>> On Apr 6, 2011, at 12:29 PM, Ralph Castain wrote:
>>>>>
>>>>> Like I said, I'm no expert. However, a quick "google" revealed this
>>>>> result:
>>>>>
>>>>> When trying to set up x11 forwarding over an ssh session to a remote server
>>>>> with the -X switch, I was getting an error like Warning: No xauth
>>>>> data; using fake authentication data for X11 forwarding.
>>>>>
>>>>> When doing something like:
>>>>> ssh -Xl root 10.1.1.9 to a remote server, the authentication worked, but I
>>>>> got an error message like:
>>>>>
>>>>>
>>>>> jason_at_badman ~/bin $ ssh -Xl root 10.1.1.9
>>>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>>>> Last login: Wed Apr 14 18:18:39 2010 from 10.1.1.5
>>>>> [root_at_RHEL ~]#
>>>>> and any X programs I ran would not display on my local system.
>>>>>
>>>>> Turns out the solution is to use the -Y switch instead.
>>>>>
>>>>> ssh -Yl root 10.1.1.9
>>>>>
>>>>> and that worked fine.
>>>>>
>>>>> See if that works for you - if it does, we may have to modify OMPI to
>>>>> accommodate.
>>>>>
>>>>> On Apr 6, 2011, at 9:19 AM, jody wrote:
>>>>>
>>>>> Hi Ralph
>>>>> No, after the above error message mpirun has exited.
>>>>>
>>>>> But i also noticed that it is not possible to ssh into squid_0 and open an xterm there:
>>>>>
>>>>> jody_at_chefli ~/share/neander $ ssh -Y squid_0
>>>>> Last login: Wed Apr 6 17:14:02 CEST 2011 from chefli.uzh.ch on pts/0
>>>>> jody_at_squid_0 ~ $ xterm
>>>>> xterm Xt error: Can't open display:
>>>>> xterm: DISPLAY is not set
>>>>> jody_at_squid_0 ~ $ export DISPLAY=130.60.126.74:0.0
>>>>> jody_at_squid_0 ~ $ xterm
>>>>> xterm Xt error: Can't open display: 130.60.126.74:0.0
>>>>> jody_at_squid_0 ~ $ export DISPLAY=chefli.uzh.ch:0.0
>>>>> jody_at_squid_0 ~ $ xterm
>>>>> xterm Xt error: Can't open display: chefli.uzh.ch:0.0
>>>>> jody_at_squid_0 ~ $ exit
>>>>> logout
>>>>>
>>>>> same thing with ssh -X, but here i get the same warning/error message
>>>>> as with mpirun:
>>>>>
>>>>> jody_at_chefli ~/share/neander $ ssh -X squid_0
>>>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>>>> Last login: Wed Apr 6 17:12:31 CEST 2011 from chefli.uzh.ch on ssh
>>>>>
>>>>> So perhaps the whole problem is linked to that xauth-thing.
>>>>> Do you have a suggestion how this can be solved?
>>>>>
>>>>> Thank You
>>>>> Jody
>>>>>
>>>>> On Wed, Apr 6, 2011 at 4:41 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>>
>>>>> If I read your error messages correctly, it looks like mpirun is crashing -
>>>>> the daemon is complaining that it lost the socket connection back to mpirun,
>>>>> and hence will abort.
>>>>>
>>>>> Are you seeing mpirun still alive?
>>>>>
>>>>>
>>>>> On Apr 5, 2011, at 4:46 AM, jody wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> On my workstation and the cluster i set up OpenMPI (v 1.4.2) so that
>>>>> it works in "text-mode":
>>>>>
>>>>> $ mpirun -np 4 -x DISPLAY -host squid_0 printenv | grep WORLD_RANK
>>>>> OMPI_COMM_WORLD_RANK=0
>>>>> OMPI_COMM_WORLD_RANK=1
>>>>> OMPI_COMM_WORLD_RANK=2
>>>>> OMPI_COMM_WORLD_RANK=3
>>>>>
>>>>> but when i use the -xterm option to mpirun, it doesn't work:
>>>>>
>>>>> $ mpirun -np 4 -x DISPLAY -host squid_0 -xterm 1,2 printenv | grep WORLD_RANK
>>>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>>>> OMPI_COMM_WORLD_RANK=0
>>>>> [squid_0:05266] [[55607,0],1]->[[55607,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 8]
>>>>> [squid_0:05266] [[55607,0],1] routed:binomial: Connection to lifeline [[55607,0],0] lost
>>>>> /usr/bin/xterm Xt error: Can't open display: chefli.uzh.ch:0.0
>>>>> /usr/bin/xterm Xt error: Can't open display: chefli.uzh.ch:0.0
>>>>>
>>>>> (strange: somebody wrote his message to the console)
>>>>>
>>>>> No matter whether i set the DISPLAY variable to the full hostname of
>>>>> the workstation, to the IP address of the workstation or simply to
>>>>> ":0.0", it doesn't work.
>>>>>
>>>>> But i do have xauth data (as far as i know):
>>>>>
>>>>> On the remote (squid_0):
>>>>>
>>>>> jody_at_squid_0 ~ $ xauth list
>>>>> chefli/unix:10 MIT-MAGIC-COOKIE-1 5293e179bc7b2036d87cbcdf14891d0c
>>>>> chefli/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b
>>>>> chefli.uzh.ch:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b
>>>>>
>>>>> on the workstation:
>>>>>
>>>>> $ xauth list
>>>>> chefli/unix:10 MIT-MAGIC-COOKIE-1 5293e179bc7b2036d87cbcdf14891d0c
>>>>> chefli/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b
>>>>> localhost.localdomain/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b
>>>>> chefli.uzh.ch/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b
>>>>>
>>>>> In sshd_config on the workstation i have 'X11Forwarding yes'
>>>>>
>>>>> I have also done
>>>>> xhost + squid_0
>>>>> on the workstation.
>>>>>
>>>>> How can i get the -xterm option running?
>>>>>
>>>>> Thank You
>>>>> Jody
>>>>>
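The two 'xauth list' outputs in the message above can also be compared mechanically rather than by eye. The sketch below pulls the cookie for a given display name out of saved 'xauth list' output, using lines from the message as sample data. The name cookie_for_display is hypothetical, and matching cookies only show that both sides hold the same key, not that forwarding is otherwise set up correctly.

```shell
#!/bin/sh
# Hypothetical helper: read `xauth list` output on stdin and print the
# cookie (third field) recorded for the display name given as $1.
cookie_for_display() {
    awk -v d="$1" '$1 == d { print $3; exit }'
}

# Sample data copied from the two listings quoted in the message.
remote_list='chefli/unix:10 MIT-MAGIC-COOKIE-1 5293e179bc7b2036d87cbcdf14891d0c
chefli/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b'
local_list='chefli/unix:10 MIT-MAGIC-COOKIE-1 5293e179bc7b2036d87cbcdf14891d0c
chefli/unix:0 MIT-MAGIC-COOKIE-1 146c7f438fab79deb8a8a7df242b6f4b'

remote_cookie=$(printf '%s\n' "$remote_list" | cookie_for_display chefli/unix:10)
local_cookie=$(printf '%s\n' "$local_list" | cookie_for_display chefli/unix:10)

if [ "$remote_cookie" = "$local_cookie" ]; then
    echo "cookies match for chefli/unix:10"
else
    echo "cookies differ for chefli/unix:10"
fi
```

In practice the two variables would be filled from `xauth list` run on each machine, and the display name would be whatever DISPLAY is actually in effect for the failing xterm.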
>>>>> _______________________________________________
>>>>>
>>>>> users mailing list
>>>>>
>>>>> users_at_[hidden]
>>>>>
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users