Some good news and some bad news. Following the information provided,
I have enabled X11Forwarding on all remote nodes,
added the path to mpirun, which is "/usr/local/bin", on all nodes,
run "xhost +" on my localhost, and
set DISPLAY on all remote nodes to the DISPLAY value of my localhost.
I then used "mpirun -n [numberPros] --hostfile [filename] arg .." to start
the job on my localhost, but it was still not working.
But when I explicitly added "--prefix /usr/local" and "-x
DISPLAY=[localDISPLAYvalue]" to start the job, everything worked: all
X windows opened by the remote nodes were displayed on my local machine. I
was so excited! Moreover, when I used the same --prefix and -x options to
run mpirun through XGrid (i.e. without adding --hostfile), it was still
Could you please tell me what I missed in setting up the remote
nodes as described above, so that I don't have to type all these options to
start the job?
(I have also tried adding the prefix "/usr/local" to the PATH of each
remote node, but it still did not work without the --prefix option.)
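The invocation that worked can be wrapped in a small shell function so the
options are not retyped each time. This is only a sketch, not a fix for the
underlying setup: mpirun_x, MPIRUN and MPIX_PREFIX are made-up names for
illustration, and it assumes Open MPI lives under /usr/local on every node
and that DISPLAY in the calling shell already holds a value the remote nodes
can reach.

```shell
# Hypothetical wrapper around the command line that worked above.
# Assumptions (not from Open MPI itself):
#   MPIRUN       launcher to call, defaults to "mpirun"
#   MPIX_PREFIX  install prefix passed to --prefix, defaults to /usr/local
mpirun_x() {
    if [ "$#" -lt 3 ]; then
        echo "usage: mpirun_x NPROCS HOSTFILE PROGRAM [ARGS...]" >&2
        return 2
    fi
    nprocs=$1
    hostfile=$2
    shift 2
    # --prefix tells the remote side where Open MPI is installed;
    # -x exports the given variable to the launched processes.
    "${MPIRUN:-mpirun}" --prefix "${MPIX_PREFIX:-/usr/local}" \
        -x "DISPLAY=$DISPLAY" \
        -n "$nprocs" --hostfile "$hostfile" "$@"
}
```

For example, "mpirun_x 4 myhosts ./myjob" (placeholder names) expands to the
same options as above; setting MPIRUN=echo prints the arguments instead of
launching anything, which is a cheap way to check what would be run.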
Thanks for any help.
On 10/26/07, Brian Barrett <brbarret_at_[hidden]> wrote:
> XGrid does not forward X11 credentials, so you would have to set up an
> X11 environment yourself. Using ssh or a local starter does
> forward X11 credentials, which is why it works in that case.
> On Oct 25, 2007, at 10:23 PM, Jinhui Qin wrote:
> > Hi Brian,
> > I ran into another problem running an MPI job through XGrid.
> > During execution, this MPI job calls Xlib functions
> > (e.g. XOpenDisplay()) to open an X window. The XOpenDisplay()
> > call fails (returns NULL); it cannot open a display no
> > matter how many processors I request.
> > However, when I turned off the XGrid controller and used
> > "mpirun -n 4" to start the job again, four X windows opened properly, but
> > the four processes were all running on the local machine instead of on any
> > remote nodes.
> > I have also tested using "ssh -X" from a terminal on my local
> > machine to log in to another node in the cluster and run the job
> > (I have copies of the same job on all nodes, in the same
> > path); the X window then displays on my local machine properly. I
> > know it is the "-X" option that sets up the environment properly for
> > starting the X window. With plain "ssh" and no "-X" option, it won't work.
> > I am wondering why the X window cannot open if the job is started
> > through XGrid. How does the XGrid controller contact each agent
> > node?
> > Is there anyone who has seen a similar problem?
> > I have installed X11 and OpenMPI on all 8 Mac mini nodes in my
> > cluster, and have also tested running an MPI job that makes no X
> > window function calls through XGrid; it worked perfectly fine on
> > all nodes.
> > Thanks a lot for any suggestions!
> > Jane
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel