Hi Brian,

I have some good news and some bad news. Following the information at http://www.open-mpi.org/faq/?category=running,

I have enabled X11 forwarding on all remote nodes,
added the path to mpirun, which is "/usr/local/bin", on all nodes,
run "xhost +" on my localhost, and
set DISPLAY on all remote nodes to the DISPLAY value of my localhost.
Then I used "mpirun -n [numberPros] --hostfile [filename] arg[0] .." to start the job from my localhost, but it still did not work.

But when I explicitly added "--prefix /usr/local" and "-x DISPLAY=[localDISPLAYvalue]" to start the job, everything worked: all X windows opened by the remote nodes were displayed on my local machine. I was so excited! Moreover, with these --prefix and -x options, running mpirun through XGrid (i.e. without --hostfile) also worked!
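
For reference, the invocation that worked can be sketched as a small shell helper. This is only a sketch: the function name mpirun_x11_cmd and the example arguments (4, hosts, ./a.out) are made up for illustration, and it assumes Open MPI is installed under /usr/local and that DISPLAY is set in the local shell.

```shell
# Hypothetical helper: print the mpirun command line described above.
# Arguments: number of processes, hostfile, program to run.
# Assumes Open MPI is installed under /usr/local (an assumption taken
# from this thread) and that DISPLAY is set in the calling shell.
mpirun_x11_cmd() {
    np=$1
    hostfile=$2
    prog=$3
    # --prefix tells mpirun where Open MPI lives on the remote nodes;
    # -x DISPLAY=value exports that value to every launched process.
    echo "mpirun -n $np --hostfile $hostfile --prefix /usr/local -x DISPLAY=$DISPLAY $prog"
}

# Example (only prints the command, does not run it):
#   mpirun_x11_cmd 4 hosts ./a.out
```

Printing the command instead of running it makes it easy to check what will be passed to mpirun before launching anything.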


Could you please tell me what I missed when setting up the remote nodes as described above, so that I don't have to type all these options to start the job?

(I have also tried adding the prefix "/usr/local" to the PATH of each remote node, but it still did not work without the --prefix option.)

Thanks for any help.

Jane


On 10/26/07, Brian Barrett <brbarret@open-mpi.org> wrote:
XGrid does not forward X11 credentials, so you would have to set up
an X11 environment yourself.  Using ssh or a local starter does
forward X11 credentials, which is why it works in that case.
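
One way to see the difference is to print the DISPLAY value that each launched process actually inherits. A minimal sketch, assuming a POSIX shell; the function name show_display is made up for illustration:

```shell
# Hypothetical diagnostic: report whether DISPLAY is set in the
# environment this process inherited from its launcher.
show_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY=$DISPLAY"
    else
        echo "DISPLAY is not set"
    fi
}
```

Saved as a script and started in place of the real program, this would show whether a given launch method passes DISPLAY through to the process.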

Brian

On Oct 25, 2007, at 10:23 PM, Jinhui Qin wrote:

> Hi Brian,
>    I ran into another problem running an MPI job through XGrid.
> During execution, the job calls Xlib functions (i.e.
> XOpenDisplay()) to open an X window.  The XOpenDisplay() call
> fails (returns NULL): it cannot open a display, no matter how
> many processors I request.
>
> However, when I turned off the XGrid controller and used
> "mpirun -n 4" to start the job again, four X windows opened
> properly, but all four processes ran on the local machine instead
> of on any remote nodes.
>
> I have also tried using "ssh -X" from a terminal on my local
> machine to log in to another node in the cluster and run the job
> there (I have copies of the same job on all nodes, in the same
> path); the X window displays on my local machine properly. I
> know it is the "-X" option that sets up the environment properly
> for opening the X window. With plain "ssh" and no "-X" option, it
> does not work.
>
> I am wondering why the X window cannot open when the job is
> started through XGrid.  How does the XGrid controller contact each
> agent node?
>
> Is there anyone who has seen a similar problem?
>
> I have installed X11 and Open MPI on all 8 Mac mini nodes in my
> cluster, and have also tested running an MPI job that makes no X
> window function calls through XGrid; it worked perfectly fine on
> all nodes.
>
> Thanks a lot for any suggestions!
>
> Jane
>
>
> _______________________________________________
> devel mailing list
> devel@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
