I've used OMPI 1.0.1 with Xgrid. I don't think I ran into the same job hang that you describe, but I'll share what I found in case it helps you or someone else. The one thing I noticed is that, under Xgrid, OMPI does not seem to let an MPI application write to a file; output only works when it goes to standard output.
For example, when I run HP Linpack (HPL) over an Xgrid-enabled OMPI and let HPL print its results to the screen, everything runs fine. However, if I set my hpl.dat file to write the results to a file, I get an error:
With 'hpl.dat' set to write to an output file called 'HPL.out', executing mpirun -d -hostfile myhosts -np 4 ./xhpl gives:
[portal.private:00545] [0,1,0] ompi_mpi_init completed
HPL ERROR from process # 0, on line 318 of function HPL_pdinfo:
>>> cannot open file HPL.out. <<<
I've tested this with a couple of other applications as well. For now, the only workaround I've found is to make my working directory writable by user nobody, presumably because that is the user the Xgrid job runs as. Hope this helps.
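For anyone who wants to script that workaround, here is a minimal sketch in plain sh. The two function names are hypothetical helpers of my own, not part of Open MPI or Xgrid, and the sketch assumes the job really does run as nobody, so that the "other" permission bits on the directory are the ones that matter.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or Xgrid): succeed if
# "other" users, which includes nobody, may write to directory $1.
other_writable() {
    # `ls -ld` prints a mode string such as drwxr-xr-x;
    # character 9 of that string is the write bit for "other".
    mode=$(ls -ld "$1" | cut -c9)
    [ "$mode" = "w" ]
}

# Hypothetical helper: grant write access for "other" on directory $1.
# Assumption: the Xgrid job runs as nobody, as described above.
allow_other_write() {
    chmod o+w "$1"
}

# Example use, from the directory where mpirun is started:
#   other_writable . || allow_other_write .
```

Keep in mind that o+w lets every local account write to that directory, not just nobody, so it is probably best used on a scratch directory that holds only the job's output.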
Date: Mon, 20 Mar 2006 08:11:32 +0100
From: Frank <firstname.lastname@example.org>
Subject: Re: [OMPI users] Mac OS X 10.4.5 and XGrid, Open-MPI V1.0.1
This is the full -d output I got when running vhone with mpirun on the
Xgrid. The output is truncated because of the reported "hang".