Open MPI User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-02-07 09:36:43

On Feb 6, 2006, at 5:25 PM, Warner Yuen wrote:

> Brian help!!!!!! :-)
> On Feb 5, 2006, at 9:00 AM, users-request_at_[hidden] wrote:
>>> If this is the case, my next question is, how do I supply the usual
>>> xgrid options, such as working directory, standard input file, etc?
>>> Or is that simply not possible?
>>> Do I simply have to have some other way (eg ssh) to get files to/
>>> from agent machines, like I would for a batch system like PBS?
>> It looks like I never implemented those options (shame on me). I've
>> added that to my to-do list, although I can't give an accurate time-
>> table for implementation at this point. One thing to note is that
>> rather than using XGrid's standard input/output forwarding services,
>> we use Open MPI's services. So if you do:
>> mpirun -np 2 ./myapp < foo.txt
> Under Xgrid with Open MPI, I'm trying to run applications that
> require more than just reading standard input/output but also the
> creation and writing of other intermediate files. For an
> application like HP Linpack that just reads and writes one
> file, things work fine. My guess is that this is where things are
> getting hung up. Below, my application was trying to write out a
> file called "testrun.nex.run1.p" and failed. The MrBayes
> application writes out two files for each mpi process.
> Initial log likelihoods for run 1:
> Chain 1 -- -429.987779
> Chain 2 -- -386.761468
> Could not open file "testrun.nex.run1.p"
> Memory allocation error on at least one processor
> Error in command "Mcmc"
> There was an error on at least one processor
> Error in command "Execute"
> Will exit with signal 1 (error) because quitonerror is set to yes
> Am I just misunderstanding how to set up Open MPI to work with Xgrid?

Ah, yes, this would make sense. When password authentication is used
to authenticate to an XGrid controller, all jobs run as user
'nobody'. So all the files that MrBayes (for example) is trying to
read/write must be accessible to user 'nobody'. If the files
only need to be read, making them (and your home directory itself)
world readable is an option. If the files need to be written, then
there's a bit of a problem, since you probably (in general) don't
want to allow user nobody to write all over your home directory. One
solution (if possible) would be to have the application write into
/tmp and then collect the files after the job completes.
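
As a rough sketch of both suggestions: the function names and the
"mrbayes" directory prefix below are made up for illustration, and
this assumes the directories involved are visible on the node where
the process actually runs. Nothing here is specific to Open MPI or
XGrid; it is plain chmod and mktemp.

```sh
#!/bin/sh
# Sketch only: make_readable, make_scratch and the "mrbayes" prefix are
# hypothetical names, not part of Open MPI or XGrid.

# Read-only inputs: let other users (including 'nobody') enter the
# directory and read one file in it.
make_readable() {
    dir=$1
    file=$2
    chmod o+rx "$dir" && chmod o+r "$dir/$file"
}

# Outputs: create a scratch directory under /tmp (or $TMPDIR) that any
# user can write into. Mode 1777 is world-writable with the sticky bit,
# the same mode /tmp itself normally has.
make_scratch() {
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/mrbayes.XXXXXX") || return 1
    chmod 1777 "$scratch" || return 1
    echo "$scratch"
}
```

For instance, you might call make_readable "$HOME" testrun.nex before
submitting the job, and have the application write into the directory
printed by make_scratch rather than into your home directory.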

If Kerberos authentication (aka Single Signon) is used for controller
authentication, then the processes started by XGrid run as the user
who submitted the job. This makes I/O on the compute nodes
significantly easier, but setting up the grid is more difficult. All
the computers have to use the same Kerberos authentication realm, and
I think there are some other restrictions. Also, because I didn't
have access to such a setup, Open MPI 1.0.x does not support process
startup with single signon authentication. This is something I'm
hoping to have fixed for Open MPI 1.1, if I can find a properly
configured cluster to test on.

Hope this made some sense...


   Brian Barrett
   Open MPI developer