On 12 Feb 2011, at 14:06, Ralph Castain wrote:
> Have you searched the email archive and/or web for openmpi and Amazon cloud? Others have previously worked through many of these problems for that environment - might be worth a look to see if someone already solved this, or at least a contact point for someone who is already running in that environment.
I've run Open MPI on Amazon EC2 for over a year and have never experienced any problems like those the original poster describes.
> IIRC, there are some unique problems with running on that platform.
None that I'm aware of.
EC2 really is no different from any other environment I've used, real or virtual. A simple download, ./configure, make and make install has always resulted in a working Open MPI, assuming a shared install location and home directory (for launching applications from).
When I'm using EC2 I tend to rename the machines to something that is easier to follow, typically "cloud[0-15].ec2" assuming I am running 16 machines. I change the hostname of each host and then write an /etc/hosts file to map each hostname to its internal IP address. I then export /home from cloud0.ec2 to all the other nodes and configure Open MPI with --prefix=/home/ashley/install so that the code is installed everywhere.
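A minimal sketch of the naming step, in Python, in case it helps: it builds the "cloudN.ec2" names and the matching /etc/hosts lines from a list of internal IP addresses. The function names and the 10.0.0.x addresses are made up for illustration; you would substitute the internal addresses EC2 actually gave your instances.

```python
def cluster_names(count):
    """Return the names cloud0.ec2 .. cloud<count-1>.ec2."""
    return [f"cloud{i}.ec2" for i in range(count)]


def hosts_file_lines(internal_ips):
    """Pair each internal IP with its cluster name, one /etc/hosts line each."""
    names = cluster_names(len(internal_ips))
    return [f"{ip}\t{name}" for ip, name in zip(internal_ips, names)]


if __name__ == "__main__":
    # Hypothetical internal addresses for a 16-machine session.
    ips = [f"10.0.0.{10 + i}" for i in range(16)]
    print("\n".join(hosts_file_lines(ips)))
```

The printed lines would be appended to /etc/hosts on every node, so that each machine resolves the same names to the same internal addresses.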
For EC2 instances I commonly use Fedora but have also used Ubuntu and Solaris; all have been fundamentally similar.
My other tip for using EC2 would be to use a persistent "home" folder by renting a disk partition and attaching it to the first instance you boot in a session. You pay for this by GB per month; I was able to use a 5GB device, which I mounted at /home on cloud0.ec2 and NFS exported to the other instances, again at /home. You'll need to add "ForwardAgent yes" to your personal .ssh/config to allow you to hop around inside the virtual cluster without entering a password. The persistent devices are called "Volumes" in EC2 speak; there is no need to create snapshots unless you want to share your volume with other people.
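For the ssh part, a small Python sketch of one way to add the setting without duplicating it. Scoping it to a "Host *.ec2" block is my assumption here (it matches the cloudN.ec2 names used above); the helper names are hypothetical, and the sketch only transforms text, so reading and writing ~/.ssh/config is left to you.

```python
# Block to add; the "Host *.ec2" pattern is an assumption for illustration.
FORWARD_BLOCK = "Host *.ec2\n    ForwardAgent yes\n"


def has_forward_agent(config_text):
    """True if any non-comment line already sets ForwardAgent to yes."""
    for line in config_text.splitlines():
        # ssh_config keywords are case-insensitive and may use '=' as separator.
        words = line.strip().lower().replace("=", " ").split()
        if words[:2] == ["forwardagent", "yes"]:
            return True
    return False


def ensure_forward_agent(config_text):
    """Return config_text with the forwarding block prepended if it is missing."""
    if has_forward_agent(config_text):
        return config_text
    if not config_text:
        return FORWARD_BLOCK
    return FORWARD_BLOCK + "\n" + config_text
```

The block is prepended rather than appended because ssh uses the first value it finds for each option, and anything placed after an existing Host line would only apply to that host.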
PS: I would recommend reading up on sudo and su; "sudo su" is not a command you should be typing.
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing