Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How do I integrate OpenMPI with a local cluster and EC2
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-04-19 08:08:45


To be clear, Open MPI essentially requires the ability to open arbitrary TCP ports between the nodes used in the job. (The actual requirement is a little less restrictive than that, but this version is easier to describe than the precise one.)
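
One rough way to check whether a firewall is in the way is to probe a few TCP ports on the remote node from the local side. The sketch below is only an illustration: the helper names `can_connect` and `probe_ports` are made up for this example, and the ports to probe are an assumption (something must already be listening on them on the remote side, for example sshd on port 22, or the probe will fail regardless of the firewall).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_ports(host, ports, timeout=3.0):
    """Map each port in `ports` to True/False depending on reachability."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Hypothetical usage (host taken from the hostfile in this thread, ports assumed):
#   probe_ports("ec2-174-129-183-64.compute-1.amazonaws.com", [22, 10000])
```

A failed probe does not say which side is blocking, and a successful probe on port 22 alone does not show that the other ports Open MPI picks will get through; the check would need to be repeated in both directions between every pair of nodes in the job.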

On Apr 17, 2010, at 10:03 PM, Ralph Castain wrote:

> I'm afraid you'll have to ask the EC2 folks - you probably need something to get through their firewall.
>
> If you just try "ssh ec2-174-129-183-64.compute-1.amazonaws.com hostname", does that work? I would just try to make that work first - once it does, so should mpirun.
>
> On Apr 17, 2010, at 4:39 PM, Theodore Van Rooy wrote:
>
>> Hi all,
>>
>> I'm trying to add EC2 instances to my local cluster with Open MPI. So far Open MPI works well on the local cluster, and I have set up passwordless SSH between the local cluster and the Amazon EC2 instance.
>>
>> However, when I add the public DNS name to a hostfile (defaulthostfiletest):
>>
>> comp1 slots=2 max-slots=8
>> comp2 slots=2 max-slots=8
>> comp3 slots=2 max-slots=4
>> ec2-174-129-183-64.compute-1.amazonaws.com slots=2 max-slots=2
>>
>> and then run:
>>
>> [/home/ntlp/cashmoney/mainFrame]$mpirun -np 6 --hostfile defaulthostfiletest hostname
>> foretell
>> foretell
>> augur
>> augur
>> predict
>> predict
>>
>> it works, but when I try to include the Amazon instance I get:
>>
>> [/home/ntlp/cashmoney/mainFrame]$mpirun -np 8 --hostfile defaulthostfiletest hostname (it hangs so I kill it)
>> ^C^Cmpirun: killing job...
>>
>> --------------------------------------------------------------------------
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun was unable to cleanly terminate the daemons on the nodes shown
>> below. Additional manual cleanup may be required - please refer to
>> the "orte-clean" tool for assistance.
>> --------------------------------------------------------------------------
>> ec2-174-129-183-64.compute-1.amazonaws.com - daemon did not report back when launched
>>
>> Any advice? Are there any settings in /etc/ssh/sshd_config that I might need to change?
>>
>> Theo
>> --
>> Theodore Van Rooy
>> http://greentheo.scroggles.com
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/