You are correct, this is a ROCKS cluster. I didn't use the --sge option when building (I tend to stay more generic, but I should have done that).
I'm not sure of the OFED release; I don't admin this cluster, and the owners are picky about upgrades (they tend to break Lustre).
BTW - the problem was solved. There was a configuration error for the specific queue. It was found and fixed and things seem to be running normally.
Thanks for the help, and I'm sorry for disturbing everyone. I wasn't familiar enough with the error messages to tell if it was OpenMPI or SGE.
From: Joe Landman <landman_at_[hidden]>
To: Open MPI Users <users_at_[hidden]>
Sent: Monday, June 1, 2009 3:34:40 PM
Subject: Re: [OMPI users] Problem getting OpenMPI to run
Jeff Layton wrote:
> Jeff Squyres wrote:
>> On Jun 1, 2009, at 2:04 PM, Jeff Layton wrote:
>>> error: executing task of job 3084 failed: execution daemon on host
>>> "compute-2-2.local" didn't accept task
>> This looks like an error message from the resource manager/scheduler -- not from OMPI (i.e., OMPI tried to launch a process on a node and the launch failed because something rejected it).
>> Which one are you using?
When you built Open-MPI, did you use the SGE switch? Or if this is an OFED release, is it possible that this wasn't specified?
FWIW, this looks like a Rocks compute node ("compute-2-2.local" gives that away). The OFED Rolls in Rocks have had a few issues in the past with how they were built, so you may be running into that. If you didn't build it yourself, I'd suggest at least giving that a try.
Alternatively, OFED-1.4 is pretty good. It has a later version of Open-MPI than 1.3.x.
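For anyone who wants to check whether an existing Open MPI install was built with SGE support: in Open MPI 1.3 and later the configure switch is `--with-sge`, and the usual check is to look for `gridengine` components in the output of `ompi_info`. Below is a minimal Python sketch of that check. The function names `has_gridengine_support` and `check_local_install` are made up for illustration, and the matching is a simple substring test on the component lines, so treat it as a rough aid rather than an official tool.

```python
import shutil
import subprocess


def has_gridengine_support(ompi_info_output: str) -> bool:
    """Return True if any MCA component line mentions gridengine.

    Hypothetical helper: assumes component lines start with "MCA"
    after leading whitespace, as ompi_info normally prints them.
    """
    for line in ompi_info_output.splitlines():
        if line.strip().startswith("MCA") and "gridengine" in line:
            return True
    return False


def check_local_install() -> None:
    """Run ompi_info, if it is on PATH, and report what was found."""
    if shutil.which("ompi_info") is None:
        print("ompi_info not found on PATH")
        return
    result = subprocess.run(
        ["ompi_info"], capture_output=True, text=True, check=False
    )
    if has_gridengine_support(result.stdout):
        print("gridengine components present: built with SGE support")
    else:
        print("no gridengine components: likely built without SGE support")


if __name__ == "__main__":
    check_local_install()
```

If no gridengine components show up, rebuilding with `./configure --with-sge` is the thing to try. Note that this only checks the build; it would not have caught a queue configuration error on the SGE side.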
-- Joseph Landman, Ph.D
Founder and CEO
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615