Joe,

You are correct, this is a ROCKS cluster. I didn't use the --with-sge option when building (I tend to stay more generic, but I should have done that).

I'm not sure which OFED release is installed. I don't admin this cluster, and the owners are picky about upgrades (they tend to break Lustre).

BTW - the problem was solved. There was a configuration error for the specific queue. It was found and fixed and things seem to be running normally.

Thanks for the help, and I'm sorry for disturbing everyone. I wasn't familiar enough with the error messages to tell whether the problem was in OpenMPI or SGE.

TIA!

Jeff



From: Joe Landman <landman@scalableinformatics.com>
To: Open MPI Users <users@open-mpi.org>
Sent: Monday, June 1, 2009 3:34:40 PM
Subject: Re: [OMPI users] Problem getting OpenMPI to run

Jeff Layton wrote:
> Jeff Squyres wrote:
>> On Jun 1, 2009, at 2:04 PM, Jeff Layton wrote:
>>
>>> error: executing task of job 3084 failed: execution daemon on host
>>> "compute-2-2.local" didn't accept task
>>>
>>
>> This looks like an error message from the resource manager/scheduler -- not from OMPI (i.e., OMPI tried to launch a process on a node and the launch failed because something rejected it).
>>
>> Which one are you using?

When you built Open-MPI, did you use the

    --with-sge

switch?  Or if this is an OFED release, is it possible that this wasn't specified?
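
One way to check an existing install, as a rough sketch: builds configured with --with-sge normally list "gridengine" components in the output of ompi_info. The snippet below assumes the ompi_info on your PATH belongs to the Open MPI install in question; has_sge_support is just a helper name made up for this example, not something Open MPI provides.

```shell
#!/bin/sh
# Report whether the Open MPI install found on PATH was built with SGE support.
# has_sge_support is a hypothetical helper name, not part of Open MPI.
has_sge_support() {
    # SGE-enabled builds list "gridengine" MCA components in ompi_info output.
    # If ompi_info is missing, grep sees no input and the check fails.
    ompi_info 2>/dev/null | grep -q gridengine
}

if has_sge_support; then
    echo "SGE support: yes"
else
    echo "SGE support: no (or ompi_info not found); consider rebuilding with ./configure --with-sge"
fi
```

If the check reports no support, the job may still start under SGE, but Open MPI will not use SGE's own launch mechanism for the remote processes.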

FWIW, this looks like a Rocks compute node ("compute-2-2.local" gives that away).  The OFED Rolls in Rocks have had a few issues in the past with how they were built, so you may be running into that.  If you didn't build it yourself, I'd suggest at least giving that a try.

Alternatively, OFED-1.4 is pretty good.  It has a later version of Open-MPI than 1.3.x.

Joe

>
> SGE
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics,
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
      http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615