Thanks to everyone who tried to help diagnose the problem and shared their thoughts in this thread. That gave me more clues, and the courage to push back on our vendor.
My question on the Torque list is still awaiting replies...
Best regards to you all,
At 11:22 AM 9/30/2008 +1000, you wrote:
>On Mon, 2008-09-29 at 17:30 -0500, Zhiliang Hu wrote:
>> >As you blank out some addresses: have the nodes and the headnode one
>> >or two network cards installed? All the names like node001 et al. are
>> >known on each node by the correct address? I.e. 172.16.100.1 = node001?
>> >-- Reuti
>> There should be no problem in this regard -- the set up is by a
>> commercial company. I can ssh from any node to any node (passwdless).
>Your faith in commercial enterprises is touching. Unfortunately, it's
>at odds with my experience, on two continents.
>Like Reuti said, if you paid someone to set up a cluster to run parallel
>jobs and it won't run parallel jobs, then yell at them loud and long.
>I'll also reiterate that this sounds like a PBS problem rather than
>(yet) an OpenMPI problem. It seems you left the PBS discussion