On Jul 14, 2008, at 5:18 PM, Sean Hefty wrote:
>> Open MPI certainly could be buggy with IBCM, of course -- but it's
>> fishy that the same exact "mpirun ..." command line works one time
>> and fails the next (it's kinda random when the problem occurs).
>
> I just want to make sure that service ID collision isn't the issue.
> (It may be unlikely, but it could happen.) Using the PID is random,
> and could cause conflicts with other services, depending on the value
> that's used. I know SDP reserves ranges of service ID values.
Ah! I did not realize that there were other services on the machine
that were using / reserving IBCM service IDs.
Is there a service ID range that is guaranteed to be available for
user apps?
> Is the service ID specified in host or network order?
Host order -- just the result of getpid().
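
To make the byte-order point concrete, here is a minimal sketch (not the actual Open MPI code) of what "host order" means for a PID-derived service ID. It assumes the service ID is a 64-bit value; the helper names service_id_from_pid, service_id_to_network_bytes, and swap64 are hypothetical and exist only for this illustration. No IBCM calls are made.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: widen a PID to a 64-bit service ID, host order.
 * This mirrors "just the result of getpid()" and nothing more. */
static uint64_t service_id_from_pid(pid_t pid)
{
    return (uint64_t)pid;
}

/* Hypothetical helper: write the ID as 8 big-endian (network order)
 * bytes, regardless of the host's own endianness. */
static void service_id_to_network_bytes(uint64_t id, unsigned char out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(id >> (8 * (7 - i)));
    }
}

/* Hypothetical helper: reverse the byte order of a 64-bit value. On a
 * little-endian host this is the value a peer would see if a host-order
 * ID were interpreted as network order without conversion. */
static uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xffu);
        v >>= 8;
    }
    return r;
}
```

Calling service_id_from_pid(getpid()) gives the host-order value. On a little-endian machine, an unconverted PID such as 0x1234 would read as 0x3412000000000000 in network order, so which ranges it might collide with depends on which interpretation the CM layer uses.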
> Do you know the range of
> PIDs? I can see if any well known apps might collide.
I never looked at the range of PIDs that failed. Pasha / Brad --
could you look into this? It might be that simple...
--
Jeff Squyres
Cisco Systems