Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Error on running large number of processes
From: Pak Lui (Pak.Lui_at_[hidden])
Date: 2007-11-15 13:38:30


I am assuming all the processes are running on a single SMP? You may
have tried this already, but since you seem to be using shared memory,
you may want to set mpool_sm_max_size to something other than the
default of 512MB.
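
As a rough sketch of how that parameter could be passed on the mpirun
command line, the Python helper below builds the argument list using the
"--mca <name> <value>" option. The process count of 200 comes from the
original report; the program name ./my_app and the 1 GiB value are
hypothetical, and the assumption that the value is given in bytes should
be checked against ompi_info for the installed version.

```python
import shlex


def build_mpirun_cmd(num_procs, program, sm_max_size_bytes):
    """Build an mpirun argument list that overrides mpool_sm_max_size.

    Assumes the MCA parameter takes a size in bytes (not verified here).
    """
    return [
        "mpirun",
        "--mca", "mpool_sm_max_size", str(sm_max_size_bytes),
        "-np", str(num_procs),
        program,
    ]


# Hypothetical values: 1 GiB instead of the 512MB default, 200 processes.
cmd = build_mpirun_cmd(200, "./my_app", 1024 * 1024 * 1024)

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into the PBS job script in place of the
existing mpirun invocation. Whether a larger pool actually avoids the
"Temporarily out of resource" error depends on how much memory the node
has available.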

Jeff Squyres wrote:
> My guess is that this is similar to the last post: you are
> oversubscribing the nodes so heavily that the OS is running out of
> some resources (perhaps regular or registered memory?) such that Open
> MPI is unable to set up its network transport layers properly.
>
>
> On Nov 15, 2007, at 6:35 AM, Clement Kam Man Chu wrote:
>
>> Hi,
>>
>> I am using Open MPI 1.2.3 on an ia64 machine with the PBS job
>> scheduler. I can successfully run 100 processes on 16 CPUs, but I get
>> an error if I run 200 processes on the same number of CPUs. The error
>> is:
>>
>> PML add procs failed
>> --> Returned "Temporarily out of resource" (-3) instead of
>> "Success" (0)
>>
>>
>> Please help.
>>
>> Regards,
>> Clement
>>
>> --
>> Clement Kam Man Chu
>> Research Assistant
>> Faculty of Information Technology
>> Monash University, Caulfield Campus
>> Ph: 61 3 9903 2355
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>

- Pak Lui
pak.lui_at_[hidden]