Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Number of processes and spawn
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-03-01 09:41:51


http://www.open-mpi.org/nightly/trunk/

I'm not sure this patch will solve your problem, but it is worth a try.

On Mar 1, 2010, at 3:51 AM, Federico Golfrè Andreasi wrote:

> Ok, thank you!
>
> Where can I find instructions for downloading the developer's copy of Open MPI, if that is possible?
>
> I'd like to test it to be sure that the patch solves the problem.
>
> Can you let me know where that patch is available?
>
> Thank you very much,
>
> Federico
>
>
>
>
> 2010/2/27 Ralph Castain <rhc_at_[hidden]>
> Okay, thanks. It's the same problem as the other person encountered. Basically, it looks to OMPI as if you are launching more than 128 independent app contexts, and our arrays were limited to 128.
>
> He has provided a patch that I'll review (couple of things I'd rather change) and then apply to our developer's trunk. I would expect it to migrate over to the 1.4 release series at some point (can't guarantee which one).
>
>
> On Feb 27, 2010, at 6:47 AM, Federico Golfrè Andreasi wrote:
>
>> Hi,
>>
>> The program is executed as one application on 129 CPUs defined by the hostfile.
>> Then rank 0, inside the code, launches another program on 129 CPUs with a one-to-one mapping: rank 0 of the spawned job runs on the same host as rank 0 of the spawning one, and so on.
>> Executing the spawning program does not cause any problem,
>> but at the moment of spawning (with more than 128 CPUs) it hangs.
>>
>> Thank you!
>>
>> Federico
>>
>>
>>
>>
>> 2010/2/27 Ralph Castain <rhc_at_[hidden]>
>> Since another user was doing something that caused a similar problem, perhaps we are missing a key piece of info here. Are you launching one app_context across 128 nodes? Or are you launching 128 app_contexts, each on a separate node?
>>
>>
>> On Feb 26, 2010, at 10:23 AM, Federico Golfrè Andreasi wrote:
>>
>>> I'm doing some tests and it seems that it is not able to do a spawn multiple with more than 128 nodes.
>>>
>>> It just hangs, with no error message.
>>>
>>> What do you think? What can I try to understand the problem?
>>>
>>> Thanks,
>>>
>>> Federico
>>>
>>>
>>>
>>>
>>> 2010/2/26 Ralph Castain <rhc_at_[hidden]>
>>> No known limitations of which we are aware...the variables are all set to int32_t, so INT32_MAX would be the only limit I can imagine. In which case, you'll run out of memory long before you hit it.
>>>
>>>
>>> 2010/2/26 Federico Golfrè Andreasi <federico.golfre_at_[hidden]>
>>> Hi!
>>>
>>> Have you ever done any analysis to determine whether there is a limit on the number of nodes usable with Open MPI v1.4?
>>> Also when using the functions MPI_Comm_spawn or MPI_Comm_spawn_multiple?
>>>
>>> Thanks,
>>> Federico
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
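
For readers following the thread: the one-to-one placement described above is often expressed as one app context per host in a single MPI_Comm_spawn_multiple call, which is how a 129-host job can end up looking like more than 128 app contexts. The thread does not show the poster's code, so the following is only a minimal sketch under that assumption. The names spawn_plan, spawn_plan_create, spawn_plan_free, spawn_from_plan and the USE_MPI macro are hypothetical helpers invented for illustration; only the MPI_* calls and the reserved "host" info key come from the MPI standard. The MPI part is compiled only when USE_MPI is defined (for example with mpicc -DUSE_MPI), so the array-building part can be tried without an MPI installation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type (not part of MPI or Open MPI): the per-context
 * arguments that MPI_Comm_spawn_multiple takes, one entry per app context. */
typedef struct {
    int    count;     /* number of app contexts */
    char **commands;  /* commands[i]: executable for context i */
    int   *maxprocs;  /* maxprocs[i]: processes requested for context i */
    char **hosts;     /* hosts[i]: value for the reserved "host" info key */
} spawn_plan;

/* Copy a string with malloc so the plan owns its memory. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *d = malloc(n);
    if (d != NULL) memcpy(d, s, n);
    return d;
}

/* Release a plan; safe on NULL and on partially built plans. */
void spawn_plan_free(spawn_plan *p)
{
    if (p == NULL) return;
    for (int i = 0; i < p->count; i++) {
        if (p->commands != NULL) free(p->commands[i]);
        if (p->hosts != NULL) free(p->hosts[i]);
    }
    free(p->commands);
    free(p->maxprocs);
    free(p->hosts);
    free(p);
}

/* Build one single-process app context per host name.
 * The arrays are sized from nhosts, not from a compile-time constant. */
spawn_plan *spawn_plan_create(const char *command,
                              const char *const *hostnames, int nhosts)
{
    if (command == NULL || hostnames == NULL || nhosts <= 0) return NULL;

    spawn_plan *p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->count    = nhosts;
    p->commands = calloc((size_t)nhosts, sizeof *p->commands);
    p->maxprocs = calloc((size_t)nhosts, sizeof *p->maxprocs);
    p->hosts    = calloc((size_t)nhosts, sizeof *p->hosts);
    if (p->commands == NULL || p->maxprocs == NULL || p->hosts == NULL) {
        spawn_plan_free(p);
        return NULL;
    }
    for (int i = 0; i < nhosts; i++) {
        p->commands[i] = copy_string(command);
        p->hosts[i]    = copy_string(hostnames[i]);
        p->maxprocs[i] = 1;
        if (p->commands[i] == NULL || p->hosts[i] == NULL) {
            spawn_plan_free(p);
            return NULL;
        }
    }
    return p;
}

#ifdef USE_MPI
#include <mpi.h>

/* Collective over comm with rank 0 as root: spawn every context in the plan.
 * Each context gets its own MPI_Info carrying the "host" key. */
int spawn_from_plan(const spawn_plan *p, MPI_Comm comm, MPI_Comm *intercomm)
{
    MPI_Info *infos = malloc((size_t)p->count * sizeof *infos);
    if (infos == NULL) return MPI_ERR_NO_MEM;

    for (int i = 0; i < p->count; i++) {
        MPI_Info_create(&infos[i]);
        MPI_Info_set(infos[i], "host", p->hosts[i]);
    }
    int rc = MPI_Comm_spawn_multiple(p->count, p->commands, MPI_ARGVS_NULL,
                                     p->maxprocs, infos, 0, comm, intercomm,
                                     MPI_ERRCODES_IGNORE);
    for (int i = 0; i < p->count; i++) {
        MPI_Info_free(&infos[i]);
    }
    free(infos);
    return rc;
}
#endif
```

A caller would build the plan from its host list (for instance spawn_plan_create("./worker", hostnames, 129), where "./worker" and the host names are placeholders) and pass it to spawn_from_plan. With 129 hosts this produces 129 app contexts, which is the case the thread reports as hanging on v1.4. Whether a given Open MPI build handles it depends on whether it includes the patch mentioned above.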