Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Number of processes and spawn
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-03-05 13:05:22


Hi Federico

I tested the trunk today and it works fine for me - I let it spin for 1000 cycles without issue. My test program is essentially identical to what you describe - you can see it in the orte/test/mpi directory. The "master" is loop_spawn.c, and the "slave" is loop_child.c. I only tested it on a single machine, though - will have to test multi-machine later. You might see if that makes a difference.

The error you report in your attachment is a classic symptom of mismatched versions. Remember, we don't forward your LD_LIBRARY_PATH, so it has to be correct on your remote machine.
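
One way to check for such a mismatch is to compare what ompi_info reports on each machine. The following is only a minimal Python sketch, not part of Open MPI. It assumes you have already captured the ompi_info output from every host yourself (for example over ssh), and that the output contains a line of the form "Open MPI: <version>". The function names and host names are hypothetical.

```python
import re

# Assumption: ompi_info prints a line such as "Open MPI: 1.7a1r24472".
# Lines like "Open MPI SVN revision: ..." are deliberately not matched.
_VERSION_LINE = re.compile(r"^\s*Open MPI:\s*(\S+)\s*$", re.MULTILINE)


def parse_ompi_version(ompi_info_text):
    """Return the version string found in ompi_info output, or None."""
    match = _VERSION_LINE.search(ompi_info_text)
    return match.group(1) if match else None


def mismatched_hosts(reference_host, outputs):
    """Return {host: version} for hosts that differ from the reference host.

    `outputs` maps a host name to the text that ompi_info printed there.
    """
    reference = parse_ompi_version(outputs[reference_host])
    return {
        host: parse_ompi_version(text)
        for host, text in outputs.items()
        if parse_ompi_version(text) != reference
    }


if __name__ == "__main__":
    # Hypothetical captured outputs; the host names are placeholders.
    sample = {
        "local": "                Open MPI: 1.7a1r24472\n",
        "remote": "                Open MPI: 1.7a1r22794\n",
    }
    print(mismatched_hosts("local", sample))
```

An empty result means every host reported the same version as the reference. A host whose version comes back as None printed no recognizable version line, which usually means ompi_info was not found or not run there.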

As for r22794 - we don't keep anything that old on our web site. If you want to build it, the best way to get the code is to do a subversion checkout of the developer's trunk at that revision level:

svn co -r 22794 http://svn.open-mpi.org/svn/ompi/trunk

Remember to run autogen before configure.

On Mar 4, 2011, at 4:43 AM, Federico Golfrè Andreasi wrote:

>
> Hi Ralph,
>
> I'm getting stuck with spawning.
>
> I've downloaded the trunk snapshot from March 1st (openmpi-1.7a1r24472.tar.bz2).
> I'm testing with a small program that does the following:
> - the master program starts and each rank prints its hostname
> - the master program spawns a slave program of the same size
> - each rank of the slave (spawned) program prints its hostname
> - end
> It does not always complete the run; I see two different behaviours:
> 1. not all the slaves print their hostname and the program ends suddenly
> 2. both programs end correctly but the orted daemon is still alive and I need to press Ctrl-C to exit
>
>
> I've tried to recompile my test program with a previous snapshot (openmpi-1.7a1r22794.tar.bz2),
> for which I have only the compiled version of OpenMPI (on another machine).
> It gives me an error before starting (attached).
> Searching the FAQ I found some tips, and I verified that the program is compiled with the correct OpenMPI version
> and that the LD_LIBRARY_PATH is consistent.
> So I would like to recompile openmpi-1.7a1r22794.tar.bz2, but where can I find it?
>
>
> Thank you,
> Federico
>
> On 23 February 2011 at 03:43, Ralph Castain <rhc.openmpi_at_[hidden]> wrote:
> Apparently not. I will investigate when I return from vacation next week.
>
>
> Sent from my iPad
>
> On Feb 22, 2011, at 12:42 AM, Federico Golfrè Andreasi <federico.golfre_at_[hidden]> wrote:
>
>> Hi Ralph,
>>
>> I've tested spawning with the OpenMPI 1.5 release but that fix is not there.
>> Are you sure you've added it?
>>
>> Thank you,
>> Federico
>>
>>
>>
>> 2010/10/19 Ralph Castain <rhc_at_[hidden]>
>> The fix should be there - just didn't get mentioned.
>>
>> Let me know if it isn't and I'll ensure it is in the next one...but I'd be very surprised if it isn't already in there.
>>
>>
>> On Oct 19, 2010, at 3:03 AM, Federico Golfrè Andreasi wrote:
>>
>>> Hi Ralph!
>>>
>>> I saw that the new 1.5 release is out.
>>> I didn't find this fix in the "list of changes". Is it present but not mentioned because it is a minor fix?
>>>
>>> Thank you,
>>> Federico
>>>
>>>
>>>
>>> 2010/4/1 Ralph Castain <rhc_at_[hidden]>
>>> Hi there!
>>>
>>> It will be in the 1.5.0 release, but not 1.4.2 (couldn't backport the fix). I understand that will come out sometime soon, but no firm date has been set.
>>>
>>>
>>> On Apr 1, 2010, at 4:05 AM, Federico Golfrè Andreasi wrote:
>>>
>>>> Hi Ralph,
>>>>
>>>>
>>>> I've downloaded and tested the openmpi-1.7a1r22817 snapshot,
>>>> and it works fine for (multiple) spawns of more than 128 processes.
>>>>
>>>> That fix will be included in the next release of OpenMPI, right?
>>>> Do you know when it will be released? Or where I can find that info?
>>>>
>>>> Thank you,
>>>> Federico
>>>>
>>>>
>>>>
>>>> 2010/3/1 Ralph Castain <rhc_at_[hidden]>
>>>> http://www.open-mpi.org/nightly/trunk/
>>>>
>>>> I'm not sure this patch will solve your problem, but it is worth a try.
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>
>>
>>
>
> <OpenMPI.error>