Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Number of processes and spawn
From: Federico Golfrè Andreasi (federico.golfre_at_[hidden])
Date: 2011-03-29 06:32:42


Hi Ralph,

Sorry to bother you again.
I've downloaded the new beta version, *OpenMPI-1.5.3*,
and it seems to me that the bug fix for spawn_multiple with more than 128
CPUs is not there.
Am I correct?

Thanks,
Federico.

2011/3/14 Ralph Castain <rhc_at_[hidden]>

> You can try running it as suggested here:
>
> http://www.open-mpi.org/faq/?category=running#mpirun-prefix
>
> On Mar 14, 2011, at 12:21 PM, Federico Golfrè Andreasi wrote:
>
>
> Thank you Jeff,
>
> My fault :( I didn't notice that a link to that file was also mentioned on
> the website page.
> I was able to build revision 24472 from the trunk.
> But when I try to run my program I still receive the error that Ralph told
> me is due to a version mismatch.
>
> How can I check which OpenMPI version my program is running in the remote
> shell?
>
> I execute my programs using the command
> /home/fandreasi/openmpi-trunk/bin/mpiexec -hostfile ./hostfile -n 12 ./my_bin
> and in my .cshrc I have the line:
> setenv LD_LIBRARY_PATH /home/fandreasi/openmpi-trunk/lib:/home/fandreasi/openmpi-trunk/lib/openmpi
>
> Thank you again!
> Federico
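
One way to approach the question above (which Open MPI installation a remote shell actually picks up) is to inspect the library search path that the remote shell sees, since LD_LIBRARY_PATH is not forwarded by mpiexec. The following is only a minimal sketch in POSIX sh and is not taken from this thread: the helper name first_dir_with is hypothetical, and it only looks at a colon-separated path list, which is just one of the places the dynamic linker searches.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the first directory in a
# colon-separated search path that contains a file whose name starts with
# the given prefix, e.g. "libmpi".
# Usage: first_dir_with PATHLIST PREFIX
# Returns 0 and prints the directory on success, returns 1 if nothing matches.
first_dir_with() {
    pathlist=$1
    prefix=$2
    old_ifs=$IFS
    IFS=:
    for dir in $pathlist; do
        # Word splitting already happened, so restore IFS right away.
        IFS=$old_ifs
        # Skip empty entries such as a trailing ":".
        [ -n "$dir" ] || continue
        for f in "$dir"/"$prefix"*; do
            # If the glob matched nothing, $f is the literal pattern and
            # the existence test fails.
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}
```

As a usage idea, running `ssh node01 printenv LD_LIBRARY_PATH` (node01 is a placeholder host name) shows the value a non-interactive remote shell receives. Passing that string to `first_dir_with "<value>" libmpi` on a machine with the same file system layout then reports which directory would supply the library, which can be compared with /home/fandreasi/openmpi-trunk/lib.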
>
>
>
>
>
> 2011/3/10 Jeff Squyres <jsquyres_at_[hidden]>
>
>> This usually means you didn't install the GNU Autotools properly.
>>
>> Check the HACKING file in the top-level directory for specific
>> instructions on how to install the Autotools.
>>
>>
>> On Mar 10, 2011, at 7:50 AM, Federico Golfrè Andreasi wrote:
>>
>> >
>> > Hi Ralph,
>> >
>> > I did a checkout of revision 22794 with svn.
>> > I've downloaded and installed (with the default configuration) in my /home folder:
>> > - m4 version 1.4.16
>> > - autoconf version 2.68
>> > - automake version 1.11
>> > - libtool version 2.2.6b
>> > I've modified my .cshrc to set the following:
>> > setenv PATH /home/fandreasi/m4-1.4.16/bin:/home/fandreasi/autoconf-2.68/bin:/home/fandreasi/automake-1.11/bin:/home/fandreasi/libtool-2.2.6b/bin:$PATH
>> > setenv LD_LIBRARY_PATH /home/fandreasi/libtool-2.2.6b/lib
>> >
>> > When I run autogen it returns the error I've attached.
>> > Can you help me with this?
>> >
>> > Thank you,
>> > Federico.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On 5 March 2011 at 19:05, Ralph Castain <rhc_at_[hidden]> wrote:
>> > Hi Federico
>> >
>> > I tested the trunk today and it works fine for me - I let it spin for
>> > 1000 cycles without issue. My test program is essentially identical to
>> > what you describe - you can see it in the orte/test/mpi directory. The
>> > "master" is loop_spawn.c, and the "slave" is loop_child.c. I only tested
>> > it on a single machine, though - will have to test multi-machine later.
>> > You might see if that makes a difference.
>> >
>> > The error you report in your attachment is a classic symptom of
>> > mismatched versions. Remember, we don't forward your LD_LIBRARY_PATH, so
>> > it has to be correct on your remote machine.
>> >
>> > As for r22794 - we don't keep anything that old on our web site. If you
>> > want to build it, the best way to get the code is to do a subversion
>> > checkout of the developer's trunk at that revision level:
>> >
>> > svn co -r 22794 http://svn.open-mpi.org/svn/ompi/trunk
>> >
>> > Remember to run autogen before configure.
>> >
>> >
>> > On Mar 4, 2011, at 4:43 AM, Federico Golfrè Andreasi wrote:
>> >
>> >>
>> >> Hi Ralph,
>> >>
>> >> I'm getting stuck with spawning.
>> >>
>> >> I've downloaded the 1 March trunk snapshot (openmpi-1.7a1r24472.tar.bz2).
>> >> I'm testing with a small program that does the following:
>> >> - the master program starts and each rank prints its hostname
>> >> - the master program spawns a slave program of the same size
>> >> - each rank of the slave (spawned) program prints its hostname
>> >> - end
>> >> The run does not always complete; I see two different behaviours:
>> >> 1. not all the slaves print their hostname and the program ends abruptly
>> >> 2. both programs end correctly but the orted daemon is still alive and I
>> >> need to press Ctrl-C to exit
>> >>
>> >>
>> >> I've tried to recompile my test program with a previous snapshot
>> >> (openmpi-1.7a1r22794.tar.bz2), for which I only have the compiled version
>> >> of OpenMPI (on another machine).
>> >> It gives me an error before starting (attached).
>> >> Following some tips from the FAQ, I verified that the program is compiled
>> >> with the correct OpenMPI version and that LD_LIBRARY_PATH is consistent.
>> >> So I would like to recompile openmpi-1.7a1r22794.tar.bz2, but where can I
>> >> find it?
>> >>
>> >>
>> >> Thank you,
>> >> Federico
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 23 February 2011 at 03:43, Ralph Castain <rhc.openmpi_at_[hidden]> wrote:
>> >> Apparently not. I will investigate when I return from vacation next week.
>> >>
>> >>
>> >> Sent from my iPad
>> >>
>> >> On Feb 22, 2011, at 12:42 AM, Federico Golfrè Andreasi <federico.golfre_at_[hidden]> wrote:
>> >>
>> >>> Hi Ralph,
>> >>>
>> >>> I've tested spawning with the OpenMPI 1.5 release but that fix is not there.
>> >>> Are you sure you've added it?
>> >>>
>> >>> Thank you,
>> >>> Federico
>> >>>
>> >>>
>> >>>
>> >>> 2010/10/19 Ralph Castain <rhc_at_[hidden]>
>> >>> The fix should be there - just didn't get mentioned.
>> >>>
>> >>> Let me know if it isn't and I'll ensure it is in the next one...but
>> >>> I'd be very surprised if it isn't already in there.
>> >>>
>> >>>
>> >>> On Oct 19, 2010, at 3:03 AM, Federico Golfrè Andreasi wrote:
>> >>>
>> >>>> Hi Ralph!
>> >>>>
>> >>>> I saw that the new release 1.5 is out.
>> >>>> I didn't find this fix in the "list of changes". Is it present but
>> >>>> not mentioned because it is a minor fix?
>> >>>>
>> >>>> Thank you,
>> >>>> Federico
>> >>>>
>> >>>>
>> >>>>
>> >>>> 2010/4/1 Ralph Castain <rhc_at_[hidden]>
>> >>>> Hi there!
>> >>>>
>> >>>> It will be in the 1.5.0 release, but not 1.4.2 (couldn't backport the
>> >>>> fix). I understand that will come out sometime soon, but no firm date has
>> >>>> been set.
>> >>>>
>> >>>>
>> >>>> On Apr 1, 2010, at 4:05 AM, Federico Golfrè Andreasi wrote:
>> >>>>
>> >>>>> Hi Ralph,
>> >>>>>
>> >>>>>
>> >>>>> I've downloaded and tested the openmpi-1.7a1r22817 snapshot,
>> >>>>> and it works fine for (multiple) spawning of more than 128 processes.
>> >>>>>
>> >>>>> That fix will be included in the next release of OpenMPI, right?
>> >>>>> Do you know when it will be released, or where I can find that information?
>> >>>>>
>> >>>>> Thank you,
>> >>>>> Federico
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> 2010/3/1 Ralph Castain <rhc_at_[hidden]>
>> >>>>> http://www.open-mpi.org/nightly/trunk/
>> >>>>>
>> >>>>> I'm not sure this patch will solve your problem, but it is worth a try.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> users mailing list
>> >>>>> users_at_[hidden]
>> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>>>
>> >>>>
>> >>
>> >> <OpenMPI.error>
>> >
>> >
>> > <autogen.log>
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>