Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 1.3 hangs running 2 exes with different names
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-01-22 11:47:35


I can't replicate that behavior - it all seems to be working just
fine. I can launch apps with different names, and we correctly detect
and respond to missing executables, etc.

Can you provide more info as to how this was built? Also, be sure to
check that the remote hosts are using the same version of OMPI - hangs
are typically a good indicator that the remote node is picking up a
different OMPI version.

Ralph

On Jan 22, 2009, at 5:42 AM, Geoffroy Pignot wrote:

> Hello, is this still a bug?
>
> compil03% /tmp/openmpi-1.3/bin/mpirun -n 1 --wdir /tmp --host compil03 a.out : -n 1 --host compil02 a.out
> Hello world from process 0 of 2
> Hello world from process 1 of 2
>
> compil03% mv a.out a.out_32
> compil03% /tmp/openmpi-1.3/bin/mpirun -n 1 --wdir /tmp --host compil03 a.out_32 : -n 1 --host compil02 a.out
> HANGS
>
> Thanks in advance for your expertise
>
> Geoffroy
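For reference, a minimal MPI "hello world" of the kind that produces the
output quoted above might look like the following. The actual a.out source
was not posted, so this is only a plausible sketch.

  /* hello.c - minimal MPI "hello world"; sketch only, the actual a.out
   * source was not posted.
   * Build with, e.g.:  mpicc hello.c -o a.out
   */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      printf("Hello world from process %d of %d\n", rank, size);

      MPI_Finalize();
      return 0;
  }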
>
>
>
>
>
> 2009/1/22 <users-request_at_[hidden]>
>
>
> Today's Topics:
>
> 1. One additional (unwanted) process when using dynamical
> process management (Evgeniy Gromov)
> 2. Re: One additional (unwanted) process when using dynamical
> process management (Ralph Castain)
> 3. Re: One additional (unwanted) process when using dynamical
> process management (Evgeniy Gromov)
> 4. Re: One additional (unwanted) process when using dynamical
> process management (Ralph Castain)
> 5. Re: openmpi 1.3 and --wdir problem (Ralph Castain)
> 6. Re: Problem compiling open mpi 1.3 with sunstudio12 express
> (Jeff Squyres)
> 7. Handling output of processes (jody)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 21 Jan 2009 19:02:48 +0100
> From: Evgeniy Gromov <Evgeniy.Gromov_at_[hidden]>
> Subject: [OMPI users] One additional (unwanted) process when using
> dynamical process management
> To: users_at_[hidden]
> Message-ID: <49776348.8000900_at_[hidden]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Dear OpenMPI users,
>
> I have the following problem related to OpenMPI:
> I have recently compiled the new (4-1) Global Arrays package
> with OpenMPI using ARMCI_NETWORK=MPI-SPAWN, which implies the
> use of the dynamic process management introduced in MPI-2. It
> compiled and passed its tests successfully. However, when it
> spawns across different nodes (machines), one additional
> process appears on each node, i.e. if nodes=2:ppn=2 then each
> node has 3 running processes. When it runs on just one PC with
> a few cores (say nodes=1:ppn=4), the number of processes
> exactly equals the number of CPUs (ppn) requested and there is
> no additional process.
> I am wondering whether this is normal behavior. Thanks!
>
> Best regards,
> Evgeniy
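The MPI-2 dynamic process management that ARMCI_NETWORK=MPI-SPAWN relies on
centers on MPI_Comm_spawn. The sketch below is purely illustrative and is
not Global Arrays code; the "./worker" executable and the child count of 2
are hypothetical.

  /* spawn_parent.c - minimal illustration of MPI-2 dynamic process
   * management via MPI_Comm_spawn.  Not Global Arrays code; "./worker"
   * and the child count are hypothetical.
   * Build with, e.g.:  mpicc spawn_parent.c -o spawn_parent
   */
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      MPI_Comm intercomm;

      MPI_Init(&argc, &argv);

      /* Start 2 copies of ./worker; they form a separate group reached
       * through an intercommunicator. */
      MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                     0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);

      /* ... parent/child communication over intercomm ... */

      MPI_Comm_disconnect(&intercomm);
      MPI_Finalize();
      return 0;
  }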
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 21 Jan 2009 11:15:00 -0700
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] One additional (unwanted) process when using
> dynamical process management
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <4CCBD3F8-937F-4F8B-B953-F9CF9DD45EF5_at_[hidden]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> Not that I've seen. What version of OMPI are you using, and on what
> type of machine/environment?
>
>
> On Jan 21, 2009, at 11:02 AM, Evgeniy Gromov wrote:
>
> > Dear OpenMPI users,
> >
> > I have the following (problem) related to OpenMPI:
> > I have recently compiled with OPenMPI the new (4-1)
> > Global Arrays package using ARMCI_NETWORK=MPI-SPAWN,
> > which implies the use of dynamic process management
> > realised in MPI2. It got compiled and tested successfully.
> > However when it is spawning on different nodes (machine) one
> > additional process on each node appears, i.e. if nodes=2:ppn=2
> > then on each node there are 3 running processes. In the case
> > when it runs just on one pc with a few cores (let say
> nodes=1:ppn=4),
> > the number of processes exactly equals the number of cpus (ppn)
> > requested and there is no additional process.
> > I am wondering whether it is normal behavior. Thanks!
> >
> > Best regards,
> > Evgeniy
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 21 Jan 2009 19:30:27 +0100
> From: Evgeniy Gromov <Evgeniy.Gromov_at_[hidden]>
> Subject: Re: [OMPI users] One additional (unwanted) process when using
> dynamical process management
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <497769C3.8070201_at_[hidden]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Dear Ralph,
>
> Thanks for your reply.
> I encountered this problem using openmpi-1.2.5
> on an Opteron cluster with Myrinet-mx. I tried different
> compilers (gfortran, intel, pathscale) for compiling
> Global Arrays; the result is the same.
>
> As I mentioned in the previous message GA itself works
> fine, but the application which uses it doesn't work
> correctly if it runs on several nodes. If it runs on
> one node with several cores everything is fine. So I
> thought that the problem might be in this additional
> process.
>
> Should I try to use the latest 1.3 version of openmpi?
>
> Best,
> Evgeniy
>
> Ralph Castain wrote:
> > Not that I've seen. What version of OMPI are you using, and on
> what type
> > of machine/environment?
> >
> >
> > On Jan 21, 2009, at 11:02 AM, Evgeniy Gromov wrote:
> >
> >> Dear OpenMPI users,
> >>
> >> I have the following (problem) related to OpenMPI:
> >> I have recently compiled with OPenMPI the new (4-1)
> >> Global Arrays package using ARMCI_NETWORK=MPI-SPAWN,
> >> which implies the use of dynamic process management
> >> realised in MPI2. It got compiled and tested successfully.
> >> However when it is spawning on different nodes (machine) one
> >> additional process on each node appears, i.e. if nodes=2:ppn=2
> >> then on each node there are 3 running processes. In the case
> >> when it runs just on one pc with a few cores (let say
> nodes=1:ppn=4),
> >> the number of processes exactly equals the number of cpus (ppn)
> >> requested and there is no additional process.
> >> I am wondering whether it is normal behavior. Thanks!
> >>
> >> Best regards,
> >> Evgeniy
> >>
> >>
> >>
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
>
> --
> _______________________________________
> Dr. Evgeniy Gromov
> Theoretische Chemie
> Physikalisch-Chemisches Institut
> Im Neuenheimer Feld 229
> D-69120 Heidelberg
> Germany
>
> Telefon: +49/(0)6221/545263
> Fax: +49/(0)6221/545221
> E-mail: evgeniy_at_[hidden]
> _______________________________________
>
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 21 Jan 2009 11:38:48 -0700
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] One additional (unwanted) process when using
> dynamical process management
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <75C59577-D1EA-422B-A0B9-7F1C28E8D4CF_at_[hidden]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> If you can, 1.3 would certainly be a good step to take. I'm not sure
> why 1.2.5 would be behaving this way, though, so it may indeed be
> something in the application (perhaps in the info key being passed to
> us?) that is the root cause.
>
> Still, if it isn't too much trouble, moving to 1.3 will provide you
> with a better platform for dynamic process management regardless.
>
> Ralph
>
>
> On Jan 21, 2009, at 11:30 AM, Evgeniy Gromov wrote:
>
> > Dear Ralph,
> >
> > Thanks for your reply.
> > I encountered this problem using openmpi-1.2.5,
> > on a Opteron cluster with Myrinet-mx. I tried for
> > compilation of Global Arrays different compilers
> > (gfortran, intel, pathscale), the result is the same.
> >
> > As I mentioned in the previous message GA itself works
> > fine, but the application which uses it doesn't work
> > correctly if it runs on several nodes. If it runs on
> > one node with several cores everything is fine. So I
> > thought that the problem might be in this additional
> > process.
> >
> > Should I try to use the latest 1.3 version of openmpi?
> >
> > Best,
> > Evgeniy
> >
> > Ralph Castain wrote:
> >> Not that I've seen. What version of OMPI are you using, and on what
> >> type of machine/environment?
> >> On Jan 21, 2009, at 11:02 AM, Evgeniy Gromov wrote:
> >>> Dear OpenMPI users,
> >>>
> >>> I have the following (problem) related to OpenMPI:
> >>> I have recently compiled with OPenMPI the new (4-1)
> >>> Global Arrays package using ARMCI_NETWORK=MPI-SPAWN,
> >>> which implies the use of dynamic process management
> >>> realised in MPI2. It got compiled and tested successfully.
> >>> However when it is spawning on different nodes (machine) one
> >>> additional process on each node appears, i.e. if nodes=2:ppn=2
> >>> then on each node there are 3 running processes. In the case
> >>> when it runs just on one pc with a few cores (let say
> >>> nodes=1:ppn=4),
> >>> the number of processes exactly equals the number of cpus (ppn)
> >>> requested and there is no additional process.
> >>> I am wondering whether it is normal behavior. Thanks!
> >>>
> >>> Best regards,
> >>> Evgeniy
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> users mailing list
> >>> users_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > _______________________________________
> > Dr. Evgeniy Gromov
> > Theoretische Chemie
> > Physikalisch-Chemisches Institut
> > Im Neuenheimer Feld 229
> > D-69120 Heidelberg
> > Germany
> >
> > Telefon: +49/(0)6221/545263
> > Fax: +49/(0)6221/545221
> > E-mail: evgeniy_at_[hidden]
> > _______________________________________
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 21 Jan 2009 11:40:28 -0700
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] openmpi 1.3 and --wdir problem
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <B57EA438-1C8A-467C-B791-96EABE6031F4_at_[hidden]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> This is now fixed in the trunk and will be in the 1.3.1 release.
>
> Thanks again for the heads-up!
> Ralph
>
> On Jan 21, 2009, at 8:45 AM, Ralph Castain wrote:
>
> > You are correct - that is a bug in 1.3.0. I'm working on a fix for
> > it now and will report back.
> >
> > Thanks for catching it!
> > Ralph
> >
> >
> > On Jan 21, 2009, at 3:22 AM, Geoffroy Pignot wrote:
> >
> >> Hello
> >>
> >> I'm currently trying the new release but I can't reproduce the
> >> 1.2.8 behaviour concerning the --wdir option.
> >>
> >> With openmpi-1.2.8:
> >> %% /tmp/openmpi-1.2.8/bin/mpirun -n 1 --wdir /tmp --host r003n030 pwd : --wdir /scr1 -n 1 --host r003n031 pwd
> >> /scr1
> >> /tmp
> >>
> >> but with openmpi-1.3:
> >> %% /tmp/openmpi-1.3/bin/mpirun -n 1 --wdir /tmp --host r003n030 pwd : --wdir /scr1 -n 1 --host r003n031 pwd
> >> /scr1
> >> /scr1
> >> Thanks in advance
> >> Regards
> >> Geoffroy
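As a companion to the pwd test above, the per-context working directory can
also be checked from inside the program itself. A hypothetical sketch, not
part of the original report:

  /* wdir_check.c - each rank prints its own working directory, which
   * should match the --wdir given for its app context.  Hypothetical
   * sketch, not part of the original report.
   */
  #include <mpi.h>
  #include <stdio.h>
  #include <unistd.h>
  #include <limits.h>

  int main(int argc, char **argv)
  {
      int rank;
      char cwd[PATH_MAX];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (getcwd(cwd, sizeof(cwd)) != NULL)
          printf("rank %d: cwd = %s\n", rank, cwd);

      MPI_Finalize();
      return 0;
  }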
> >>
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 21 Jan 2009 14:06:42 -0500
> From: Jeff Squyres <jsquyres_at_[hidden]>
> Subject: Re: [OMPI users] Problem compiling open mpi 1.3 with
> sunstudio12 express
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <36FCDF58-9138-46A9-A432-CDF2A99A1CD7_at_[hidden]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> FWIW, I have run with my LD_LIBRARY_PATH set to a combination of
> multiple OMPI installations; it ended up using the leftmost entry in
> the LD_LIBRARY_PATH (as I intended). I'm not quite sure why it
> wouldn't do that for you. :-(
>
>
> On Jan 21, 2009, at 4:53 AM, Olivier Marsden wrote:
>
> >
> >>
> >> - Check that /opt/mpi_sun and /opt/mpi_gfortran* are actually
> >> distinct subdirectories; there's no hidden sym/hard links in there
> >> somewhere (where directories and/or individual files might
> >> accidentally be pointing to the other tree)
> >>
> >
> > no hidden links in the directories
> >
> >> - does "env | grep mpi_" show anything interesting / revealing?
> >> What is your LD_LIBRARY_PATH set to?
> >>
> > Nothing in env | grep mpi, and for the purposes of building,
> > LD_LIBRARY_PATH is set to
> > /opt/sun/express/sunstudioceres/lib/:/opt/mpi_sun/lib:xxx
> > where xxx is, among other things, the other mpi installations.
> > This led me to find a problem, which seems to be more related
> > to my linux configuration than openmpi:
> > I tried redefining ld_library_path to point just to sun, and
> > everything works correctly.
> > Putting my previous paths back into the variable leads to erroneous
> > behaviour, with ldd indicating that mpif90
> > is linked to libraries in the gfortran tree.
> > I thought that ld looked for libraries in folders in the order that
> > the folders are given in ld_library_path, and so
> > having mpi_sun as the first folder would suffice for its libraries
> > to be used; is that where I was wrong?
> > Sorry for the trouble, in any case redefining the ld_library_path to
> > remove all references to other installations works.
> > Looks like I'll have to swot up on my linker configuration
> knowledge!
> > Thanks a lot for your time,
> >
> > Olivier Marsden
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
>
> ------------------------------
>
> Message: 7
> Date: Thu, 22 Jan 2009 09:58:22 +0100
> From: jody <jody.xha_at_[hidden]>
> Subject: [OMPI users] Handling output of processes
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
> <9b0da5ce0901220058n6e534224i78a6daf6b0afc209_at_[hidden]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi
> I have a small cluster consisting of 9 computers (8x2 CPUs, 1x4 CPUs).
> I would like to be able to observe the output of the processes
> separately during an mpirun.
>
> What I currently do is run mpirun on a shell script that opens an
> xterm for each process, which then starts the actual application.
>
> This works, but is a bit complicated, e.g. finding the window you're
> interested in among 19 others.
>
> So I was wondering: is there a way to capture the processes' outputs
> separately, so that I can make an application in which I can switch
> between the outputs of the different processes?
> I imagine this could be done by wrapper applications which redirect
> the output over a TCP socket to a server application.
>
> But perhaps there is an easier way, or something like this already
> exists?
>
> Thank You
> Jody
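One simple variant of the wrapper idea described above is for each rank to
reopen its own stdout/stderr as a per-rank file right after MPI_Init, so the
streams can be watched separately (e.g. with tail -f) or picked up by a
viewer application. The sketch below assumes that approach; the file-name
pattern is made up, and a TCP socket could be substituted for the file.

  /* split_output.c - each rank redirects its own stdout/stderr to a
   * per-rank log file right after MPI_Init.  Illustrative sketch only;
   * the "output.rankNNN.log" naming is made up.
   */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank;
      char fname[64];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* Send this process's output to its own file from here on. */
      snprintf(fname, sizeof(fname), "output.rank%03d.log", rank);
      if (freopen(fname, "w", stdout) == NULL ||
          freopen(fname, "a", stderr) == NULL)
          MPI_Abort(MPI_COMM_WORLD, 1);

      printf("rank %d: output is now captured separately\n", rank);
      fflush(stdout);

      MPI_Finalize();
      return 0;
  }

Launching each rank under a small wrapper executable that performs the same
redirection before exec'ing the real application would achieve the same
effect without modifying the application itself.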
>
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1126, Issue 1
> **************************************
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users