
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] How do I compile OpenMPI in Xcode 3.1
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-05-05 17:33:17


I agree; that is a bummer. :-(

Warner -- do you have any advice here, perchance?

On May 4, 2009, at 7:26 PM, Vicente Puig wrote:

> But it doesn't work well.
>
> For example, I am trying to debug a program, "floyd" in this case,
> and when I set a breakpoint I get:
>
> No line 26 in file "../../../gcc-4.2-20060805/libgfortran/fmain.c".
>
> I am getting disappointed and frustrated that I cannot work well
> with Open MPI on my Mac. There should be a way to make it run in
> Xcode, uff...
>
> 2009/5/4 Jeff Squyres <jsquyres_at_[hidden]>
> I get those as well. I believe that they are (annoying but)
> harmless -- an artifact of how the freeware gcc/gfortran that I use
> was built.
>
>
>
> On May 4, 2009, at 1:47 PM, Vicente Puig wrote:
>
> Maybe I should have opened a new thread, but do you have any idea
> why I get this when I use gdb to debug an Open MPI program:
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/_umoddi3_s.o" - no debug information available
> for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".
>
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/_udiv_w_sdiv_s.o" - no debug information
> available for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".
>
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/_udivmoddi4_s.o" - no debug information available
> for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".
>
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/unwind-dw2_s.o" - no debug information available
> for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-dw2.c".
>
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/unwind-dw2-fde-darwin_s.o" - no debug information
> available for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-dw2-
> fde-darwin.c".
>
>
> warning: Could not find object file "/Users/admin/build/i386-apple-
> darwin9.0.0/libgcc/unwind-c_s.o" - no debug information available
> for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-c.c".
> .......
>
>
>
> There is no 'admin' user, so I don't know why it happens. It works
> well with a C program.
>
> Any idea?
>
> Thanks.
>
>
> Vincent
>
>
>
>
>
> 2009/5/4 Vicente Puig <vpuibor_at_[hidden]>
> I can run Open MPI perfectly from the command line, but I wanted a
> graphical interface for debugging because I was having problems.
>
> Thanks anyway.
>
> Vincent
>
> 2009/5/4 Warner Yuen <wyuen_at_[hidden]>
>
> Admittedly, I don't use Xcode to build Open MPI either.
>
> You can just compile Open MPI from the command line and install
> everything in /usr/local/. Make sure that gfortran is in your path,
> and you should just be able to do a './configure --prefix=/usr/local'.
>
> After the installation, just make sure that your path is set
> correctly when you go to use the newly installed Open MPI. If you
> don't set your path, it will always default to using the version of
> OpenMPI that ships with Leopard.
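For reference, a minimal sketch of that build-and-install flow from the shell; the tarball name and version below are placeholders, so substitute whatever release you actually downloaded:

    tar xzf openmpi-1.3.2.tar.gz      # placeholder version
    cd openmpi-1.3.2
    ./configure --prefix=/usr/local FC=gfortran F77=gfortran   # FC/F77 optional if gfortran is already first in PATH
    make all
    sudo make install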
>
>
> Warner Yuen
> Scientific Computing
> Consulting Engineer
> Apple, Inc.
> email: wyuen_at_[hidden]
> Tel: 408.718.2859
>
>
>
>
> On May 4, 2009, at 9:13 AM, users-request_at_[hidden] wrote:
>
>
>
> Today's Topics:
>
> 1. Re: How do I compile OpenMPI in Xcode 3.1 (Vicente Puig)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 May 2009 18:13:45 +0200
> From: Vicente Puig <vpuibor_at_[hidden]>
> Subject: Re: [OMPI users] How do I compile OpenMPI in Xcode 3.1
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
> <3e9a21680905040913u3f36d3c9rdcd3413bfdcd0c9_at_[hidden]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> If I cannot make it work with Xcode, which one could I use? Which
> one do you use to compile and debug Open MPI?
> Thanks
>
> Vincent
>
>
> 2009/5/4 Jeff Squyres <jsquyres_at_[hidden]>
>
> Open MPI comes pre-installed in Leopard; as Warner noted, since
> Leopard
> doesn't ship with a Fortran compiler, the Open MPI that Apple ships
> has
> non-functional mpif77 and mpif90 wrapper compilers.
>
> So the Open MPI that you installed manually will use your Fortran
> compilers, and therefore will have functional mpif77 and mpif90
> wrapper
> compilers. Hence, you probably need to be sure to use the "right"
> wrapper compilers. It looks like you specified the full path to
> ExecPath, so I'm not sure why Xcode wouldn't work with that (like I
> mentioned, I unfortunately don't use Xcode myself, so I don't know
> why that wouldn't work).
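One quick way to check which wrapper a build environment is actually picking up, and whether that wrapper has Fortran support, is to query the wrappers directly (the paths below are the ones mentioned in this thread; adjust as needed):

    which mpif90                      # which wrapper is first in PATH?
    /usr/local/bin/mpif90 --showme    # manually installed wrapper: should print the underlying gfortran command line
    /usr/bin/mpif90 --showme          # Apple-shipped wrapper: prints the "not compiled with Fortran 90 support" notice quoted below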
>
>
>
>
> On May 4, 2009, at 11:53 AM, Vicente wrote:
>
> Yes, I already have the gfortran compiler in /usr/local/bin, the same
> path as my mpif90 compiler. But I've seen that when I use the mpif90
> in /usr/bin and in /Developer/usr/bin, it says:
>
> "Unfortunately, this installation of Open MPI was not compiled with
> Fortran 90 support. As such, the mpif90 compiler is non-functional."
>
>
> That should be the problem; I will have to change the path to use the
> gfortran I have installed.
> How can I do that? (Sorry, I am a beginner.)
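A common way to do that is to prepend /usr/local/bin to PATH in the shell startup file; assuming the default bash shell, something like:

    echo 'export PATH=/usr/local/bin:$PATH' >> ~/.profile
    source ~/.profile
    which mpif90      # should now report /usr/local/bin/mpif90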
>
> Thanks.
>
>
> On 04/05/2009, at 17:38, Warner Yuen wrote:
>
> Have you installed a Fortran compiler? Mac OS X's developer tools do
> not come with a Fortran compiler, so you'll need to install one if
> you haven't already done so. I routinely use the Intel IFORT
> compilers with success. However, I hear many good things about the
> gfortran compilers on Mac OS X, and you can't beat the price of
> gfortran!
>
>
> Warner Yuen
> Scientific Computing
> Consulting Engineer
> Apple, Inc.
> email: wyuen_at_[hidden]
> Tel: 408.718.2859
>
>
>
>
> On May 4, 2009, at 7:28 AM, users-request_at_[hidden] wrote:
>
>
>
> Today's Topics:
>
> 1. How do I compile OpenMPI in Xcode 3.1 (Vicente)
> 2. Re: 1.3.1 -rf rankfile behaviour ?? (Ralph Castain)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 May 2009 16:12:44 +0200
> From: Vicente <vpuibor_at_[hidden]>
> Subject: [OMPI users] How do I compile OpenMPI in Xcode 3.1
> To: users_at_[hidden]
> Message-ID: <1C2C0085-940F-43BB-910F-975871AE2F09_at_[hidden]>
> Content-Type: text/plain; charset="windows-1252"; Format="flowed";
> DelSp="yes"
>
> Hi, I've seen the FAQ "How do I use Open MPI wrapper compilers in
> Xcode", but it's only for MPICC. I am using MPIF90, so I did the
> same but changed MPICC to MPIF90, and also the path, but it did not
> work.
>
> Building target "fortran" of project "fortran" with configuration
> "Debug"
>
>
> Checking Dependencies
> Invalid value 'MPIF90' for GCC_VERSION
>
>
> The file "MPIF90.cpcompspec" looks like this:
>
> /**
>     Xcode Compiler Specification for MPIF90
> */
>
> { Type = Compiler;
>    Identifier = com.apple.compilers.mpif90;
>    BasedOn = com.apple.compilers.gcc.4_0;
>    Name = "MPIF90";
>    Version = "Default";
>    Description = "MPI GNU C/C++ Compiler 4.0";
>    ExecPath = "/usr/local/bin/mpif90"; // This gets converted to the g++ variant automatically
>    PrecompStyle = pch;
> }
>
> and is located in "/Developer/Library/Xcode/Plug-ins"
>
> and when I run mpif90 -v in the terminal, it works fine:
>
> Using built-in specs.
> Target: i386-apple-darwin8.10.1
> Configured with: /tmp/gfortran-20090321/ibin/../gcc/configure --
> prefix=/usr/local/gfortran --enable-languages=c,fortran --with-gmp=/
> tmp/gfortran-20090321/gfortran_libs --enable-bootstrap
> Thread model: posix
> gcc version 4.4.0 20090321 (experimental) [trunk revision 144983]
> (GCC)
>
>
> Any idea?
>
> Thanks.
>
> Vincent
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> Message: 2
> Date: Mon, 4 May 2009 08:28:26 -0600
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
> <71d2d8cc0905040728h2002f4d7s4c49219eee29e86f_at_[hidden]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Unfortunately, I didn't write any of that code - I was just fixing
> the
> mapper so it would properly map the procs. From what I can tell,
> the proper
> things are happening there.
>
> I'll have to dig into the code that specifically deals with parsing
> the results to bind the processes. I'm afraid that will take a while
> longer -- it's pretty dark in that hole.
>
>
> On Mon, May 4, 2009 at 8:04 AM, Geoffroy Pignot
> <geopignot_at_[hidden]> wrote:
>
> Hi,
>
> So, there are no more crashes with my "crazy" mpirun command, but
> the paffinity feature seems to be broken: I am not able to pin my
> processes.
>
> Simple test with a program using your plpa library :
>
> r011n006% cat hostf
> r011n006 slots=4
>
> r011n006% cat rankf
> rank 0=r011n006 slot=0 ----> bind to CPU 0, correct?
>
> r011n006% /tmp/HALMPI/openmpi-1.4a/bin/mpirun --hostfile hostf --
> rankfile
> rankf --wdir /tmp -n 1 a.out
> PLPA Number of processors online: 4
> PLPA Number of processor sockets: 2
> PLPA Socket 0 (ID 0): 2 cores
> PLPA Socket 1 (ID 3): 2 cores
>
> Ctrl+Z
> r011n006%bg
>
> r011n006% ps axo stat,user,psr,pid,pcpu,comm | grep gpignot
> R+ gpignot 3 9271 97.8 a.out
>
> In fact, whatever slot number I put in my rankfile, a.out always
> runs on CPU 3. I was expecting it on CPU 0, according to my cpuinfo
> file (see below).
> The result is the same if I try the other syntax (rank 0=r011n006
> slot=0:0, i.e. bind to socket 0 - core 0, correct?)
>
> Thanks in advance
>
> Geoffroy
>
> PS: I run on rhel5
>
> r011n006% uname -a
> Linux r011n006 2.6.18-92.1.1NOMAP32.el5 #1 SMP Sat Mar 15 01:46:39
> CDT 2008
> x86_64 x86_64 x86_64 GNU/Linux
>
> My configure is :
> ./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/
> lib64'
> --disable-dlopen --disable-mpi-cxx --enable-heterogeneous
>
>
> r011n006% cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
> stepping : 6
> cpu MHz : 2660.007
> cache size : 4096 KB
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm
> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
> bogomips : 5323.68
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> processor : 1
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
> stepping : 6
> cpu MHz : 2660.007
> cache size : 4096 KB
> physical id : 3
> siblings : 2
> core id : 0
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm
> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
> bogomips : 5320.03
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> processor : 2
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
> stepping : 6
> cpu MHz : 2660.007
> cache size : 4096 KB
> physical id : 0
> siblings : 2
> core id : 1
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm
> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
> bogomips : 5319.39
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> processor : 3
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
> stepping : 6
> cpu MHz : 2660.007
> cache size : 4096 KB
> physical id : 3
> siblings : 2
> core id : 1
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm
> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
> bogomips : 5320.03
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
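For quick reference, the socket/core layout reported in that /proc/cpuinfo dump can be pulled out with a one-liner:

    grep -E 'processor|physical id|core id' /proc/cpuinfo

which for this machine summarizes to:

    processor 0 -> physical id 0, core id 0
    processor 1 -> physical id 3, core id 0
    processor 2 -> physical id 0, core id 1
    processor 3 -> physical id 3, core id 1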
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 4 May 2009 04:45:57 -0600
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <D01D7B16-4B47-46F3-AD41-D1A90B2E4927_at_[hidden]>
>
> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
> DelSp="yes"
>
> My apologies - I wasn't clear enough. You need a tarball from
> r21111
> or greater...such as:
>
> http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r21142.tar.gz
>
> HTH
> Ralph
>
>
> On May 4, 2009, at 2:14 AM, Geoffroy Pignot wrote:
>
> Hi ,
>
> I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my
> command doesn't work
>
> cat rankf:
> rank 0=node1 slot=*
> rank 1=node2 slot=*
>
> cat hostf:
> node1 slots=2
> node2 slots=2
>
> mpirun --rankfile rankf --hostfile hostf --host node1 -n 1
> hostname : --host node2 -n 1 hostname
>
> Error, invalid rank (1) in the rankfile (rankf)
>
>
>
> --------------------------------------------------------------------------
> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> rmaps_rank_file.c at line 403
> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/rmaps_base_map_job.c at line 86
> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/plm_base_launch_support.c at line 86
> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> plm_rsh_module.c at line 1016
>
>
> Ralph, could you tell me whether my command syntax is correct? If
> not, could you give me the expected one?
>
> Regards
>
> Geoffroy
>
>
>
>
> 2009/4/30 Geoffroy Pignot <geopignot_at_[hidden]>
> Immediately Sir !!! :)
>
> Thanks again Ralph
>
> Geoffroy
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 30 Apr 2009 06:45:39 -0600
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
> <71d2d8cc0904300545v61a42fe1k50086d2704d0f7e6_at_[hidden]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> I believe this is fixed now in our development trunk - you can
> download any
> tarball starting from last night and give it a try, if you like.
> Any
> feedback would be appreciated.
>
> Ralph
>
>
> On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:
>
> Ah now, I didn't say it -worked-, did I? :-)
>
> Clearly a bug exists in the program. I'll try to take a look at it
> (if Lenny
> doesn't get to it first), but it won't be until later in the week.
>
> On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:
>
> I agree with you, Ralph, and that's what I expect from Open MPI,
> but my second example shows that it's not working.
>
> cat hostfile.0
> r011n002 slots=4
> r011n003 slots=4
>
> cat rankfile.0
> rank 0=r011n002 slot=0
> rank 1=r011n003 slot=1
>
> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1
> hostname
> ### CRASHED
>
> Error, invalid rank (1) in the rankfile (rankfile.0)
>
>
>
>
> --------------------------------------------------------------------------
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> rmaps_rank_file.c at line 404
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/rmaps_base_map_job.c at line 87
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/plm_base_launch_support.c at line 77
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> plm_rsh_module.c at line 985
>
>
>
>
> --------------------------------------------------------------------------
> A daemon (pid unknown) died unexpectedly on signal 1 while
> attempting to
> launch so we are aborting.
>
> There may be more information reported by the environment (see
> above).
>
> This may be because the daemon was unable to find all the needed
> shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH
> to
> have the
> location of the shared libraries on the remote nodes and this
> will
> automatically be forwarded to the remote nodes.
>
>
>
>
> --------------------------------------------------------------------------
>
>
>
>
> --------------------------------------------------------------------------
> orterun noticed that the job aborted, but has no info as to the
> process
> that caused that situation.
>
>
>
>
> --------------------------------------------------------------------------
> orterun: clean termination accomplished
>
>
>
> Message: 4
> Date: Tue, 14 Apr 2009 06:55:58 -0600
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <F6290ADA-A196-43F0-A853-CBCB802D8D9C_at_[hidden]>
> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
> DelSp="yes"
>
> The rankfile cuts across the entire job - it isn't applied on an
> app_context basis. So the ranks in your rankfile must correspond
> to
> the eventual rank of each process in the cmd line.
>
> Unfortunately, that means you have to count ranks. In your case,
> you
> only have four, so that makes life easier. Your rankfile would
> look
> something like this:
>
> rank 0=r001n001 slot=0
> rank 1=r001n002 slot=1
> rank 2=r001n001 slot=1
> rank 3=r001n002 slot=2
>
> HTH
> Ralph
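Putting that rankfile together with the four-process command quoted below, the whole job would presumably be launched as a single mpirun with one -rf option covering all app contexts; a sketch, with executable names and options taken from the original message:

    mpirun -rf rankfile \
        -n 1 -host r001n001 master.x options1 : \
        -n 1 -host r001n002 master.x options2 : \
        -n 1 -host r001n001 slave.x options3 : \
        -n 1 -host r001n002 slave.x options4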
>
> On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:
>
> Hi,
>
> I agree that my examples are not very clear. What I want to do is
> to launch a multi-executable application (masters-slaves) and
> benefit from processor affinity.
> Could you show me how to convert this command using the -rf option
> (whatever the affinity is)?
>
> mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host
> r001n002
> master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -
> host r001n002 slave.x options4
>
> Thanks for your help
>
> Geoffroy
>
>
>
>
>
> Message: 2
> Date: Sun, 12 Apr 2009 18:26:35 +0300
> From: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
>
> <453d39990904120826t2e1d1d33l7bb1fe3de65b5361_at_[hidden]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
>
> The first "crash" is OK, since your rankfile has ranks 0 and 1
> defined,
> while n=1, which means only rank 0 is present and can be
> allocated.
>
> NP must be >= the largest rank in rankfile.
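For example, with a rankfile defining ranks 0 and 1 (as in the rankfile.0 quoted further down), the total process count must cover rank 1 as well:

    mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname   # both ranks exist: OK
    mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname   # only rank 0 exists: would fail with "invalid rank (1)"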
>
> What exactly are you trying to do ?
>
> I tried to recreate your segv, but all I got was
>
> ~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile
> hostfile.0
> -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
> [witch19:30798] mca: base: component_find: paffinity
> "mca_paffinity_linux"
> uses an MCA interface that is not recognized (component MCA
> v1.0.0 !=
> supported MCA v2.0.0) -- ignored
>
>
>
> --------------------------------------------------------------------------
> It looks like opal_init failed for some reason; your parallel
> process is
> likely to abort. There are many reasons that a parallel process
> can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal
> failure;
> here's some additional information (which may only be relevant
> to an
> Open MPI developer):
>
> opal_carto_base_select failed
> --> Returned value -13 instead of OPAL_SUCCESS
>
>
>
> --------------------------------------------------------------------------
> [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
> file
> ../../orte/runtime/orte_init.c at line 78
> [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
> file
> ../../orte/orted/orted_main.c at line 344
>
>
>
> --------------------------------------------------------------------------
> A daemon (pid 11629) died unexpectedly with status 243 while
> attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see
> above).
>
> This may be because the daemon was unable to find all the needed
> shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to
> have the
> location of the shared libraries on the remote nodes and this
> will
> automatically be forwarded to the remote nodes.
>
>
>
> --------------------------------------------------------------------------
>
>
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the
> process
> that caused that situation.
>
>
>
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
>
>
> Lenny.
>
>
> On 4/10/09, Geoffroy Pignot <geopignot_at_[hidden]> wrote:
>
> Hi ,
>
> I am currently testing the process affinity capabilities of
> Open MPI, and I would like to know whether the rankfile behaviour I
> describe below is normal or not.
>
> cat hostfile.0
> r011n002 slots=4
> r011n003 slots=4
>
> cat rankfile.0
> rank 0=r011n002 slot=0
> rank 1=r011n003 slot=1
>
>
>
>
>
>
> ##################################################################################
>
> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname ###
> OK
> r011n002
> r011n003
>
>
>
>
>
>
> ##################################################################################
> but
> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1
> hostname
> ### CRASHED
> *
>
>
>
>
> --------------------------------------------------------------------------
> Error, invalid rank (1) in the rankfile (rankfile.0)
>
>
>
>
> --------------------------------------------------------------------------
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> rmaps_rank_file.c at line 404
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/rmaps_base_map_job.c at line 87
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> base/plm_base_launch_support.c at line 77
> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in
> file
> plm_rsh_module.c at line 985
>
>
>
>
> --------------------------------------------------------------------------
> A daemon (pid unknown) died unexpectedly on signal 1 while
> attempting to
> launch so we are aborting.
>
> There may be more information reported by the environment (see
> above).
>
> This may be because the daemon was unable to find all the needed
> shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH
> to
> have the
> location of the shared libraries on the remote nodes and this
> will
> automatically be forwarded to the remote nodes.
>
>
>
>
> --------------------------------------------------------------------------
>
>
>
>
> --------------------------------------------------------------------------
> orterun noticed that the job aborted, but has no info as to the
> process
> that caused that situation.
>
>
>
>
> --------------------------------------------------------------------------
> orterun: clean termination accomplished
> *
> It seems that the rankfile option is not propagated to the second
> command line; there is no global understanding of the ranking inside
> an mpirun command.
>
>
>
>
>
>
> ##################################################################################
>
> Assuming that, I tried to provide a rankfile to each command
> line:
>
> cat rankfile.0
> rank 0=r011n002 slot=0
>
> cat rankfile.1
> rank 0=r011n003 slot=1
>
> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf
> rankfile.1
> -n 1 hostname ### CRASHED
> *[r011n002:28778] *** Process received signal ***
> [r011n002:28778] Signal: Segmentation fault (11)
> [r011n002:28778] Signal code: Address not mapped (1)
> [r011n002:28778] Failing at address: 0x34
> [r011n002:28778] [ 0] [0xffffe600]
> [r011n002:28778] [ 1]
> /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.
> 0(orte_odls_base_default_get_add_procs_data+0x55d)
> [0x5557decd]
> [r011n002:28778] [ 2]
> /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.
> 0(orte_plm_base_launch_apps+0x117)
> [0x555842a7]
> [r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/
> mca_plm_rsh.so
> [0x556098c0]
> [r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
> [0x804aa27]
> [r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
> [0x804a022]
> [r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc)
> [0x9f1dec]
> [r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun
> [0x8049f71]
> [r011n002:28778] *** End of error message ***
> Segmentation fault (core dumped)*
>
>
>
> I hope that I've found a bug, because it would be very important
> for me to have this kind of capability:
> to launch a multi-executable mpirun command line and be able to bind
> my executables to sockets.
>
> Thanks in advance for your help
>
> Geoffroy
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1202, Issue 2
> **************************************
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1218, Issue 2
> **************************************
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1221, Issue 3
> **************************************
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1221, Issue 6
> **************************************
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -------------- next part --------------
> HTML attachment scrubbed and removed
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1221, Issue 12
> ***************************************
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
Jeff Squyres
Cisco Systems