Open MPI User's Mailing List Archives

Subject: [OMPI users] How do I compile OpenMPI in Xcode 3.1
From: Warner Yuen (wyuen_at_[hidden])
Date: 2009-05-04 11:38:37


Have you installed a Fortran compiler? Mac OS X's developer tools do
not come with a Fortran compiler, so you'll need to install one if you
haven't already done so. I routinely use the Intel IFORT compilers
with success. I also hear many good things about the gfortran
compilers on Mac OS X, and you can't beat the price of gfortran!
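
A quick way to confirm that a Fortran compiler is actually installed is to look for it on the PATH. The following is only a minimal sketch: the list of candidate names is an assumption (gfortran and ifort are the compilers mentioned above, mpif90 is the Open MPI wrapper from the original question), and your installation may use different names or locations.

```python
import shutil

# Candidate executables to look for. This list is an assumption for
# illustration; adjust it to match the compilers you expect to have.
CANDIDATES = ("gfortran", "ifort", "mpif90")


def find_compilers(names=CANDIDATES):
    """Return a dict mapping each name to its full path, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name:10s} {path or 'not found on PATH'}")
```

If mpif90 is found, running `mpif90 --showme` in a terminal prints the underlying compiler command line that the wrapper would invoke, which shows which Fortran compiler Open MPI was built against.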

Warner Yuen
Scientific Computing
Consulting Engineer
Apple, Inc.
email: wyuen_at_[hidden]
Tel: 408.718.2859

On May 4, 2009, at 7:28 AM, users-request_at_[hidden] wrote:

> Today's Topics:
>
> 1. How do I compile OpenMPI in Xcode 3.1 (Vicente)
> 2. Re: 1.3.1 -rf rankfile behaviour ?? (Ralph Castain)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 May 2009 16:12:44 +0200
> From: Vicente <vpuibor_at_[hidden]>
> Subject: [OMPI users] How do I compile OpenMPI in Xcode 3.1
> To: users_at_[hidden]
> Message-ID: <1C2C0085-940F-43BB-910F-975871AE2F09_at_[hidden]>
> Content-Type: text/plain; charset="windows-1252"; Format="flowed";
> DelSp="yes"
>
> Hi, I've seen the FAQ "How do I use Open MPI wrapper compilers in
> Xcode", but it's only for MPICC. I am using MPIF90, so I did the same,
> but changing MPICC for MPIF90, and also the path, but it did not work.
>
> Building target "fortran" of project "fortran" with configuration "Debug"
>
>
> Checking Dependencies
> Invalid value 'MPIF90' for GCC_VERSION
>
>
> The file "MPIF90.cpcompspec" looks like this:
>
> /**
>  Xcode Compiler Specification for MPIF90
>
> */
>
> { Type = Compiler;
>  Identifier = com.apple.compilers.mpif90;
>  BasedOn = com.apple.compilers.gcc.4_0;
>  Name = "MPIF90";
>  Version = "Default";
>  Description = "MPI GNU C/C++ Compiler 4.0";
>  ExecPath = "/usr/local/bin/mpif90"; // This gets converted to the g++ variant automatically
>  PrecompStyle = pch;
> }
>
> and is located in "/Developer/Library/Xcode/Plug-ins"
>
> and when I do mpif90 -v on terminal it works well:
>
> Using built-in specs.
> Target: i386-apple-darwin8.10.1
> Configured with: /tmp/gfortran-20090321/ibin/../gcc/configure --prefix=/usr/local/gfortran --enable-languages=c,fortran --with-gmp=/tmp/gfortran-20090321/gfortran_libs --enable-bootstrap
> Thread model: posix
> gcc version 4.4.0 20090321 (experimental) [trunk revision 144983] (GCC)
>
>
> Any idea??
>
> Thanks.
>
> Vincent
>
> ------------------------------
>
> Message: 2
> Date: Mon, 4 May 2009 08:28:26 -0600
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
> To: Open MPI Users <users_at_[hidden]>
> Message-ID:
> <71d2d8cc0905040728h2002f4d7s4c49219eee29e86f_at_[hidden]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Unfortunately, I didn't write any of that code - I was just fixing the
> mapper so it would properly map the procs. From what I can tell, the
> proper things are happening there.
>
> I'll have to dig into the code that specifically deals with parsing the
> results to bind the processes. Afraid that will take a while longer -
> pretty dark in that hole.
>
>
> On Mon, May 4, 2009 at 8:04 AM, Geoffroy Pignot
> <geopignot_at_[hidden]> wrote:
>
>> Hi,
>>
>> So, there are no more crashes with my "crazy" mpirun command. But the
>> paffinity feature seems to be broken. Indeed I am not able to pin my
>> processes.
>>
>> Simple test with a program using your plpa library :
>>
>> r011n006% cat hostf
>> r011n006 slots=4
>>
>> r011n006% cat rankf
>> rank 0=r011n006 slot=0 ----> bind to CPU 0 , exact ?
>>
>> r011n006% /tmp/HALMPI/openmpi-1.4a/bin/mpirun --hostfile hostf --rankfile rankf --wdir /tmp -n 1 a.out
>>>>> PLPA Number of processors online: 4
>>>>> PLPA Number of processor sockets: 2
>>>>> PLPA Socket 0 (ID 0): 2 cores
>>>>> PLPA Socket 1 (ID 3): 2 cores
>>
>> Ctrl+Z
>> r011n006%bg
>>
>> r011n006% ps axo stat,user,psr,pid,pcpu,comm | grep gpignot
>> R+ gpignot 3 9271 97.8 a.out
>>
>> In fact, whatever slot number I put in my rankfile, a.out always runs
>> on CPU 3. I expected it on CPU 0 according to my cpuinfo file
>> (see below).
>> The result is the same if I try another syntax (rank 0=r011n006 slot=0:0,
>> bind to socket 0 - core 0, exact?)
>>
>> Thanks in advance
>>
>> Geoffroy
>>
>> PS: I run on rhel5
>>
>> r011n006% uname -a
>> Linux r011n006 2.6.18-92.1.1NOMAP32.el5 #1 SMP Sat Mar 15 01:46:39 CDT 2008 x86_64 x86_64 x86_64 GNU/Linux
>>
>> My configure is :
>> ./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/lib64' --disable-dlopen --disable-mpi-cxx --enable-heterogeneous
>>
>>
>> r011n006% cat /proc/cpuinfo
>> processor : 0
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 0
>> siblings : 2
>> core id : 0
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5323.68
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 1
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 3
>> siblings : 2
>> core id : 0
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5320.03
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 2
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 0
>> siblings : 2
>> core id : 1
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5319.39
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 3
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 3
>> siblings : 2
>> core id : 1
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5320.03
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>>
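
The /proc/cpuinfo listing above ties each logical processor number to a socket ("physical id") and a core ("core id"), which is what one has to read off when checking a binding. As a rough sketch (the function name is hypothetical, and it assumes the usual "key : value" layout shown in the listing), the mapping can be extracted like this:

```python
import os


def parse_cpuinfo(text):
    """Map logical processor number -> (physical id, core id).

    Assumes the 'key : value' layout of Linux /proc/cpuinfo, with a
    'processor' line opening each block. Missing fields become None.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or malformed line
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
            fields[current] = {}
        elif key in ("physical id", "core id") and current is not None:
            fields[current][key] = int(value)
    return {
        cpu: (info.get("physical id"), info.get("core id"))
        for cpu, info in fields.items()
    }


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as handle:
            for cpu, (socket, core) in sorted(parse_cpuinfo(handle.read()).items()):
                print(f"processor {cpu}: socket {socket}, core {core}")
```

For the listing above this gives processor 0 on socket 0 core 0, processor 1 on socket 3 core 0, processor 2 on socket 0 core 1, and processor 3 on socket 3 core 1, so the "psr" value of 3 reported by ps corresponds to socket 3, core 1.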
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Mon, 4 May 2009 04:45:57 -0600
>>> From: Ralph Castain <rhc_at_[hidden]>
>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>> To: Open MPI Users <users_at_[hidden]>
>>> Message-ID: <D01D7B16-4B47-46F3-AD41-D1A90B2E4927_at_[hidden]>
>>>
>>> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
>>> DelSp="yes"
>>>
>>> My apologies - I wasn't clear enough. You need a tarball from r21111
>>> or greater...such as:
>>>
>>> http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r21142.tar.gz
>>>
>>> HTH
>>> Ralph
>>>
>>>
>>> On May 4, 2009, at 2:14 AM, Geoffroy Pignot wrote:
>>>
>>>> Hi ,
>>>>
>>>> I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my
>>>> command doesn't work
>>>>
>>>> cat rankf:
>>>> rank 0=node1 slot=*
>>>> rank 1=node2 slot=*
>>>>
>>>> cat hostf:
>>>> node1 slots=2
>>>> node2 slots=2
>>>>
>>>> mpirun --rankfile rankf --hostfile hostf --host node1 -n 1
>>>> hostname : --host node2 -n 1 hostname
>>>>
>>>> Error, invalid rank (1) in the rankfile (rankf)
>>>>
>>>>
>>> --------------------------------------------------------------------------
>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 403
>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 86
>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 86
>>>> [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 1016
>>>>
>>>>
>>>> Ralph, could you tell me whether my command syntax is correct? If
>>>> not, could you give me the expected one?
>>>>
>>>> Regards
>>>>
>>>> Geoffroy
>>>>
>>>>
>>>>
>>>>
>>>> 2009/4/30 Geoffroy Pignot <geopignot_at_[hidden]>
>>>> Immediately Sir !!! :)
>>>>
>>>> Thanks again Ralph
>>>>
>>>> Geoffroy
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 2
>>>> Date: Thu, 30 Apr 2009 06:45:39 -0600
>>>> From: Ralph Castain <rhc_at_[hidden]>
>>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Message-ID:
>>>> <71d2d8cc0904300545v61a42fe1k50086d2704d0f7e6_at_[hidden]>
>>>> Content-Type: text/plain; charset="iso-8859-1"
>>>>
>>>> I believe this is fixed now in our development trunk - you can download any
>>>> tarball starting from last night and give it a try, if you like. Any
>>>> feedback would be appreciated.
>>>>
>>>> Ralph
>>>>
>>>>
>>>> On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:
>>>>
>>>> Ah now, I didn't say it -worked-, did I? :-)
>>>>
>>>> Clearly a bug exists in the program. I'll try to take a look at it (if Lenny
>>>> doesn't get to it first), but it won't be until later in the week.
>>>>
>>>> On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:
>>>>
>>>> I agree with you, Ralph, and that's what I expect from Open MPI, but my
>>>> second example shows that it's not working
>>>>
>>>> cat hostfile.0
>>>> r011n002 slots=4
>>>> r011n003 slots=4
>>>>
>>>> cat rankfile.0
>>>> rank 0=r011n002 slot=0
>>>> rank 1=r011n003 slot=1
>>>>
>>>> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname
>>>> ### CRASHED
>>>>
>>>>>> Error, invalid rank (1) in the rankfile (rankfile.0)
>>>>>> --------------------------------------------------------------------------
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
>>>>>> --------------------------------------------------------------------------
>>>>>> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>>>>>> launch so we are aborting.
>>>>>>
>>>>>> There may be more information reported by the environment (see above).
>>>>>>
>>>>>> This may be because the daemon was unable to find all the needed shared
>>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>>>>> location of the shared libraries on the remote nodes and this will
>>>>>> automatically be forwarded to the remote nodes.
>>>>>> --------------------------------------------------------------------------
>>>>>> --------------------------------------------------------------------------
>>>>>> orterun noticed that the job aborted, but has no info as to the process
>>>>>> that caused that situation.
>>>>>> --------------------------------------------------------------------------
>>>>>> orterun: clean termination accomplished
>>>>
>>>>
>>>>
>>>> Message: 4
>>>> Date: Tue, 14 Apr 2009 06:55:58 -0600
>>>> From: Ralph Castain <rhc_at_[hidden]>
>>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Message-ID: <F6290ADA-A196-43F0-A853-CBCB802D8D9C_at_[hidden]>
>>>> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
>>>> DelSp="yes"
>>>>
>>>> The rankfile cuts across the entire job - it isn't applied on an
>>>> app_context basis. So the ranks in your rankfile must correspond to
>>>> the eventual rank of each process in the cmd line.
>>>>
>>>> Unfortunately, that means you have to count ranks. In your case, you
>>>> only have four, so that makes life easier. Your rankfile would look
>>>> something like this:
>>>>
>>>> rank 0=r001n001 slot=0
>>>> rank 1=r001n002 slot=1
>>>> rank 2=r001n001 slot=1
>>>> rank 3=r001n002 slot=2
>>>>
>>>> HTH
>>>> Ralph
>>>>
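
Counting ranks by hand gets tedious as the number of app contexts grows, so the bookkeeping described above can be scripted. The sketch below is not part of Open MPI; `build_rankfile` is a hypothetical helper, and it assumes that ranks are assigned consecutively in the order the app contexts appear on the mpirun command line, which is consistent with the four-rank example above.

```python
def build_rankfile(app_contexts):
    """Build rankfile text for a multi-executable mpirun command line.

    app_contexts: list of (host, slots) tuples in command-line order,
    where slots holds one slot specification per process launched by
    that app context. Ranks are numbered across the whole job
    (assumption: consecutive, in command-line order).
    """
    lines = []
    rank = 0
    for host, slots in app_contexts:
        for slot in slots:
            lines.append(f"rank {rank}={host} slot={slot}")
            rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Four app contexts with one process each, matching the
    # master.x / slave.x command line from the question.
    contexts = [
        ("r001n001", ["0"]),
        ("r001n002", ["1"]),
        ("r001n001", ["1"]),
        ("r001n002", ["2"]),
    ]
    print(build_rankfile(contexts), end="")
```

Run as a script, this prints the same four "rank N=host slot=S" lines as the rankfile shown above. Slot values are passed as strings so that forms such as "0:0" or "*" can be used unchanged.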
>>>> On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I agree that my examples are not very clear. What I want to do is to
>>>>> launch a multi-executable application (masters-slaves) and benefit from
>>>>> processor affinity.
>>>>> Could you show me how to convert this command to use the -rf option
>>>>> (whatever the affinity is)?
>>>>>
>>>>> mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002 master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -host r001n002 slave.x options4
>>>>>
>>>>> Thanks for your help
>>>>>
>>>>> Geoffroy
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Message: 2
>>>>> Date: Sun, 12 Apr 2009 18:26:35 +0300
>>>>> From: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>>>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>>>> To: Open MPI Users <users_at_[hidden]>
>>>>> Message-ID:
>>>>>
>>>>> <453d39990904120826t2e1d1d33l7bb1fe3de65b5361_at_[hidden]>
>>>>> Content-Type: text/plain; charset="iso-8859-1"
>>>>>
>>>>> Hi,
>>>>>
>>>>> The first "crash" is OK, since your rankfile has ranks 0 and 1 defined,
>>>>> while n=1, which means only rank 0 is present and can be allocated.
>>>>>
>>>>> NP must be greater than the largest rank in the rankfile (ranks start at 0).
>>>>>
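
Since ranks are numbered from 0, a job started with -n 1 only has rank 0, so any rankfile entry for rank 1 or higher cannot be placed. That condition can be checked before calling mpirun. This is just a sketch with hypothetical helper names; it assumes rankfile lines of the form "rank N=host slot=S" as used in this thread and takes the total process count as the sum of all -n values on the command line.

```python
import re

# Matches the start of a rankfile entry such as "rank 1=r011n003 slot=1".
RANK_LINE = re.compile(r"^\s*rank\s+(\d+)\s*=")


def ranks_in_rankfile(text):
    """Return the rank numbers listed in the rankfile text, in file order."""
    ranks = []
    for line in text.splitlines():
        match = RANK_LINE.match(line)
        if match:
            ranks.append(int(match.group(1)))
    return ranks


def ranks_out_of_range(text, np_total):
    """Return the rankfile ranks that do not exist when np_total
    processes are launched (valid ranks are 0 .. np_total - 1)."""
    return [rank for rank in ranks_in_rankfile(text) if rank >= np_total]


if __name__ == "__main__":
    rankfile = "rank 0=r011n002 slot=0\nrank 1=r011n003 slot=1\n"
    for np_total in (1, 2):
        bad = ranks_out_of_range(rankfile, np_total)
        print(f"-n {np_total}: out-of-range ranks = {bad}")
```

An empty result means every rank in the rankfile exists for that process count; it says nothing about whether the hosts or slots themselves are valid.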
>>>>> What exactly are you trying to do ?
>>>>>
>>>>> I tried to recreate your sequence but all I got was
>>>>>
>>>>> ~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
>>>>> [witch19:30798] mca: base: component_find: paffinity "mca_paffinity_linux"
>>>>> uses an MCA interface that is not recognized (component MCA v1.0.0 !=
>>>>> supported MCA v2.0.0) -- ignored
>>>>> --------------------------------------------------------------------------
>>>>> It looks like opal_init failed for some reason; your parallel process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during opal_init; some of which are due to configuration or
>>>>> environment problems. This failure appears to be an internal failure;
>>>>> here's some additional information (which may only be relevant to an
>>>>> Open MPI developer):
>>>>>
>>>>> opal_carto_base_select failed
>>>>> --> Returned value -13 instead of OPAL_SUCCESS
>>>>> --------------------------------------------------------------------------
>>>>> [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/runtime/orte_init.c at line 78
>>>>> [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/orted/orted_main.c at line 344
>>>>> --------------------------------------------------------------------------
>>>>> A daemon (pid 11629) died unexpectedly with status 243 while attempting
>>>>> to launch so we are aborting.
>>>>>
>>>>> There may be more information reported by the environment (see above).
>>>>>
>>>>> This may be because the daemon was unable to find all the needed shared
>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>>>> location of the shared libraries on the remote nodes and this will
>>>>> automatically be forwarded to the remote nodes.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> mpirun noticed that the job aborted, but has no info as to the process
>>>>> that caused that situation.
>>>>> --------------------------------------------------------------------------
>>>>> mpirun: clean termination accomplished
>>>>>
>>>>>
>>>>> Lenny.
>>>>>
>>>>>
>>>>> On 4/10/09, Geoffroy Pignot <geopignot_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi ,
>>>>>>
>>>>>> I am currently testing the process affinity capabilities of Open MPI and I
>>>>>> would like to know whether the rankfile behaviour I describe below is
>>>>>> normal or not.
>>>>>>
>>>>>> cat hostfile.0
>>>>>> r011n002 slots=4
>>>>>> r011n003 slots=4
>>>>>>
>>>>>> cat rankfile.0
>>>>>> rank 0=r011n002 slot=0
>>>>>> rank 1=r011n003 slot=1
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>> ##################################################################################
>>>>>>
>>>>>> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname ### OK
>>>>>> r011n002
>>>>>> r011n003
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>> ##################################################################################
>>>>>> but
>>>>>> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname
>>>>>> ### CRASHED
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> Error, invalid rank (1) in the rankfile (rankfile.0)
>>>>>> --------------------------------------------------------------------------
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
>>>>>> [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
>>>>>> --------------------------------------------------------------------------
>>>>>> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>>>>>> launch so we are aborting.
>>>>>>
>>>>>> There may be more information reported by the environment (see above).
>>>>>>
>>>>>> This may be because the daemon was unable to find all the needed shared
>>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>>>>> location of the shared libraries on the remote nodes and this will
>>>>>> automatically be forwarded to the remote nodes.
>>>>>> --------------------------------------------------------------------------
>>>>>> --------------------------------------------------------------------------
>>>>>> orterun noticed that the job aborted, but has no info as to the process
>>>>>> that caused that situation.
>>>>>> --------------------------------------------------------------------------
>>>>>> orterun: clean termination accomplished
>>>>>>
>>>>>> It seems that the rankfile option is not propagated to the second command
>>>>>> line; there is no global understanding of the ranking inside an mpirun
>>>>>> command.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>> ##################################################################################
>>>>>>
>>>>>> Assuming that, I tried to provide a rankfile for each command line:
>>>>>>
>>>>>> cat rankfile.0
>>>>>> rank 0=r011n002 slot=0
>>>>>>
>>>>>> cat rankfile.1
>>>>>> rank 0=r011n003 slot=1
>>>>>>
>>>>>> mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname ### CRASHED
>>>>>> [r011n002:28778] *** Process received signal ***
>>>>>> [r011n002:28778] Signal: Segmentation fault (11)
>>>>>> [r011n002:28778] Signal code: Address not mapped (1)
>>>>>> [r011n002:28778] Failing at address: 0x34
>>>>>> [r011n002:28778] [ 0] [0xffffe600]
>>>>>> [r011n002:28778] [ 1] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x55d) [0x5557decd]
>>>>>> [r011n002:28778] [ 2] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x117) [0x555842a7]
>>>>>> [r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/mca_plm_rsh.so [0x556098c0]
>>>>>> [r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804aa27]
>>>>>> [r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804a022]
>>>>>> [r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc) [0x9f1dec]
>>>>>> [r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x8049f71]
>>>>>> [r011n002:28778] *** End of error message ***
>>>>>> Segmentation fault (core dumped)
>>>>>>
>>>>>>
>>>>>>
>>>>>> I hope that I've found a bug, because it would be very important for me to
>>>>>> have this kind of capability:
>>>>>> launching a multi-executable mpirun command line and being able to bind my
>>>>>> executables and sockets together.
>>>>>>
>>>>>> Thanks in advance for your help
>>>>>>
>>>>>> Geoffroy
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 1221, Issue 6
> **************************************