Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-05-04 15:34:40


Hmmm...I'm afraid I can't replicate the problem. All seems to be working
just fine on the RHEL systems available to me. The procs indeed bind to the
specified processors in every case.

[rhc_at_odin ~/trunk]$ cat rankfile
rank 0=odin001 slot=0
rank 1=odin002 slot=1

[rhc_at_odin mpi]$ mpirun -rf ../../../rankfile -n 2 --leave-session-attached -mca paffinity_base_verbose 5 ./mpi_spin
[odin001.cs.indiana.edu:09297] paffinity slot assignment: slot_list == 0
[odin001.cs.indiana.edu:09297] paffinity slot assignment: rank 0 runs on cpu #0 (#0)
[odin002.cs.indiana.edu:13566] paffinity slot assignment: slot_list == 1
[odin002.cs.indiana.edu:13566] paffinity slot assignment: rank 1 runs on cpu #1 (#1)

Suspended
[rhc_at_odin mpi]$ ssh odin001
[rhc_at_odin001 ~]$ ps axo stat,user,psr,pid,pcpu,comm | grep rhc
S rhc 0 9296 0.0 orted
RLl rhc 0 9297 100 mpi_spin

[rhc_at_odin mpi]$ ssh odin002
[rhc_at_odin002 ~]$ ps axo stat,user,psr,pid,pcpu,comm | grep rhc
S rhc 0 13562 0.0 orted
RLl rhc 1 13566 102 mpi_spin
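
The manual check above can be scripted. What follows is only a minimal sketch, assuming the six-column `ps axo stat,user,psr,pid,pcpu,comm` layout shown in this message; the function name `check_binding` is made up for illustration. Keep in mind that `psr` only reports the processor a process was last scheduled on, so one sample is an indication of binding rather than proof.

```python
def check_binding(ps_output, command, expected_cpu):
    """Return (pid, psr, ok) for the first process named `command`, or None.

    Assumes each line has exactly the columns: stat user psr pid pcpu comm.
    """
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[5] == command:
            psr = int(fields[2])   # processor the process last ran on
            pid = int(fields[3])
            return pid, psr, psr == expected_cpu
    return None


# Sample taken from the odin002 output in this message.
sample = """S rhc 0 13562 0.0 orted
RLl rhc 1 13566 102 mpi_spin"""

print(check_binding(sample, "mpi_spin", 1))
```

Sampling the same process several times while it spins gives more confidence than a single reading.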

Not sure where to go from here...perhaps someone else can spot the problem?
Ralph

On Mon, May 4, 2009 at 8:28 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> Unfortunately, I didn't write any of that code - I was just fixing the
> mapper so it would properly map the procs. From what I can tell, the proper
> things are happening there.
>
> I'll have to dig into the code that specifically deals with parsing the
> results to bind the processes. Afraid that will take a while longer - pretty
> dark in that hole.
>
>
>
> On Mon, May 4, 2009 at 8:04 AM, Geoffroy Pignot <geopignot_at_[hidden]> wrote:
>
>> Hi,
>>
>> So, there are no more crashes with my "crazy" mpirun command. But the
>> paffinity feature seems to be broken. Indeed I am not able to pin my
>> processes.
>>
>> Simple test with a program using your PLPA library:
>>
>> r011n006% cat hostf
>> r011n006 slots=4
>>
>> r011n006% cat rankf
>> rank 0=r011n006 slot=0 ----> binds to CPU 0, correct?
>>
>> r011n006% /tmp/HALMPI/openmpi-1.4a/bin/mpirun --hostfile hostf --rankfile rankf --wdir /tmp -n 1 a.out
>> >>> PLPA Number of processors online: 4
>> >>> PLPA Number of processor sockets: 2
>> >>> PLPA Socket 0 (ID 0): 2 cores
>> >>> PLPA Socket 1 (ID 3): 2 cores
>>
>> Ctrl+Z
>> r011n006% bg
>>
>> r011n006% ps axo stat,user,psr,pid,pcpu,comm | grep gpignot
>> R+ gpignot 3 9271 97.8 a.out
>>
>> In fact, whatever slot number I put in my rankfile, a.out always runs on
>> CPU 3. I expected it on CPU 0 according to my cpuinfo file (see below).
>> The result is the same if I try another syntax (rank 0=r011n006 slot=0:0,
>> which should bind to socket 0, core 0, correct?).
>>
>> Thanks in advance
>>
>> Geoffroy
>>
>> PS: I am running on RHEL 5
>>
>> r011n006% uname -a
>> Linux r011n006 2.6.18-92.1.1NOMAP32.el5 #1 SMP Sat Mar 15 01:46:39 CDT
>> 2008 x86_64 x86_64 x86_64 GNU/Linux
>>
>> My configure line is:
>> ./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/lib64' --disable-dlopen --disable-mpi-cxx --enable-heterogeneous
>>
>>
>> r011n006% cat /proc/cpuinfo
>> processor : 0
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 0
>> siblings : 2
>> core id : 0
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
>> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5323.68
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 1
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 3
>> siblings : 2
>> core id : 0
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
>> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5320.03
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 2
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 0
>> siblings : 2
>> core id : 1
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
>> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5319.39
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>> processor : 3
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 15
>> model name : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
>> stepping : 6
>> cpu MHz : 2660.007
>> cache size : 4096 KB
>> physical id : 3
>> siblings : 2
>> core id : 1
>> cpu cores : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 10
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
>> constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> bogomips : 5320.03
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
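
To read the listing above at a glance, the logical processor numbers can be mapped to their socket (`physical id`) and core (`core id`). This is a minimal sketch, assuming the usual blank-line-separated stanza layout of `/proc/cpuinfo` on Linux; `cpu_topology` is a hypothetical helper name, and the sample is abbreviated from the values posted above.

```python
def cpu_topology(cpuinfo_text):
    """Map logical processor number -> (physical id, core id).

    Stanzas are assumed to be separated by blank lines; a missing
    'physical id' or 'core id' field is reported as -1.
    """
    topo = {}
    current = {}
    # The extra empty string flushes the final stanza.
    for line in cpuinfo_text.splitlines() + [""]:
        if ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        elif not line.strip() and "processor" in current:
            topo[int(current["processor"])] = (
                int(current.get("physical id", -1)),
                int(current.get("core id", -1)),
            )
            current = {}
    return topo


# Abbreviated from the cpuinfo listing in this message.
sample = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 3
core id : 0

processor : 2
physical id : 0
core id : 1

processor : 3
physical id : 3
core id : 1
"""

for cpu, (socket, core) in sorted(cpu_topology(sample).items()):
    print(f"processor {cpu}: socket {socket}, core {core}")
```

On the real file the sample string would be replaced by the contents of `/proc/cpuinfo`.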
>>
>>
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Mon, 4 May 2009 04:45:57 -0600
>>> From: Ralph Castain <rhc_at_[hidden]>
>>> Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>> To: Open MPI Users <users_at_[hidden]>
>>> Message-ID: <D01D7B16-4B47-46F3-AD41-D1A90B2E4927_at_[hidden]>
>>>
>>> Content-Type: text/plain; charset="us-ascii"; Format="flowed";
>>> DelSp="yes"
>>>
>>> My apologies - I wasn't clear enough. You need a tarball from r21111
>>> or greater...such as:
>>>
>>> http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r21142.tar.gz
>>>
>>> HTH
>>> Ralph
>>>
>>>
>>> On May 4, 2009, at 2:14 AM, Geoffroy Pignot wrote:
>>>
>>> > Hi ,
>>> >
>>> > I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my
>>> > command doesn't work
>>> >
>>> > cat rankf:
>>> > rank 0=node1 slot=*
>>> > rank 1=node2 slot=*
>>> >
>>> > cat hostf:
>>> > node1 slots=2
>>> > node2 slots=2
>>> >
>>> > mpirun --rankfile rankf --hostfile hostf --host node1 -n 1 hostname : --host node2 -n 1 hostname
>>> >
>>> > Error, invalid rank (1) in the rankfile (rankf)
>>> >
>>> > --------------------------------------------------------------------------
>>> > [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 403
>>> > [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 86
>>> > [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 86
>>> > [r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 1016
>>> >
>>> >
>>> > Ralph, could you tell me whether my command syntax is correct? If
>>> > not, could you give me the expected one?
>>> >
>>> > Regards
>>> >
>>> > Geoffroy
>>> >
>>> >
>>> >
>>> >
>>> > 2009/4/30 Geoffroy Pignot <geopignot_at_[hidden]>
>>> > Immediately Sir !!! :)
>>> >
>>> > Thanks again Ralph
>>> >
>>> > Geoffroy
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ------------------------------
>>> >
>>> > Message: 2
>>> > Date: Thu, 30 Apr 2009 06:45:39 -0600
>>> > From: Ralph Castain <rhc_at_[hidden]>
>>> > Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>> > To: Open MPI Users <users_at_[hidden]>
>>> > Message-ID:
>>> > <71d2d8cc0904300545v61a42fe1k50086d2704d0f7e6_at_[hidden]>
>>> > Content-Type: text/plain; charset="iso-8859-1"
>>> >
>>> > I believe this is fixed now in our development trunk - you can download
>>> > any tarball starting from last night and give it a try, if you like. Any
>>> > feedback would be appreciated.
>>> >
>>> > Ralph
>>> >
>>> >
>>> > On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:
>>> >
>>> > Ah now, I didn't say it -worked-, did I? :-)
>>> >
>>> > Clearly a bug exists in the program. I'll try to take a look at it (if
>>> > Lenny doesn't get to it first), but it won't be until later in the week.
>>> >
>>> > On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:
>>> >
>>> > I agree with you, Ralph, and that's what I expect from Open MPI, but my
>>> > second example shows that it's not working:
>>> >
>>> > cat hostfile.0
>>> > r011n002 slots=4
>>> > r011n003 slots=4
>>> >
>>> > cat rankfile.0
>>> > rank 0=r011n002 slot=0
>>> > rank 1=r011n003 slot=1
>>> >
>>> > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname ### CRASHED
>>> >
>>> > > > Error, invalid rank (1) in the rankfile (rankfile.0)
>>> > > > --------------------------------------------------------------------------
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
>>> > > > --------------------------------------------------------------------------
>>> > > > A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>>> > > > launch so we are aborting.
>>> > > >
>>> > > > There may be more information reported by the environment (see above).
>>> > > >
>>> > > > This may be because the daemon was unable to find all the needed shared
>>> > > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>> > > > location of the shared libraries on the remote nodes and this will
>>> > > > automatically be forwarded to the remote nodes.
>>> > > > --------------------------------------------------------------------------
>>> > > > --------------------------------------------------------------------------
>>> > > > orterun noticed that the job aborted, but has no info as to the process
>>> > > > that caused that situation.
>>> > > > --------------------------------------------------------------------------
>>> > > > orterun: clean termination accomplished
>>> >
>>> >
>>> >
>>> > Message: 4
>>> > Date: Tue, 14 Apr 2009 06:55:58 -0600
>>> > From: Ralph Castain <rhc_at_[hidden]>
>>> > Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>> > To: Open MPI Users <users_at_[hidden]>
>>> > Message-ID: <F6290ADA-A196-43F0-A853-CBCB802D8D9C_at_[hidden]>
>>> > Content-Type: text/plain; charset="us-ascii"; Format="flowed";
>>> > DelSp="yes"
>>> >
>>> > The rankfile cuts across the entire job - it isn't applied on an
>>> > app_context basis. So the ranks in your rankfile must correspond to
>>> > the eventual rank of each process in the cmd line.
>>> >
>>> > Unfortunately, that means you have to count ranks. In your case, you
>>> > only have four, so that makes life easier. Your rankfile would look
>>> > something like this:
>>> >
>>> > rank 0=r001n001 slot=0
>>> > rank 1=r001n002 slot=1
>>> > rank 2=r001n001 slot=1
>>> > rank 3=r001n002 slot=2
>>> >
>>> > HTH
>>> > Ralph
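
The rank counting described above can be automated. The following is only a sketch, assuming each app context on the command line is described by a (host, slot, process count) tuple in command-line order; `build_rankfile` is a hypothetical helper and not part of Open MPI. Ranks are numbered continuously across all app contexts, as the explanation above requires.

```python
def build_rankfile(app_contexts):
    """Build rankfile text from (host, slot, nprocs) tuples.

    Tuples must be listed in the same order as the app contexts on the
    mpirun command line. Simplification: every process of one context
    is given the same slot.
    """
    lines = []
    rank = 0
    for host, slot, nprocs in app_contexts:
        for _ in range(nprocs):
            lines.append(f"rank {rank}={host} slot={slot}")
            rank += 1  # ranks run across the whole job, not per context
    return "\n".join(lines)


# The four single-process app contexts from the command under discussion,
# with the slots chosen in the example rankfile above.
contexts = [
    ("r001n001", 0, 1),
    ("r001n002", 1, 1),
    ("r001n001", 1, 1),
    ("r001n002", 2, 1),
]
print(build_rankfile(contexts))
```

With these inputs the sketch reproduces the four-line rankfile given above.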
>>> >
>>> > On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > > I agree that my examples are not very clear. What I want to do is
>>> > > launch a multi-executable (master/slave) application and benefit from
>>> > > processor affinity.
>>> > > Could you show me how to convert this command to use the -rf option
>>> > > (whatever the affinity is)?
>>> > >
>>> > > mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002 master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -host r001n002 slave.x options4
>>> > >
>>> > > Thanks for your help
>>> > >
>>> > > Geoffroy
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > Message: 2
>>> > > Date: Sun, 12 Apr 2009 18:26:35 +0300
>>> > > From: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>>> > > Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
>>> > > To: Open MPI Users <users_at_[hidden]>
>>> > > Message-ID:
>>> > > <453d39990904120826t2e1d1d33l7bb1fe3de65b5361_at_[hidden]>
>>> > > Content-Type: text/plain; charset="iso-8859-1"
>>> > >
>>> > > Hi,
>>> > >
>>> > > The first "crash" is OK, since your rankfile has ranks 0 and 1 defined,
>>> > > while n=1, which means only rank 0 is present and can be allocated.
>>> > >
>>> > > NP must be greater than the largest rank in the rankfile (ranks start at 0).
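
Since ranks are numbered from 0, this check amounts to requiring that the total process count exceed the largest rank listed. Below is a minimal sketch of such a pre-flight check, assuming the simple `rank N=host slot=S` line format used in this thread; the helper names are hypothetical and this is not how mpirun itself validates the file.

```python
import re

# Matches the leading "rank N=" of a rankfile line.
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=")


def ranks_in_rankfile(text):
    """Return the list of rank numbers found in rankfile text."""
    ranks = []
    for line in text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            ranks.append(int(match.group(1)))
    return ranks


def rankfile_fits(text, total_np):
    """True if every listed rank is below the total number of processes."""
    ranks = ranks_in_rankfile(text)
    return bool(ranks) and max(ranks) < total_np


rankfile = """rank 0=r011n002 slot=0
rank 1=r011n003 slot=1"""

print(rankfile_fits(rankfile, 1))
print(rankfile_fits(rankfile, 2))
```

Here `total_np` is meant as the sum of the `-n` values over all app contexts on the command line.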
>>> > >
>>> > > What exactly are you trying to do ?
>>> > >
>>> > > I tried to recreate your segv, but all I got was
>>> > >
>>> > > ~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
>>> > > [witch19:30798] mca: base: component_find: paffinity "mca_paffinity_linux"
>>> > > uses an MCA interface that is not recognized (component MCA v1.0.0 !=
>>> > > supported MCA v2.0.0) -- ignored
>>> > > --------------------------------------------------------------------------
>>> > > It looks like opal_init failed for some reason; your parallel process is
>>> > > likely to abort. There are many reasons that a parallel process can
>>> > > fail during opal_init; some of which are due to configuration or
>>> > > environment problems. This failure appears to be an internal failure;
>>> > > here's some additional information (which may only be relevant to an
>>> > > Open MPI developer):
>>> > >
>>> > > opal_carto_base_select failed
>>> > > --> Returned value -13 instead of OPAL_SUCCESS
>>> > > --------------------------------------------------------------------------
>>> > > [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/runtime/orte_init.c at line 78
>>> > > [witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/orted/orted_main.c at line 344
>>> > > --------------------------------------------------------------------------
>>> > > A daemon (pid 11629) died unexpectedly with status 243 while attempting
>>> > > to launch so we are aborting.
>>> > >
>>> > > There may be more information reported by the environment (see above).
>>> > >
>>> > > This may be because the daemon was unable to find all the needed shared
>>> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>> > > location of the shared libraries on the remote nodes and this will
>>> > > automatically be forwarded to the remote nodes.
>>> > > --------------------------------------------------------------------------
>>> > > --------------------------------------------------------------------------
>>> > > mpirun noticed that the job aborted, but has no info as to the process
>>> > > that caused that situation.
>>> > > --------------------------------------------------------------------------
>>> > > mpirun: clean termination accomplished
>>> > >
>>> > >
>>> > > Lenny.
>>> > >
>>> > >
>>> > > On 4/10/09, Geoffroy Pignot <geopignot_at_[hidden]> wrote:
>>> > > >
>>> > > > Hi ,
>>> > > >
>>> > > > I am currently testing the process affinity capabilities of Open MPI and
>>> > > > would like to know whether the rankfile behaviour described below is
>>> > > > normal.
>>> > > >
>>> > > > cat hostfile.0
>>> > > > r011n002 slots=4
>>> > > > r011n003 slots=4
>>> > > >
>>> > > > cat rankfile.0
>>> > > > rank 0=r011n002 slot=0
>>> > > > rank 1=r011n003 slot=1
>>> > > >
>>> > > >
>>> > > > ##################################################################################
>>> > > >
>>> > > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname ### OK
>>> > > > r011n002
>>> > > > r011n003
>>> > > >
>>> > > > ##################################################################################
>>> > > > but
>>> > > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname ### CRASHED
>>> > > > --------------------------------------------------------------------------
>>> > > > Error, invalid rank (1) in the rankfile (rankfile.0)
>>> > > > --------------------------------------------------------------------------
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
>>> > > > [r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
>>> > > > --------------------------------------------------------------------------
>>> > > > A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>>> > > > launch so we are aborting.
>>> > > >
>>> > > > There may be more information reported by the environment (see above).
>>> > > >
>>> > > > This may be because the daemon was unable to find all the needed shared
>>> > > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>>> > > > location of the shared libraries on the remote nodes and this will
>>> > > > automatically be forwarded to the remote nodes.
>>> > > > --------------------------------------------------------------------------
>>> > > > --------------------------------------------------------------------------
>>> > > > orterun noticed that the job aborted, but has no info as to the process
>>> > > > that caused that situation.
>>> > > > --------------------------------------------------------------------------
>>> > > > orterun: clean termination accomplished
>>> > > > It seems that the rankfile option is not propagated to the second command
>>> > > > line; there is no global understanding of the ranking inside an mpirun
>>> > > > command.
>>> > > >
>>> > > > ##################################################################################
>>> > > >
>>> > > > Assuming that, I tried to provide a rankfile to each command line:
>>> > > >
>>> > > > cat rankfile.0
>>> > > > rank 0=r011n002 slot=0
>>> > > >
>>> > > > cat rankfile.1
>>> > > > rank 0=r011n003 slot=1
>>> > > >
>>> > > > mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname ### CRASHED
>>> > > > [r011n002:28778] *** Process received signal ***
>>> > > > [r011n002:28778] Signal: Segmentation fault (11)
>>> > > > [r011n002:28778] Signal code: Address not mapped (1)
>>> > > > [r011n002:28778] Failing at address: 0x34
>>> > > > [r011n002:28778] [ 0] [0xffffe600]
>>> > > > [r011n002:28778] [ 1] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x55d) [0x5557decd]
>>> > > > [r011n002:28778] [ 2] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x117) [0x555842a7]
>>> > > > [r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/mca_plm_rsh.so [0x556098c0]
>>> > > > [r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804aa27]
>>> > > > [r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804a022]
>>> > > > [r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc) [0x9f1dec]
>>> > > > [r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x8049f71]
>>> > > > [r011n002:28778] *** End of error message ***
>>> > > > Segmentation fault (core dumped)
>>> > > >
>>> > > >
>>> > > >
>>> > > > I hope that I've found a bug, because it would be very important for me to
>>> > > > have this kind of capability:
>>> > > > launching a multi-executable mpirun command line and being able to bind my
>>> > > > executables and sockets together.
>>> > > >
>>> > > > Thanks in advance for your help
>>> > > >
>>> > > > Geoffroy
>>> > > _______________________________________________
>>> > > users mailing list
>>> > > users_at_[hidden]
>>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >
>>>
>>>
>>
>>
>>
>
>