Open MPI User's Mailing List Archives

From: Hammad Siddiqi (hammad.siddiqi_at_[hidden])
Date: 2007-10-01 03:08:04


One more thing to add: to my knowledge, -mca mtl mx uses Ethernet and the IP
emulation of Myrinet. I want to use Myrinet (not its IP emulation)
and shared memory simultaneously.
Thanks.
Regards, Hammad
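
(For reference, the kind of invocation I am after would look something like
the following; I am assuming that forcing the ob1 PML makes Open MPI use the
BTLs, so mx and sm can be active at the same time:
/opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca pml ob1 -mca btl mx,sm,self -host "indus1,indus2,indus3,indus4" ./hello)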

Hammad Siddiqi wrote:
> Dear Tim,
>
> Your and Tim Mattox's suggestions yielded the following results:
>
> *1. /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl mx,sm,self -host
> "indus1,indus2,indus3,indus4" -mca btl_base_debug 1000 ./hello*
>
> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl mx,sm,self -host
> "indus1,indus2,indus3,indus4" -mca btl_base_debug 1000 ./hello
> [indus1:29331] select: initializing btl component mx
> [indus1:29331] select: init returned failure
> [indus1:29331] select: module mx unloaded
> [indus1:29331] select: initializing btl component sm
> [indus1:29331] select: init returned success
> [indus1:29331] select: initializing btl component self
> [indus1:29331] select: init returned success
> [indus3:13520] select: initializing btl component mx
> [indus3:13520] select: init returned failure
> [indus3:13520] select: module mx unloaded
> [indus3:13520] select: initializing btl component sm
> [indus3:13520] select: init returned success
> [indus3:13520] select: initializing btl component self
> [indus3:13520] select: init returned success
> [indus4:15486] select: initializing btl component mx
> [indus4:15486] select: init returned failure
> [indus4:15486] select: module mx unloaded
> [indus4:15486] select: initializing btl component sm
> [indus4:15486] select: init returned success
> [indus4:15486] select: initializing btl component self
> [indus4:15486] select: init returned success
> [indus2:11351] select: initializing btl component mx
> [indus2:11351] select: init returned failure
> [indus2:11351] select: module mx unloaded
> [indus2:11351] select: initializing btl component sm
> [indus2:11351] select: init returned success
> [indus2:11351] select: initializing btl component self
> [indus2:11351] select: init returned success
> --------------------------------------------------------------------------
> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> --------------------------------------------------------------------------
> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during
> MPI_INIT--------------------------------------------------------------------------
> Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
>
>
>
>
> *2.1 /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca mtl mx -host
> "indus1,indus2,indus3,indus4" ./hello*
>
> This command works fine.
>
> *2.2 /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca mtl mx -host
> "indus1,indus2,indus3,indus4" -mca pml cm ./hello*
>
> This command works fine.
> Also, *"/opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca pml cm -host
> "indus1,indus2,indus3,indus4" -mca mtl_base_debug 1000 ./hello"*
> works fine.
> But *"/opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca pml cm -host
> "indus1,indus2,indus3,indus4" -mca mtl_base_debug 1000 ./hello"*
> hangs indefinitely.
>
>
> Also, *"/opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca mtl mx,sm,self -host
> "indus1,indus2,indus3,indus4" -mca mtl_base_debug 1000 ./hello"*
> works fine.
>
> *2.3 /opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca mtl mx -host
> "indus1,indus2,indus3,indus4" -mca pml cm ./hello*
>
> This command hangs the machines indefinitely.
> Also, *"/opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca mtl mx -host
> "indus1,indus2,indus3,indus4" -mca pml cm -mca mtl_base_debug 1000
> ./hello"* hangs the
> systems indefinitely.
>
> *2.4 /opt/SUNWhpc/HPC7.0/bin/mpirun -np 8 -mca mtl mx,sm,self -host
> "indus1,indus2,indus3,indus4" -mca pml cm -mca mtl_base_debug 1000
> ./hello*
>
> This command hangs the machines indefinitely.
>
> Please note that running more than four MPI processes hangs the
> machines. Any suggestions, please?
>
> Thanks,
>
> Best Regards,
> Hammad Siddiqi
>
> Tim Prins wrote:
>> I would recommend trying a few things:
>>
>> 1. Set some debugging flags and see if that helps. So, I would try something
>> like:
>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,self -host "indus1,indus2" -mca btl_base_debug 1000 ./hello
>>
>> This will output information as each btl is loaded, and whether or not the
>> load succeeds.
>>
>> 2. Try running with the mx mtl instead of the btl:
>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" ./hello
>>
>> Similarly, for debug output:
>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca mtl mx -host "indus1,indus2" -mca
>> mtl_base_debug 1000 ./hello
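>>
>> It may also be worth checking what parameters the mx components report.
>> This is the standard ompi_info query; I am assuming it behaves the same in
>> the Sun build:
>> /opt/SUNWhpc/HPC7.0/bin/ompi_info --param btl mx
>> /opt/SUNWhpc/HPC7.0/bin/ompi_info --param mtl mx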
>>
>> Let me know if any of these work.
>>
>> Thanks,
>>
>> Tim
>>
>> On Saturday 29 September 2007 01:53:06 am Hammad Siddiqi wrote:
>>
>>> Hi Terry,
>>>
>>> Thanks for replying. The following command is working fine:
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl tcp,sm,self -machinefile
>>> machines ./hello
>>>
>>> The contents of machines are:
>>> indus1
>>> indus2
>>> indus3
>>> indus4
>>>
>>> I have tried using np=2 over pairs of machines, but the problem is the same.
>>> The errors that occur are given below, with the command that I am running.
>>>
>>> *Test 1*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus1,indus2" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *Test 2*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus1,indus3" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *Test 3*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus1,indus4" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *Test 4*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus2,indus4" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *Test 5*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus2,indus3" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *Test 6*
>>>
>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host
>>> "indus3,indus4" ./hello
>>> --------------------------------------------------------------------------
>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> --------------------------------------------------------------------------
>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>> If you specified the use of a BTL component, you may have
>>> forgotten a component (such as "self") in the list of
>>> usable components.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> PML add procs failed
>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** before MPI was initialized
>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> *END OF TESTS*
>>>
>>> There is one thing to note: when I include -mca pml cm in this
>>> command, it works fine :S
>>>
>>> mpirun -np 4 -mca btl mx,sm,self -mca pml cm -machinefile machines ./hello
>>> Hello MPI! Process 4 of 1 on indus2
>>> Hello MPI! Process 4 of 2 on indus3
>>> Hello MPI! Process 4 of 3 on indus4
>>> Hello MPI! Process 4 of 0 on indus1
>>>
>>> To my knowledge, this command is not using shared memory and is only
>>> using Myrinet as the interconnect.
>>> One more thing: I cannot start more than four processes in this case; the
>>> mpirun process hangs.
>>>
>>> Any suggestions?
>>>
>>> Once again, thanks for your help.
>>>
>>> Regards,
>>> Hammad
>>>
>>> Terry Dontje wrote:
>>>
>>>> Hi Hammad,
>>>>
>>>> It looks to me like none of the BTLs could resolve a route between the
>>>> node that process rank 0 is on and the other nodes.
>>>> I would suggest trying np=2 over a couple of pairs of machines to see if
>>>> that works, so you can truly be sure that only the
>>>> first node is having this problem.
>>>>
>>>> It also might be helpful as a sanity check to use the tcp btl instead of
>>>> mx and see if you get more traction with that.
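>>>>
>>>> For example, a minimal sanity check along these lines (the host pair is
>>>> just an illustration):
>>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl tcp,sm,self -host "indus1,indus2" ./hello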
>>>>
>>>> --td
>>>>
>>>>
>>>>> *From:* Hammad Siddiqi (hammad.siddiqi_at_[hidden])
>>>>> *Date:* 2007-09-28 07:38:01
>>>>>
>>>>> Hello,
>>>>>
>>>>> I am using Sun HPC Toolkit 7.0 to compile and run my C MPI programs.
>>>>>
>>>>> I have tested the Myrinet installation using Myricom's own test
>>>>> programs. The Myricom software stack I am using is MX, version
>>>>> mx2g-1.1.7; mx_mapper is also used.
>>>>> We have four nodes with eight dual-core processors each (Sun Fire V890), and
>>>>> the operating system is
>>>>> Solaris 10 (SunOS indus1 5.10 Generic_125100-10 sun4u sparc
>>>>> SUNW,Sun-Fire-V890).
>>>>>
>>>>> The contents of machine file are:
>>>>> indus1
>>>>> indus2
>>>>> indus3
>>>>> indus4
>>>>>
>>>>> The output of *mx_info* on each node is given below:
>>>>>
>>>>> ======
>>>>> *indus1*
>>>>> ======
>>>>>
>>>>> MX Version: 1.1.7rc3cvs1_1_fixes
>>>>> MX Build: @indus4:/opt/mx2g-1.1.7rc3 Thu May 31 11:36:59 PKT 2007
>>>>> 2 Myrinet boards installed.
>>>>> The MX driver is configured to support up to 4 instances and 1024 nodes.
>>>>> ===================================================================
>>>>> Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:ad:7c
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 297218
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:ad:7c indus1:0 1,1
>>>>> 2) 00:60:dd:47:ad:68 indus4:0 8,3
>>>>> 3) 00:60:dd:47:b3:e8 indus4:1 7,3
>>>>> 4) 00:60:dd:47:b3:ab indus2:0 7,3
>>>>> 5) 00:60:dd:47:ad:66 indus3:0 8,3
>>>>> 6) 00:60:dd:47:ad:76 indus3:1 8,3
>>>>> 7) 00:60:dd:47:ad:77 jhelum1:0 8,3
>>>>> 8) 00:60:dd:47:b3:5a ravi2:0 8,3
>>>>> 9) 00:60:dd:47:ad:5f ravi2:1 1,1
>>>>> 10) 00:60:dd:47:b3:bf ravi1:0 8,3
>>>>> ===================================================================
>>>>>
>>>>> ======
>>>>> *indus2*
>>>>> ======
>>>>>
>>>>> MX Version: 1.1.7rc3cvs1_1_fixes
>>>>> MX Build: @indus2:/opt/mx2g-1.1.7rc3 Thu May 31 11:24:03 PKT 2007
>>>>> 2 Myrinet boards installed.
>>>>> The MX driver is configured to support up to 4 instances and 1024 nodes.
>>>>> ===================================================================
>>>>> Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:b3:ab
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 296636
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:b3:ab indus2:0 1,1
>>>>> 2) 00:60:dd:47:ad:68 indus4:0 1,1
>>>>> 3) 00:60:dd:47:b3:e8 indus4:1 8,3
>>>>> 4) 00:60:dd:47:ad:66 indus3:0 1,1
>>>>> 5) 00:60:dd:47:ad:76 indus3:1 7,3
>>>>> 6) 00:60:dd:47:ad:77 jhelum1:0 7,3
>>>>> 8) 00:60:dd:47:ad:7c indus1:0 8,3
>>>>> 9) 00:60:dd:47:b3:5a ravi2:0 8,3
>>>>> 10) 00:60:dd:47:ad:5f ravi2:1 8,3
>>>>> 11) 00:60:dd:47:b3:bf ravi1:0 7,3
>>>>> ===================================================================
>>>>> Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link down
>>>>> MAC Address: 00:60:dd:47:b3:c3
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 296612
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ======
>>>>> *indus3*
>>>>> ======
>>>>> MX Version: 1.1.7rc3cvs1_1_fixes
>>>>> MX Build: @indus3:/opt/mx2g-1.1.7rc3 Thu May 31 11:29:03 PKT 2007
>>>>> 2 Myrinet boards installed.
>>>>> The MX driver is configured to support up to 4 instances and 1024 nodes.
>>>>> ===================================================================
>>>>> Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:ad:66
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 297240
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:ad:66 indus3:0 1,1
>>>>> 1) 00:60:dd:47:ad:76 indus3:1 8,3
>>>>> 2) 00:60:dd:47:ad:68 indus4:0 1,1
>>>>> 3) 00:60:dd:47:b3:e8 indus4:1 6,3
>>>>> 4) 00:60:dd:47:ad:77 jhelum1:0 8,3
>>>>> 5) 00:60:dd:47:b3:ab indus2:0 1,1
>>>>> 7) 00:60:dd:47:ad:7c indus1:0 8,3
>>>>> 8) 00:60:dd:47:b3:5a ravi2:0 8,3
>>>>> 9) 00:60:dd:47:ad:5f ravi2:1 7,3
>>>>> 10) 00:60:dd:47:b3:bf ravi1:0 8,3
>>>>> ===================================================================
>>>>> Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:ad:76
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 297224
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:ad:66 indus3:0 8,3
>>>>> 1) 00:60:dd:47:ad:76 indus3:1 1,1
>>>>> 2) 00:60:dd:47:ad:68 indus4:0 7,3
>>>>> 3) 00:60:dd:47:b3:e8 indus4:1 1,1
>>>>> 4) 00:60:dd:47:ad:77 jhelum1:0 1,1
>>>>> 5) 00:60:dd:47:b3:ab indus2:0 7,3
>>>>> 7) 00:60:dd:47:ad:7c indus1:0 8,3
>>>>> 8) 00:60:dd:47:b3:5a ravi2:0 6,3
>>>>> 9) 00:60:dd:47:ad:5f ravi2:1 8,3
>>>>> 10) 00:60:dd:47:b3:bf ravi1:0 8,3
>>>>>
>>>>> ======
>>>>> *indus4*
>>>>> ======
>>>>>
>>>>> MX Version: 1.1.7rc3cvs1_1_fixes
>>>>> MX Build: @indus4:/opt/mx2g-1.1.7rc3 Thu May 31 11:36:59 PKT 2007
>>>>> 2 Myrinet boards installed.
>>>>> The MX driver is configured to support up to 4 instances and 1024 nodes.
>>>>> ===================================================================
>>>>> Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:ad:68
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 297238
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:ad:68 indus4:0 1,1
>>>>> 1) 00:60:dd:47:b3:e8 indus4:1 7,3
>>>>> 2) 00:60:dd:47:ad:77 jhelum1:0 7,3
>>>>> 3) 00:60:dd:47:ad:66 indus3:0 1,1
>>>>> 4) 00:60:dd:47:ad:76 indus3:1 7,3
>>>>> 5) 00:60:dd:47:b3:ab indus2:0 1,1
>>>>> 7) 00:60:dd:47:ad:7c indus1:0 7,3
>>>>> 8) 00:60:dd:47:b3:5a ravi2:0 7,3
>>>>> 9) 00:60:dd:47:ad:5f ravi2:1 8,3
>>>>> 10) 00:60:dd:47:b3:bf ravi1:0 7,3
>>>>> ===================================================================
>>>>> Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
>>>>> Status: Running, P0: Link up
>>>>> MAC Address: 00:60:dd:47:b3:e8
>>>>> Product code: M3F-PCIXF-2
>>>>> Part number: 09-03392
>>>>> Serial number: 296575
>>>>> Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
>>>>> Mapped hosts: 10
>>>>>
>>>>> ROUTE COUNT
>>>>> INDEX MAC ADDRESS HOST NAME P0
>>>>> ----- ----------- --------- ---
>>>>> 0) 00:60:dd:47:ad:68 indus4:0 6,3
>>>>> 1) 00:60:dd:47:b3:e8 indus4:1 1,1
>>>>> 2) 00:60:dd:47:ad:77 jhelum1:0 1,1
>>>>> 3) 00:60:dd:47:ad:66 indus3:0 8,3
>>>>> 4) 00:60:dd:47:ad:76 indus3:1 1,1
>>>>> 5) 00:60:dd:47:b3:ab indus2:0 8,3
>>>>> 7) 00:60:dd:47:ad:7c indus1:0 7,3
>>>>> 8) 00:60:dd:47:b3:5a ravi2:0 6,3
>>>>> 9) 00:60:dd:47:ad:5f ravi2:1 8,3
>>>>> 10) 00:60:dd:47:b3:bf ravi1:0 8,3
>>>>>
>>>>> The output from *ompi_info* is:
>>>>>
>>>>> Open MPI: 1.2.1r14096-ct7b030r1838
>>>>> Open MPI SVN revision: 0
>>>>> Open RTE: 1.2.1r14096-ct7b030r1838
>>>>> Open RTE SVN revision: 0
>>>>> OPAL: 1.2.1r14096-ct7b030r1838
>>>>> OPAL SVN revision: 0
>>>>> Prefix: /opt/SUNWhpc/HPC7.0
>>>>> Configured architecture: sparc-sun-solaris2.10
>>>>> Configured by: root
>>>>> Configured on: Fri Mar 30 12:49:36 EDT 2007
>>>>> Configure host: burpen-on10-0
>>>>> Built by: root
>>>>> Built on: Fri Mar 30 13:10:46 EDT 2007
>>>>> Built host: burpen-on10-0
>>>>> C bindings: yes
>>>>> C++ bindings: yes
>>>>> Fortran77 bindings: yes (all)
>>>>> Fortran90 bindings: yes
>>>>> Fortran90 bindings size: trivial
>>>>> C compiler: cc
>>>>> C compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/cc
>>>>> C++ compiler: CC
>>>>> C++ compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/CC
>>>>> Fortran77 compiler: f77
>>>>> Fortran77 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f77
>>>>> Fortran90 compiler: f95
>>>>> Fortran90 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f95
>>>>> C profiling: yes
>>>>> C++ profiling: yes
>>>>> Fortran77 profiling: yes
>>>>> Fortran90 profiling: yes
>>>>> C++ exceptions: yes
>>>>> Thread support: no
>>>>> Internal debug support: no
>>>>> MPI parameter check: runtime
>>>>> Memory profiling support: no
>>>>> Memory debugging support: no
>>>>> libltdl support: yes
>>>>> Heterogeneous support: yes
>>>>> mpirun default --prefix: yes
>>>>> MCA backtrace: printstack (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA paffinity: solaris (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA timer: solaris (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>>>>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>>>>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA io: romio (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA mpool: udapl (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA btl: mx (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>>> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>>> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>>> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
>>>>> MCA btl: udapl (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA mtl: mx (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>>> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.1)
>>>>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>>>>> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>>> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.1)
>>>>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.1)
>>>>> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>
>>>>> When I try to run a simple hello world program by issuing the following
>>>>> command:
>>>>>
>>>>> *mpirun -np 4 -mca btl mx,sm,self -machinefile machines ./hello*
>>>>>
>>>>> The following error appears:
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>>>>> If you specified the use of a BTL component, you may have
>>>>> forgotten a component (such as "self") in the list of
>>>>> usable components.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>>> problems. This failure appears to be an internal failure; here's some
>>>>> additional information (which may only be relevant to an Open MPI
>>>>> developer):
>>>>>
>>>>> PML add procs failed
>>>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>>>> --------------------------------------------------------------------------
>>>>> *** An error occurred in MPI_Init
>>>>> *** before MPI was initialized
>>>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>> --------------------------------------------------------------------------
>>>>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>>>>> If you specified the use of a BTL component, you may have
>>>>> forgotten a component (such as "self") in the list of
>>>>> usable components.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>>> problems. This failure appears to be an internal failure; here's some
>>>>> additional information (which may only be relevant to an Open MPI
>>>>> developer):
>>>>>
>>>>> PML add procs failed
>>>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>>>> --------------------------------------------------------------------------
>>>>> *** An error occurred in MPI_Init
>>>>> *** before MPI was initialized
>>>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>> --------------------------------------------------------------------------
>>>>> Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
>>>>> If you specified the use of a BTL component, you may have
>>>>> forgotten a component (such as "self") in the list of
>>>>> usable components.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>>> problems. This failure appears to be an internal failure; here's some
>>>>> additional information (which may only be relevant to an Open MPI
>>>>> developer):
>>>>>
>>>>> PML add procs failed
>>>>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
>>>>> If you specified the use of a BTL component, you may have
>>>>> forgotten a component (such as "self") in the list of
>>>>> usable components.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>>> problems. This failure appears to be an internal failure; here's some
>>>>> additional information (which may only be relevant to an Open MPI
>>>>> developer):
>>>>>
>>>>> PML add procs failed
>>>>> --> Returned "Unreachable" (-*** An error occurred in MPI_Init
>>>>> *** before MPI was initialized
>>>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>> 12) instead of "Success" (0)
>>>>> --------------------------------------------------------------------------
>>>>> *** An error occurred in MPI_Init
>>>>> *** before MPI was initialized
>>>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>
>>>>> The output of *more /var/run/fms/fma.log* is:
>>>>>
>>>>> Sat Sep 22 10:47:50 2007 NIC 0: M3F-PCIXF-2 s/n=297218 1 ports, speed=2G
>>>>> Sat Sep 22 10:47:50 2007 mac = 00:60:dd:47:ad:7c
>>>>> Sat Sep 22 10:47:50 2007 NIC 1: M3F-PCIXF-2 s/n=297248 1 ports, speed=2G
>>>>> Sat Sep 22 10:47:50 2007 mac = 00:60:dd:47:ad:5e
>>>>> Sat Sep 22 10:47:50 2007 fms-1.2.1 fma starting
>>>>> Sat Sep 22 10:47:50 2007 Mapper was 00:00:00:00:00:00, l=0, is now
>>>>> 00:60:dd:47:ad:7c, l=1
>>>>> Sat Sep 22 10:47:50 2007 Mapping fabric...
>>>>> Sat Sep 22 10:47:54 2007 Mapper was 00:60:dd:47:ad:7c, l=1, is now
>>>>> 00:60:dd:47:b3:e8, l=1
>>>>> Sat Sep 22 10:47:54 2007 Cancelling mapping
>>>>> Sat Sep 22 10:47:59 2007 5 hosts, 8 nics, 6 xbars, 40 links
>>>>> Sat Sep 22 10:47:59 2007 map version is 1987557551
>>>>> Sat Sep 22 10:47:59 2007 Found NIC 0 at index 3!
>>>>> Sat Sep 22 10:47:59 2007 Found NIC 1 at index 2!
>>>>> Sat Sep 22 10:47:59 2007 map seems OK
>>>>> Sat Sep 22 10:47:59 2007 Routing took 0 seconds
>>>>> Mon Sep 24 14:26:46 2007 Requesting remap from indus4
>>>>> (00:60:dd:47:b3:e8): scouted by 00:60:dd:47:b3:5a, lev=1, pkt_type=0
>>>>> Mon Sep 24 14:26:51 2007 6 hosts, 10 nics, 6 xbars, 42 links
>>>>> Mon Sep 24 14:26:51 2007 map version is 1987557552
>>>>> Mon Sep 24 14:26:51 2007 Found NIC 0 at index 3!
>>>>> Mon Sep 24 14:26:51 2007 Found NIC 1 at index 2!
>>>>> Mon Sep 24 14:26:51 2007 map seems OK
>>>>> Mon Sep 24 14:26:51 2007 Routing took 0 seconds
>>>>> Mon Sep 24 14:35:17 2007 Requesting remap from indus4
>>>>> (00:60:dd:47:b3:e8): scouted by 00:60:dd:47:b3:bf, lev=1, pkt_type=0
>>>>> Mon Sep 24 14:35:19 2007 7 hosts, 11 nics, 6 xbars, 43 links
>>>>> Mon Sep 24 14:35:19 2007 map version is 1987557553
>>>>> Mon Sep 24 14:35:19 2007 Found NIC 0 at index 5!
>>>>> Mon Sep 24 14:35:19 2007 Found NIC 1 at index 4!
>>>>> Mon Sep 24 14:35:19 2007 map seems OK
>>>>> Mon Sep 24 14:35:19 2007 Routing took 0 seconds
>>>>> Tue Sep 25 21:47:52 2007 6 hosts, 9 nics, 6 xbars, 41 links
>>>>> Tue Sep 25 21:47:52 2007 map version is 1987557554
>>>>> Tue Sep 25 21:47:52 2007 Found NIC 0 at index 3!
>>>>> Tue Sep 25 21:47:52 2007 Found NIC 1 at index 2!
>>>>> Tue Sep 25 21:47:52 2007 map seems OK
>>>>> Tue Sep 25 21:47:52 2007 Routing took 0 seconds
>>>>> Tue Sep 25 21:52:02 2007 Requesting remap from indus4
>>>>> (00:60:dd:47:b3:e8): empty port x0p15 is no longer empty
>>>>> Tue Sep 25 21:52:07 2007 6 hosts, 10 nics, 6 xbars, 42 links
>>>>> Tue Sep 25 21:52:07 2007 map version is 1987557555
>>>>> Tue Sep 25 21:52:07 2007 Found NIC 0 at index 4!
>>>>> Tue Sep 25 21:52:07 2007 Found NIC 1 at index 3!
>>>>> Tue Sep 25 21:52:07 2007 map seems OK
>>>>> Tue Sep 25 21:52:07 2007 Routing took 0 seconds
>>>>> Tue Sep 25 21:52:23 2007 7 hosts, 11 nics, 6 xbars, 43 links
>>>>> Tue Sep 25 21:52:23 2007 map version is 1987557556
>>>>> Tue Sep 25 21:52:23 2007 Found NIC 0 at index 6!
>>>>> Tue Sep 25 21:52:23 2007 Found NIC 1 at index 5!
>>>>> Tue Sep 25 21:52:23 2007 map seems OK
>>>>> Tue Sep 25 21:52:23 2007 Routing took 0 seconds
>>>>> Wed Sep 26 05:07:01 2007 Requesting remap from indus4
>>>>> (00:60:dd:47:b3:e8): verify failed x1p2, nic 0, port 0 route=-9 4 10
>>>>> reply=-10 -4 9 , remote=ravi2 NIC
>>>>> 1, p0 mac=00:60:dd:47:ad:5f
>>>>> Wed Sep 26 05:07:06 2007 6 hosts, 9 nics, 6 xbars, 41 links
>>>>> Wed Sep 26 05:07:06 2007 map version is 1987557557
>>>>> Wed Sep 26 05:07:06 2007 Found NIC 0 at index 3!
>>>>> Wed Sep 26 05:07:06 2007 Found NIC 1 at index 2!
>>>>> Wed Sep 26 05:07:06 2007 map seems OK
>>>>> Wed Sep 26 05:07:06 2007 Routing took 0 seconds
>>>>> Wed Sep 26 05:11:19 2007 7 hosts, 11 nics, 6 xbars, 43 links
>>>>> Wed Sep 26 05:11:19 2007 map version is 1987557558
>>>>> Wed Sep 26 05:11:19 2007 Found NIC 0 at index 3!
>>>>> Wed Sep 26 05:11:19 2007 Found NIC 1 at index 2!
>>>>> Wed Sep 26 05:11:19 2007 map seems OK
>>>>> Wed Sep 26 05:11:19 2007 Routing took 0 seconds
>>>>> Thu Sep 27 11:45:37 2007 6 hosts, 9 nics, 6 xbars, 41 links
>>>>> Thu Sep 27 11:45:37 2007 map version is 1987557559
>>>>> Thu Sep 27 11:45:37 2007 Found NIC 0 at index 6!
>>>>> Thu Sep 27 11:45:37 2007 Found NIC 1 at index 5!
>>>>> Thu Sep 27 11:45:37 2007 map seems OK
>>>>> Thu Sep 27 11:45:37 2007 Routing took 0 seconds
>>>>> Thu Sep 27 11:51:02 2007 7 hosts, 11 nics, 6 xbars, 43 links
>>>>> Thu Sep 27 11:51:02 2007 map version is 1987557560
>>>>> Thu Sep 27 11:51:02 2007 Found NIC 0 at index 6!
>>>>> Thu Sep 27 11:51:02 2007 Found NIC 1 at index 5!
>>>>> Thu Sep 27 11:51:02 2007 map seems OK
>>>>> Thu Sep 27 11:51:02 2007 Routing took 0 seconds
>>>>> Fri Sep 28 13:27:10 2007 Requesting remap from indus4
>>>>> (00:60:dd:47:b3:e8): verify failed x5p0, nic 1, port 0 route=-8 15 6
>>>>> reply=-6 -15 8 , remote=ravi1 NIC
>>>>> 0, p0 mac=00:60:dd:47:b3:bf
>>>>> Fri Sep 28 13:27:24 2007 6 hosts, 8 nics, 6 xbars, 40 links
>>>>> Fri Sep 28 13:27:24 2007 map version is 1987557561
>>>>> Fri Sep 28 13:27:24 2007 Found NIC 0 at index 5!
>>>>> Fri Sep 28 13:27:24 2007 Cannot find NIC 1 (00:60:dd:47:ad:5e) in map!
>>>>> Fri Sep 28 13:27:24 2007 map seems OK
>>>>> Fri Sep 28 13:27:24 2007 Routing took 0 seconds
>>>>> Fri Sep 28 13:27:44 2007 7 hosts, 10 nics, 6 xbars, 42 links
>>>>> Fri Sep 28 13:27:44 2007 map version is 1987557562
>>>>> Fri Sep 28 13:27:44 2007 Found NIC 0 at index 7!
>>>>> Fri Sep 28 13:27:44 2007 Cannot find NIC 1 (00:60:dd:47:ad:5e) in map!
>>>>> Fri Sep 28 13:27:44 2007 map seems OK
>>>>> Fri Sep 28 13:27:44 2007 Routing took 0 seconds
>>>>>
>>>>> Do you have any suggestions or comments on why this error appears, and
>>>>> what the solution to this problem might be? I have checked the community
>>>>> mailing list for this problem and found a few related topics, but could
>>>>> not find a solution. Any suggestions or comments will be highly appreciated.
>>>>>
>>>>> The code that I am trying to run is as follows:
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <string.h> /* for strcpy */
>>>>> #include "mpi.h"
>>>>>
>>>>> int main(int argc, char **argv)
>>>>> {
>>>>>     int rank, size, tag, rc, i;
>>>>>     MPI_Status status;
>>>>>     char message[20];
>>>>>
>>>>>     rc = MPI_Init(&argc, &argv);
>>>>>     rc = MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>>     rc = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>     tag = 100;
>>>>>
>>>>>     if (rank == 0) {
>>>>>         /* rank 0 sends the greeting to every other rank */
>>>>>         strcpy(message, "Hello, world");
>>>>>         for (i = 1; i < size; i++)
>>>>>             rc = MPI_Send(message, 13, MPI_CHAR, i, tag, MPI_COMM_WORLD);
>>>>>     } else {
>>>>>         /* every other rank receives it from rank 0 */
>>>>>         rc = MPI_Recv(message, 13, MPI_CHAR, 0, tag, MPI_COMM_WORLD,
>>>>>                       &status);
>>>>>     }
>>>>>
>>>>>     printf("node %d : %.13s\n", rank, message);
>>>>>     rc = MPI_Finalize();
>>>>>     return 0;
>>>>> }
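>>>>>
>>>>> I compile and launch it along these lines (assuming the ClusterTools
>>>>> compiler wrapper sits next to mpirun; the exact path may differ):
>>>>>
>>>>> /opt/SUNWhpc/HPC7.0/bin/mpicc hello.c -o hello
>>>>> /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl mx,sm,self -machinefile machines ./hello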
>>>>>
>>>>> Thanks.
>>>>> Looking forward to your suggestions.
>>>>> Best regards,
>>>>> Hammad Siddiqi
>>>>> Center for High Performance Scientific Computing
>>>>> NUST Institute of Information Technology,
>>>>> National University of Sciences and Technology,
>>>>> Rawalpindi, Pakistan.
>>>>>
