Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mca_oob_tcp_msg_recv: readv failed:Unknown error (10054)
From: hi (hiralsmaillist_at_[hidden])
Date: 2011-04-19 09:52:17


Hi Ralph,

The Fortran code snippet is as follows:

<<<
      ...
      write(*,*) "calling blacs_pinfo()..."
      CALL BLACS_PINFO(IAM, NPROCS)
      write(*,*) "after blacs_pinfo()..."
      write(*,*) "IAM=", IAM
      write(*,*) "NPROCS=", NPROCS
      ...
>>>

As you can see, the crash described below happens in the call to BLACS_PINFO().
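
To narrow this down further, a minimal MPI-only check may help show whether the access violation comes from BLACS or from the MPI installation itself. The following is only a sketch: the program name MPICHK is hypothetical and not part of the original example, and it uses only the standard MPI Fortran calls (MPI_INIT, MPI_COMM_RANK, MPI_COMM_SIZE, MPI_FINALIZE). It assumes that mpif.h from the same Open MPI installation is on the include path.

```fortran
      PROGRAM MPICHK
!     Hypothetical sanity check: standard MPI calls only, no BLACS.
      IMPLICIT NONE
      INCLUDE 'mpif.h'
      INTEGER IERR, RANK, NPROCS

!     Initialize MPI directly instead of going through BLACS_PINFO.
      CALL MPI_INIT(IERR)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, RANK, IERR)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, NPROCS, IERR)
      WRITE(*,*) 'rank=', RANK, ' nprocs=', NPROCS
!     Shut down cleanly so mpirun does not report an abnormal exit.
      CALL MPI_FINALIZE(IERR)
      END PROGRAM MPICHK
```

If this runs cleanly under mpirun while the BLACS example still crashes, that would suggest the problem lies in how the pre-built BLACS library interacts with this Open MPI build rather than in Open MPI itself. That is an assumption to test, not a confirmed diagnosis.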

I am using the following environment:
 OS: Windows 7 64-bit
 Compilers: Visual Studio 2008 32bit and Intel ifort 32bit
 OpenMPI: OpenMPI-1.5.3 pre-built libraries, and also
 OpenMPI-1.5.2 locally built libraries
 BLACS: pre-built libraries taken from
http://icl.cs.utk.edu/lapack-for-windows/scalapack/index.html#librairies

Thank you.
-Hiral

On Tue, Apr 19, 2011 at 6:56 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> Just a suggestion: have you looked at it in a debugger? The error isn't coming from OMPI - looks like a segfault caused by an error in the program or how it is being run.
>
>
> On Apr 19, 2011, at 7:19 AM, hi wrote:
>
>> On the Windows platform, I observe the following error when executing
>> "mpirun blacs_hello_example.exe" (example program to test BLACS taken
>> from http://www.netlib.org/blacs/BLACS/Examples.html#HELLO)...
>>
>> C:\blacs_examples> mpirun blacs_hello_example.exe
>> calling blacs_pinfo()...
>> forrtl: severe (157): Program Exception - access violation
>> Image              PC        Routine            Line        Source
>> libmpid.dll        6A8E2DC5  Unknown               Unknown  Unknown
>> libmpid.dll        6A8E2C31  Unknown               Unknown  Unknown
>> blacs_ex01.exe     00402357  Unknown               Unknown  Unknown
>> libifcorert.dll    1002A1C1  Unknown               Unknown  Unknown
>> [myhost1:15340] [[30379,0],0]-[[30379,1],0] mca_oob_tcp_msg_recv:
>> readv failed:Unknown error (10054)
>> --------------------------------------------------------------------------
>> mpirun.exe has exited due to process rank 0 with PID 528 on node
>> vibgyor exiting improperly. There are two reasons this could occur:
>>
>> 1. this process did not call "init" before exiting, but others in the
>> job did. This can cause a job to hang indefinitely while it waits for
>> all processes to call "init". By rule, if one process calls "init",
>> then ALL processes must call "init" prior to termination.
>>
>> 2. this process called "init", but exited without calling "finalize".
>> By rule, all processes that call "init" MUST call "finalize" prior to
>> exiting or it will be considered an "abnormal termination"
>>
>> This may have caused other processes in the application to be
>> terminated by signals sent by mpirun.exe (as reported here).
>> --------------------------------------------------------------------------
>>
>> Environment:
>> OS: Windows 7 64-bit
>> Compilers: Visual Studio 2008 32bit and Intel ifort 32bit
>> OpenMPI: OpenMPI-1.5.3 pre-built libraries, and also
>> OpenMPI-1.5.2 locally built libraries
>> BLACS: pre-built libraries taken from
>> http://icl.cs.utk.edu/lapack-for-windows/scalapack/index.html#librairies
>>
>> Any ideas on how to resolve this?
>>
>> Thank you in advance.
>> -Hiral
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users