Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mca_oob_tcp_msg_recv: readv failed:Unknown error (10054)
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-04-19 10:01:42


On Apr 19, 2011, at 7:52 AM, hi wrote:

> Hi Ralph,
>
> The Fortran code snippet is as follows:
>
> <<<
> ...
> write(*,*) "calling blacs_pinfo()..."
> CALL BLACS_PINFO(IAM, NPROCS)
> write(*,*) "after blacs_pinfo()..."
> write(*,*) "IAM=", IAM
> write(*,*) "NPROCS=", NPROCS
> ...
> >>>
>
> As you can see, the crash mentioned below happens in the call to BLACS_PINFO().

This was my point: BLACS_PINFO is not part of OMPI, so I don't know how we could help you debug it. It sounds like you have a bad BLACS library. You might check that it is installed correctly and that you have a "good" version built with a compatible compiler.
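
One way to separate the two, offered as a rough sketch rather than a definitive test: build a program that calls only MPI, using the same compiler and the same Open MPI installation, and run it under mpirun. The program name mpi_only_check is made up for illustration; the MPI calls are the standard Fortran bindings from mpif.h.

```fortran
      program mpi_only_check
      implicit none
      include 'mpif.h'
      integer ierr, rank, nprocs

      ! Initialize MPI directly, without going through BLACS.
      call MPI_INIT(ierr)

      ! Ask MPI for the same information BLACS_PINFO would report.
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
      write(*,*) 'rank=', rank, ' nprocs=', nprocs

      ! Shut MPI down cleanly so mpirun sees a normal exit.
      call MPI_FINALIZE(ierr)
      end program mpi_only_check
```

If this runs cleanly under mpirun while the BLACS example still hits the access violation, that would point at the BLACS build rather than at OMPI.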

>
> I am using following environment:
> OS: Windows 7 64-bit
> Compilers: Visual Studio 2008 32bit and Intel ifort 32bit
> OpenMPI: OpenMPI-1.5.3 pre-built libraries, and also
> OpenMPI-1.5.2 locally built libraries
> BLACS: pre-built libraries taken from
> http://icl.cs.utk.edu/lapack-for-windows/scalapack/index.html#librairies
>
> Thank you.
> -Hiral
>
> On Tue, Apr 19, 2011 at 6:56 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> Just a suggestion: have you looked at it in a debugger? The error isn't coming from OMPI; it looks like a segfault caused by an error in the program or in how it is being run.
>>
>>
>> On Apr 19, 2011, at 7:19 AM, hi wrote:
>>
>>> On Windows, I see the following error when running
>>> "mpirun blacs_hello_example.exe" (an example program for testing BLACS, taken
>>> from http://www.netlib.org/blacs/BLACS/Examples.html#HELLO):
>>>
>>> C:\blacs_examples> mpirun blacs_hello_example.exe
>>> calling blacs_pinfo()...
>>> forrtl: severe (157): Program Exception - access violation
>>> Image PC Routine Line Source
>>> libmpid.dll 6A8E2DC5 Unknown Unknown Unknown
>>> libmpid.dll 6A8E2C31 Unknown Unknown Unknown
>>> blacs_ex01.exe 00402357 Unknown Unknown Unknown
>>> libifcorert.dll 1002A1C1 Unknown Unknown Unknown
>>> [myhost1:15340] [[30379,0],0]-[[30379,1],0] mca_oob_tcp_msg_recv:
>>> readv failed:Unknown error (10054)
>>> --------------------------------------------------------------------------
>>> mpirun.exe has exited due to process rank 0 with PID 528 on node
>>> vibgyor exiting improperly. There are two reasons this could occur:
>>>
>>> 1. this process did not call "init" before exiting, but others in the
>>> job did. This can cause a job to hang indefinitely while it waits for
>>> all processes to call "init". By rule, if one process calls "init",
>>> then ALL processes must call "init" prior to termination.
>>>
>>> 2. this process called "init", but exited without calling "finalize".
>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>> exiting or it will be considered an "abnormal termination"
>>>
>>> This may have caused other processes in the application to be
>>> terminated by signals sent by mpirun.exe (as reported here).
>>> --------------------------------------------------------------------------
>>>
>>> Environment:
>>> OS: Windows 7 64-bit
>>> Compilers: Visual Studio 2008 32bit and Intel ifort 32bit
>>> OpenMPI: OpenMPI-1.5.3 pre-built libraries, and also
>>> OpenMPI-1.5.2 locally built libraries
>>> BLACS: pre-built libraries taken from
>>> http://icl.cs.utk.edu/lapack-for-windows/scalapack/index.html#librairies
>>>
>>> Any ideas on how to resolve this?
>>>
>>> Thank you in advance.
>>> -Hiral
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>