Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SIGPIPE handling?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-12-01 16:28:10


On Dec 1, 2010, at 4:12 PM, Jesse Ziser wrote:

> Sorry, one more question: I don't completely understand the version numbering, but can/will this fix go into 1.5.1 at some point? I notice that the trunk is labeled as 1.7.

Here's an explanation of our version numbering:

    http://www.open-mpi.org/software/ompi/versions/

Short version is:

- v1.4: our "super stable" / mature series. Someday it will be retired.
- v1.5: our "feature" series -- not quite as mature as the v1.4 series. Someday it will transition to be the next "super stable" series: v1.6.
- SVN development trunk/v1.7: what will eventually become the v1.7 series (i.e., our next "feature" series).

So v1.5 is an official release series. But it's still under active development and having features added. v1.4 is only having bug fixes applied to it -- it's in the stable/production portion of its lifespan.

> Thanks again
>
> Jesse Ziser wrote:
>> It turned out I was using development version 1.5.0. After going back to the release version, I found that there was another problem on my end, which had nothing to do with OpenMPI. So thanks for the help; all is well. (And sorry for the belated reply.)
>> Ralph Castain wrote:
>>> After digging around a little, I found that you must be using the OMPI devel trunk as no release version contains this code. I also looked to see why it was done, and found that the concern was with an inadvertent sigpipe that can occur internal to OMPI due to a race condition.
>>>
>>> So I modified the trunk a little. We will ignore the first few sigpipe errors we get, but will then abort with an appropriate error.
>>>
>>> HTH
>>> Ralph
>>>
>>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>>
>>>> Hello,
>>>>
>>>> I've noticed that OpenMPI does not seem to detect when something downstream of it fails. Specifically, I think it does not handle SIGPIPE or pass it down to its young, but it still prints an error message every time it occurs.
>>>>
>>>> For example, running a command like this:
>>>>
>>>> mpirun -np 1 ./mpi-cat </dev/zero | dd bs=1 count=1 >/dev/null
>>>>
>>>> (where mpi-cat is just a simple program that initializes MPI and then copies its input to its output) hangs after the dd quits, and produces an eternity of repetitions of this error message:
>>>>
>>>> [[35845,0],0] reports a SIGPIPE error on fd 13
>>>>
>>>> I am unsure whether this is the intended behavior, but it certainly seems unfortunate from my perspective. Is there any way to make it exit nicely, preferably with a single error, whenever what it's trying to write to doesn't exist anymore? I think I could even submit a patch to make it quit on SIGPIPE, if it is agreed that that makes sense.
>>>>
>>>> Here's the source for my mpi-cat example:
>>>>
>>>> #include <stdio.h>
>>>> #include <mpi.h>
>>>>
>>>> int main (int iArgC, char *apArgV [])
>>>> {
>>>>     int iRank;
>>>>
>>>>     MPI_Init (&iArgC, &apArgV);
>>>>     MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
>>>>
>>>>     if (iRank == 0)
>>>>     {
>>>>         while(1)
>>>>             if(putchar(getchar()) < 0)
>>>>                 break;
>>>>     }
>>>>
>>>>     MPI_Finalize ();
>>>>
>>>>     return (0);
>>>> }
>>>>
>>>>
>>>> Thank you,
>>>>
>>>> Jesse Ziser
>>>> Applied Research Laboratories:
>>>> The University of Texas at Austin
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/