Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Questions about MPI_Isend
From: Gijsbert Wiesenekker (gijsbert.wiesenekker_at_[hidden])
Date: 2010-05-11 16:54:04


On May 11, 2010, at 9:18, Gijsbert Wiesenekker wrote:

> An OpenMPI program of mine that uses MPI_Isend and MPI_Irecv crashes my Fedora Linux kernel (invalid opcode) after a non-reproducible amount of time, which makes it hard to debug (there is no trace, even with the debug kernel, and if I run it under valgrind it does not crash).
> My guess is that the kernel crash is caused by OpenMPI running out of memory because too many MPI_Isend messages have been sent but not yet processed.
> My questions are:
> What does the MPI specification say about the behaviour of MPI_Isend when many messages have been sent but have not been processed yet? Will it fail? Will it block until more memory becomes available (I hope not, because this would cause my program to deadlock)?
> Ideally I would like to check how many MPI_Isend messages have not been processed yet, so that I can stop sending messages if there are 'too many' waiting. Is there a way to do this?
>
> Regards,
> Gijsbert
>

I want to let you know that this crash (the kernel prints "invalid opcode: 0000 [1] SMP" on the screen) appears to be specific to the combination of Fedora 12 with kernel version 2.6.32.11-99.fc12.x86_64, OpenMPI 1.4.2, a large number of MPI_Isend and MPI_Irecv calls, and perhaps my hardware. The same code runs fine on CentOS 5.4 with kernel version 2.6.18-164.15.1.el5.

Gijsbert