
Open MPI User's Mailing List Archives


From: Brunner, Thomas A. (tabrunn_at_[hidden])
Date: 2006-04-03 10:52:02


This seems to fix my problem. I have also realized that the original
production code that exposed this was a little dangerous, and I have
improved it as a result. I hadn't realized that MPI_UNDEFINED was
returned, and I was relying on it being zero or less, which it happens to be.

Thanks for your help,
Tom

On 4/1/06 1:44 AM, "George Bosilca" <bosilca_at_[hidden]> wrote:

> There is a sentence in the MPI standard about this case. The
> standard states:
>
> If there is no active handle in the list it returns outcount =
> MPI_UNDEFINED.
>
> Revision 9513 follows the standard.
>
> Thanks,
> george.
>
>
> On Mar 31, 2006, at 6:38 PM, Brunner, Thomas A. wrote:
>
>> After compiling revision 9505 of the trunk and building my original
>> test code, it now core dumps. I can run the test code with the
>> MPI_Testsome line commented out.
>> Here is the output from a brief gdb session:
>>
>> --------------------------------------------------------------
>>
>> gdb a.out /cores/core.28141
>> GNU gdb 6.1-20040303 (Apple version gdb-437) (Sun Dec 25 08:31:29
>> GMT 2005)
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License,
>> and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for
>> details.
>> This GDB was configured as "powerpc-apple-darwin"...Reading symbols
>> for
>> shared libraries ..... done
>>
>> Core was generated by `a.out'.
>> #0 0x010b2a90 in ?? ()
>> (gdb) bt
>> #0 0x010b2a90 in ?? ()
>> #1 0x010b2a3c in ?? ()
>> warning: Previous frame identical to this frame (corrupt stack?)
>> #2 0x00002c18 in grow_table (table=0x1, soft=3221222188, hard=0) at
>> class/ompi_pointer_array.c:352
>> (gdb) up
>> #1 0x010b2a3c in ?? ()
>> (gdb) up
>> #2 0x00002c18 in grow_table (table=0x1, soft=3221222188, hard=0) at
>> class/ompi_pointer_array.c:352
>> 352 if (table->size >= OMPI_FORTRAN_HANDLE_MAX) {
>>
>> ---------------------------------------------------------------
>> This is the output from the code.
>>
>> Hello from processor 0 of 1
>> Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
>> Failing at addr:0x0
>> *** End of error message ***
>> ------------------------------------------------------------
>>
>> Perhaps in the MPI_Wait* and MPI_Test* functions, if incount == 0, then
>> *outcount should be set to zero and the function should return
>> immediately? (After first checking that the outcount pointer itself is
>> not NULL, of course.)
>>
>> Tom
>>
>>
>>
>> On 3/31/06 1:35 PM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>>
>>> When we're checking the arguments, we check that the request array is
>>> not NULL without looking at the number of requests. I think that
>>> makes sense, as I don't see why a user would call these functions
>>> with 0 requests ... but the other way around makes sense too. Since I
>>> can't find anything in the MPI standard that stops the user from doing
>>> that, I have added the additional check to all MPI_Wait* and MPI_Test*
>>> functions.
>>>
>>> Please get the version from trunk after revision 9504.
>>>
>>> Thanks,
>>> george.
>>>
>>> On Mar 31, 2006, at 2:56 PM, Brunner, Thomas A. wrote:
>>>
>>>>
>>>> I have an algorithm that collects information in a tree-like manner
>>>> using nonblocking communication. Some nodes do not receive
>>>> information from other nodes, so there are no outstanding requests on
>>>> those nodes. On all processors, I check for incoming messages using
>>>> MPI_Testsome(). MPI_Testsome fails with Open MPI, however, if the
>>>> request list length is zero. Here is a code that can be run with only
>>>> one processor that shows the same behavior:
>>>>
>>>> ///////////////////////////////////////////
>>>>
>>>> #include "mpi.h"
>>>> #include <stdio.h>
>>>>
>>>> int main(int argc, char *argv[])
>>>> {
>>>>     int myid, numprocs;
>>>>
>>>>     MPI_Init(&argc, &argv);
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>>>>
>>>>     printf("Hello from processor %i of %i\n", myid, numprocs);
>>>>
>>>>     int size = 0;
>>>>     int num_done = 0;
>>>>     MPI_Status *stat = 0;
>>>>     MPI_Request *req = 0;
>>>>     int *done_indices = 0;
>>>>
>>>>     MPI_Testsome(size, req, &num_done, done_indices, stat);
>>>>
>>>>     printf("Finalizing on processor %i of %i\n", myid, numprocs);
>>>>
>>>>     MPI_Finalize();
>>>>
>>>>     return 0;
>>>> }
>>>>
>>>> /////////////////////////////////////////
>>>>
>>>> The output using OpenMPI is:
>>>>
>>>> Hello from processor 0 of 1
>>>> [mymachine:09115] *** An error occurred in MPI_Testsome
>>>> [mymachine:09115] *** on communicator MPI_COMM_WORLD
>>>> [mymachine:09115] *** MPI_ERR_REQUEST: invalid request
>>>> [mymachine:09115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>> 1 process killed (possibly by Open MPI)
>>>>
>>>>
>>>> Many other MPI implementations support this, and from my reading of
>>>> the standard, it seems like it should be allowed.
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>> <config.log.bz2>
>>>> <testsome_test.out>
>>>> <testsome_test.c>
>>>> <ompi_info.out>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>
>>
>
>