Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open MPI 1.7.4 with --enable-mpi-thread-multiple gives MPI_Recv error
From: Elias Rudberg (elias.rudberg_at_[hidden])
Date: 2014-03-16 20:04:36


Hi Ralph,

Thanks for the quick answer!

> Try running the "ring" program in our example directory and see if that works

I just did this, and it works. (I ran ring_c.c)

Looking at your ring_c.c code, I see that it is quite similar to my
test program, but one thing that differs is the datatype: the ring
program uses MPI_INT while my test uses MPI_CHARACTER.
I tried changing from MPI_INT to MPI_CHARACTER in ring_c.c (and the
type of the variable "message" from int to char), and then ring_c.c
fails in the same way as my test code. Conversely, my code works if
I change it from MPI_CHARACTER to MPI_INT.

So it looks like there is a bug that is triggered when using
MPI_CHARACTER, while the same code works with MPI_INT.

/ Elias

Quoting Ralph Castain <rhc_at_[hidden]>:

> Try running the "ring" program in our example directory and see if that works
>
> On Mar 16, 2014, at 4:26 PM, Elias Rudberg <elias.rudberg_at_[hidden]> wrote:
>
>> Hello!
>>
>> I would like to report a bug in Open MPI 1.7.4 when compiled with
>> --enable-mpi-thread-multiple.
>>
>> The bug can be reproduced with the following test program (mpi-send-recv.c):
>> ===========================================
>> #include <mpi.h>
>> #include <stdio.h>
>> int main() {
>>   MPI_Init(NULL, NULL);
>>   int rank;
>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>   printf("Rank %d at start\n", rank);
>>   if (rank)
>>     MPI_Send(NULL, 0, MPI_CHARACTER, 0, 0, MPI_COMM_WORLD);
>>   else
>>     MPI_Recv(NULL, 0, MPI_CHARACTER, 1, 0, MPI_COMM_WORLD,
>>              MPI_STATUS_IGNORE);
>>   printf("Rank %d at end\n", rank);
>>   MPI_Finalize();
>>   return 0;
>> }
>> ===========================================
>>
>> With Open MPI 1.7.4 compiled with --enable-mpi-thread-multiple, the
>> test program above fails like this:
>> $ mpirun -np 2 ./a.out
>> Rank 0 at start
>> Rank 1 at start
>> [elias-p6-2022scm:2743] *** An error occurred in MPI_Recv
>> [elias-p6-2022scm:2743] *** reported by process [140733606985729,140256452018176]
>> [elias-p6-2022scm:2743] *** on communicator MPI_COMM_WORLD
>> [elias-p6-2022scm:2743] *** MPI_ERR_TYPE: invalid datatype
>> [elias-p6-2022scm:2743] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> [elias-p6-2022scm:2743] *** and potentially your MPI job)
>>
>> Steps I use to reproduce this in Ubuntu:
>>
>> (1) Download openmpi-1.7.4.tar.gz
>>
>> (2) Configure like this:
>> ./configure --enable-mpi-thread-multiple
>>
>> (3) make
>>
>> (4) Compile test program like this:
>> mpicc mpi-send-recv.c
>>
>> (5) Run like this:
>> mpirun -np 2 ./a.out
>> This gives the error above.
>>
>> Of course, in my actual application I will want to call
>> MPI_Init_thread with MPI_THREAD_MULTIPLE instead of just MPI_Init,
>> but that does not seem to matter for this error; the same error
>> comes regardless of the way I call MPI_Init/MPI_Init_thread. So I
>> just put MPI_Init in the test code above to make it as short as
>> possible.
>>
>> Do you agree that this is a bug, or am I doing something wrong?
>>
>> Any ideas for workarounds to make things work with
>> --enable-mpi-thread-multiple? (I do need threads, so skipping
>> --enable-mpi-thread-multiple is probably not an option for me.)
>>
>> Best regards,
>> Elias
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users