
Open MPI User's Mailing List Archives


Subject: [OMPI users] RE: Bug when mixing sent types in version 1.6
From: BOUVIER Benjamin (benjamin.bouvier_at_[hidden])
Date: 2012-06-08 11:51:23


Hi Jeff,

Thanks for your answer.

I downloaded the NetPIPE benchmark suite, built it with `make mpi`, and launched the resulting executable with mpirun.
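For reference, the steps were roughly the following (the directory name is an assumption; `NPmpi` is the binary that NetPIPE's `make mpi` target produces):

```shell
# Build NetPIPE's MPI test and run it across two nodes.
# NetPIPE-x.y.z and the host names are placeholders, not the actual paths used.
cd NetPIPE-x.y.z
make mpi
mpirun -np 2 --host nodeA,nodeB ./NPmpi
```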

Here is an interesting fact: when I launch this executable on 2 nodes, it works; on 3 nodes, it blocks, I guess at connect time.
Each process runs on one core of its machine, using 100% of a CPU, but nothing else happens; I have to kill the program to quit.
Setting the option -mca btl_base_verbose to 30 shows that the last thing each node tries is to connect to the other nodes.

Could it be a network issue?
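One common cause of a hang at connect time on multi-node runs is Open MPI's TCP BTL choosing a network interface that is not routable between all nodes. A diagnostic sketch, assuming TCP transport and an interface name of eth0 (substitute one that all three nodes can actually reach):

```shell
# Restrict Open MPI to TCP/shared-memory/self transports and pin the
# TCP BTL to a single, mutually routable interface.
# "eth0" and the hostfile are assumptions for illustration.
mpirun -np 3 --hostfile myhosts \
    --mca btl tcp,sm,self \
    --mca btl_tcp_if_include eth0 \
    ./NPmpi
```

If the run succeeds with the interface pinned, the original hang was most likely interface selection rather than the application code.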

Thanks,

--
Benjamin Bouvier
________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] on behalf of Jeff Squyres [jsquyres_at_[hidden]]
Sent: Friday, 8 June 2012 16:30
To: Open MPI Users
Subject: Re: [OMPI users] Bug when mixing sent types in version 1.6
On Jun 8, 2012, at 6:43 AM, BOUVIER Benjamin wrote:
> # include <mpi.h>
> # include <stdio.h>
> # include <string.h>
>
> int main(int argc, char **argv)
> {
>    int rank, size;
>    const char someString[] = "Can haz cheezburgerz?";
>
>    MPI_Init(&argc, &argv);
>
>    MPI_Comm_rank( MPI_COMM_WORLD, & rank );
>    MPI_Comm_size( MPI_COMM_WORLD, & size );
>
>    if ( rank == 0 )
>    {
>        int len = strlen( someString );
>        int i;
>        for( i = 1; i < size; ++i)
>        {
>            MPI_Send( &len, 1, MPI_INT, i, 0, MPI_COMM_WORLD );
>            MPI_Send( &someString, len+1, MPI_CHAR, i, 0, MPI_COMM_WORLD );
>        }
>    } else {
>        char buffer[ 128 ];
>        int receivedLen;
>        MPI_Status stat;
>        MPI_Recv( &receivedLen, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat );
>        printf( "[Worker] Length : %d\n", receivedLen );
>        MPI_Recv( buffer, receivedLen+1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
>        printf( "[Worker] String : %s\n", buffer );
>    }
>
>    MPI_Finalize();
> }
I don't see anything obviously wrong with this code.
> I know that there is a better way to send a string, by giving a maximum buffer size to the second MPI_Recv, but that is not the main topic here.
> The launch works locally (i.e., when the 2 processes run on one machine), but not when they are dispatched to 2 machines over the network (i.e., one per host). In that case, the worker correctly reads the INT, and then both master and worker block on the next call.
That's very odd.
> I have no issue when sending only char strings or only numbers. It only happens when sending char strings then numbers, or in the reverse order.
That's even more odd.
Can you run standard benchmarks like NetPIPE and/or the OSU micro-benchmarks?  (across multiple nodes, that is)
--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users