Open MPI User's Mailing List Archives

Subject: [OMPI users] Memchecker and Wait
From: Allen Barnett (allen_at_[hidden])
Date: 2009-08-11 22:56:46


Hi:
I'm trying to use the memchecker/Valgrind support in Open MPI 1.3.3 to
help debug my MPI application. I noticed a rather odd thing: after
MPI_Wait completes on a persistent receive request, Valgrind reports
reads of my receive buffer as invalid. Is this just a fluke of Valgrind,
or is Open MPI doing something to the buffer internally?

This is on a 64-bit RHEL 5 system using GCC 4.3.2 and Valgrind 3.4.1.

Here is an example:
----------------------------------------------------------
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
  int rank, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if ( size != 2 ) {
    if ( rank == 0 )
      printf("Please run with 2 processes.\n");
    MPI_Finalize();
    return 1;
  }

  if (rank == 0) {
    char buffer_in[100];
    MPI_Request req_in;
    MPI_Status status;
    memset( buffer_in, 1, sizeof(buffer_in) );
    MPI_Recv_init( buffer_in, 100, MPI_CHAR, 1, 123, MPI_COMM_WORLD, &req_in );
    MPI_Start( &req_in );
    printf( "Before wait: %p: %d\n", buffer_in, buffer_in[3] );
    printf( "Before wait: %p: %d\n", buffer_in, buffer_in[4] );
    MPI_Wait( &req_in, &status );
    printf( "After wait: %p: %d\n", buffer_in, buffer_in[3] );
    printf( "After wait: %p: %d\n", buffer_in, buffer_in[4] );
    MPI_Request_free( &req_in );
  }
  else {
    char buffer_out[100];
    memset( buffer_out, 2, sizeof(buffer_out) );
    MPI_Send( buffer_out, 100, MPI_CHAR, 0, 123, MPI_COMM_WORLD );
  }
  
  MPI_Finalize();
  return 0;
}
----------------------------------------------------------

Doing "mpirun -np 2 -mca btl ^sm valgrind ./a.out" yields:

Before wait: 0x7ff0003b0: 1
Before wait: 0x7ff0003b0: 1
==15487==
==15487== Invalid read of size 1
==15487== at 0x400C6B: main (waittest.c:30)
==15487== Address 0x7ff0003b3 is on thread 1's stack
After wait: 0x7ff0003b0: 2
==15487==
==15487== Invalid read of size 1
==15487== at 0x400C8B: main (waittest.c:31)
==15487== Address 0x7ff0003b4 is on thread 1's stack
After wait: 0x7ff0003b0: 2
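
To make the question concrete, here is a single-process sketch of the kind of behavior I suspect. It is a guess on my part, not Open MPI's actual code: it assumes a library could hide the buffer with the Memcheck client request VALGRIND_MAKE_MEM_NOACCESS when the receive starts and then fail to re-enable access when the wait completes. The names sketch_start, sketch_wait and sketch_receive are hypothetical stand-ins, and the macros fall back to no-ops if valgrind/memcheck.h is not installed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use the real Memcheck client requests when the header is installed;
 * otherwise fall back to no-ops so the sketch still builds. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define SKETCH_HAVE_MEMCHECK 1
#  endif
#endif
#ifndef SKETCH_HAVE_MEMCHECK
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len) ((void)(addr), (void)(len), 0)
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len)  ((void)(addr), (void)(len), 0)
#endif

enum { SKETCH_LEN = 100 };

/* Hypothetical stand-in for starting a receive: hide the buffer from
 * the application while the (imaginary) transfer is in flight. */
void sketch_start(char *buf, size_t len)
{
    (void)VALGRIND_MAKE_MEM_NOACCESS(buf, len);
}

/* Hypothetical stand-in for completing a receive. With restore_access
 * set to 0 it mimics a library that forgets to unhide the buffer. */
void sketch_wait(char *buf, size_t len, int restore_access)
{
    (void)VALGRIND_MAKE_MEM_DEFINED(buf, len); /* let the "library" write */
    memset(buf, 2, len);                       /* the "received" payload  */
    if (!restore_access)
        (void)VALGRIND_MAKE_MEM_NOACCESS(buf, len);
}

/* Runs one fake receive and returns buf[3] as read after the wait. */
int sketch_receive(int restore_access)
{
    char buf[SKETCH_LEN];
    int value;

    memset(buf, 1, sizeof(buf));
    sketch_start(buf, sizeof(buf));
    sketch_wait(buf, sizeof(buf), restore_access);

    value = buf[3]; /* the read Valgrind would flag if access was not restored */

    /* Unhide the stack memory again so later frames are not affected. */
    (void)VALGRIND_MAKE_MEM_DEFINED(buf, sizeof(buf));

    assert(value == 2); /* the data itself arrives either way */
    return value;
}
```

If this is called from a small main and run under valgrind, I would expect sketch_receive(0) to produce an "Invalid read of size 1" on a stack address while still returning 2, which is the same pattern as the output above, and sketch_receive(1) to run clean.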

Also, if I run this program with the shared memory BTL active, Valgrind
reports several "Conditional jump or move depends on uninitialised
value(s)" errors in the SM BTL and about 24k lost bytes at exit (mostly
from allocations in MPI_Init).

Thanks,
Allen

-- 
Allen Barnett
Transpire, Inc
E-Mail: allen_at_[hidden]
Skype:  allenbarnett