Open MPI User's Mailing List Archives

Subject: [OMPI users] Receive operations hanging forever
From: Giovani (giovanifaccin_at_[hidden])
Date: 2008-03-13 18:50:08


Hello OpenMPI people!

I think something is wrong with my Open MPI install: even the simplest
Recv operation never completes.

I installed Open MPI from the default Gentoo Linux package. It
compiled without any problems. The version is sys-cluster/openmpi-1.2.5.

Now let's use the following program as a test:

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
#include <iostream>
#include "mpicxx.h"

using namespace std;

int main(int argc, char *argv[])
{
    MPI::Init();
   
    //If we are process 0:
    if ( MPI::COMM_WORLD.Get_rank() == 0 )
    {
        double d = 5;
        cout << "Starting to send data from node 0..." << endl;
        MPI::COMM_WORLD.Bcast( &d, 1, MPI::DOUBLE, 0);
        cout << "Finished to send data from node 0..." << endl;
    }
    //Else:
    else
    {
        MPI::Status mpi_status;
        double d = 0;
        cout << "Starting to receive data from node 0..." << endl;
        MPI::COMM_WORLD.Recv(&d, 1, MPI::DOUBLE, MPI::ANY_SOURCE,
                             MPI::ANY_TAG, mpi_status );
        cout << "Finished to receive data from node 0..." << endl;
    };
   
    MPI::COMM_WORLD.Barrier();
    MPI::Finalize();
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

I'm calling it with this command:
/usr/bin/mpirun --hostfile mpi-config.txt -np 3 \
    /home/gfaccin/desenvolvimento/Eclipse/mpiplay/Debug/mpiplay

Where the hostfile mpi-config.txt contains the following line:
localhost slots=1

The slots=1 entry just tells Open MPI that I'm running on a
single-processor PC, so the node is oversubscribed. Running the program
without a hostfile gives the same result.

Once the program starts, I get this output:

Starting to send data from node 0...
Finished to send data from node 0...
Starting to receive data from node 0...
Starting to receive data from node 0...

And that's it. Processor usage goes to 100% and stays there indefinitely.
The output suggests that both Recv calls have hung.

I've tried to reinstall the package in case something is broken, but
nothing changed.

Would you have any clues on how I can fix this?

Thank you very much!

Giovani