Open MPI User's Mailing List Archives

From: Jose Pedro Garcia Mahedero (jpgmahedero_at_[hidden])
Date: 2006-03-01 06:07:17


You're right. I'll try NetPIPE first and then the application. If
it doesn't work, I'll send the configs and more detailed information.

Thank you!

On 3/1/06, Brian Barrett <brbarret_at_[hidden]> wrote:
>
> Jose -
>
> I noticed that your output doesn't appear to match what the source
> code is capable of generating. It's possible that you're running
> into problems with the code that we can't see because you didn't send
> a complete version of the source code.
>
> You might want to start by running some 3rd party codes that are
> known to be good, just to make sure that your MPI installation checks
> out. A good start is NetPIPE, which runs between two peers and gives
> latency / bandwidth information. If that runs, then it's time to
> look at your application. If that doesn't run, then it's time to
> look at the MPI installation in more detail. In this case, it would
> be useful to see all of the information requested here:
>
> http://www.open-mpi.org/community/help/
>
> as well as from running the mpirun command used to start NetPIPE with
> the -d option, so something like:
>
> mpirun -np 2 -hostfile foo -d ./NPMpi
>
> Brian
>
> On Feb 28, 2006, at 9:29 AM, Jose Pedro Garcia Mahedero wrote:
>
> > Hello everybody.
> >
> > I'm new to MPI and I'm having some problems while running a simple
> > pingpong program on more than one node.
> >
> > 1.- I followed all the instructions and installed Open MPI without
> > problems on a Beowulf cluster.
> > 2.- The cluster is working OK and ssh keys are set up so that there
> > is no password prompt
> > 3.- mpiexec seems to run OK.
> > 4.- Now I'm using just 2 nodes: I've tried a simple ping-pong
> > application but my master only sends one request!!
> > 5.- I reduced the problem by trying to send just two messages to the
> > same node:
> >
> > int main(int argc, char **argv){
> >     int myrank;
> >
> >     /* Initialize MPI */
> >
> >     MPI_Init(&argc, &argv);
> >
> >     /* Find out my identity in the default communicator */
> >
> >     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >     if (myrank == 0) {
> >         int work = 100;
> >         int count = 0;
> >         for (int i = 0; i < 10; i++){
> >             cout << "MASTER IS SLEEPING..." << endl;
> >             sleep(3);
> >             cout << "MASTER AWAKE WILL SEND[" << count++ << "]:" << work
> >                  << endl;
> >             MPI_Send(&work, 1, MPI_INT, 1, WORKTAG, MPI_COMM_WORLD);
> >         }
> >     } else {
> >         int count = 0;
> >         int work;
> >         MPI_Status status;
> >         while (true){
> >             MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG,
> >                      MPI_COMM_WORLD, &status);
> >             cout << "SLAVE[" << myrank << "] RECEIVED[" << count++ <<
> >                  "]:" << work << endl;
> >             if (status.MPI_TAG == DIETAG) {
> >                 break;
> >             }
> >         } // while
> >     }
> >     MPI_Finalize();
> > }
> >
> >
> >
> > 6a.- RESULTS (if I put more than one machine in my mpihostsfile):
> > my master sends the first message and my slave receives it
> > perfectly, but my master doesn't send its second
> > message:
> >
> >
> >
> > Here's my output
> >
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[0]:100
> > MASTER IS SLEEPING...
> > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
> > MASTER AWAKE WILL SEND[1]:100
> >
> > 6b.- RESULTS (if I put ONLY 1 machine in my mpihostsfile),
> > everything is OK until iteration 9!!!
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[0]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[1]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[2]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[3]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[4]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[5]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[6]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[7]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[8]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[9]:100
> > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[1]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[2]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[3]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[4]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[5]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[6]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[7]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[8]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[9]:100MPI_STATUS.MPI_ERROR:0
> > --------------------------------
> >
> > I know this is a lot of text, but I wanted to make the question as
> > detailed as possible. I've searched the FAQ, but I still don't know
> > what is going on (or why)...
> >
> > Can anyone help, please :-) ?
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Brian Barrett
> Open MPI developer
> http://www.open-mpi.org/
>
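
One detail in the quoted program is worth noting, separately from the stalled second send reported in 6a. As posted, the master sends ten WORKTAG messages and never a DIETAG message, so the receive loop on the other rank has no way to reach its break. The sketch below models only that tag protocol, using an in-process queue in place of MPI_Send and MPI_Recv so that it compiles without an MPI installation. It is not Open MPI code. The names Message, Channel, run_master and run_worker are hypothetical, and the values of WORKTAG and DIETAG are assumptions, since the original post does not show their definitions.

```cpp
#include <deque>
#include <vector>

// Tag values are assumptions; the original post does not show them.
constexpr int WORKTAG = 1;
constexpr int DIETAG = 2;

// Hypothetical stand-in for one MPI message: a payload plus its tag.
struct Message {
    int payload;
    int tag;
};

// Hypothetical in-process channel standing in for the rank 0 -> rank 1
// link. Messages are delivered in FIFO order.
using Channel = std::deque<Message>;

// Master side: queue `n` work items, then one DIETAG message so the
// worker's receive loop has a way to end.
void run_master(Channel& ch, int n, int work) {
    for (int i = 0; i < n; ++i) {
        ch.push_back({work, WORKTAG});
    }
    ch.push_back({0, DIETAG});
}

// Worker side: consume messages until a DIETAG arrives or the channel
// runs dry. Returns the payloads of the work messages it handled.
std::vector<int> run_worker(Channel& ch) {
    std::vector<int> handled;
    while (!ch.empty()) {
        Message m = ch.front();
        ch.pop_front();
        if (m.tag == DIETAG) {
            break;  // same exit condition as the quoted receive loop
        }
        handled.push_back(m.payload);
    }
    return handled;
}
```

Calling run_master(ch, 10, 100) and then run_worker(ch) on the same Channel hands the worker ten work items followed by the DIETAG that stops it. In the MPI program the corresponding change would be one extra MPI_Send with tag DIETAG after the master's loop.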