Open MPI User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-05-17 21:59:12


On May 15, 2006, at 9:14 AM, Gurhan Ozen wrote:

> Jeff, George, Brian, thanks for your input on this.
>
> I did "kind of" get openib working. Different revisions of kernel was
> running on both boxes, getting them running on the very same revisions
> of kernel and recompiling open-mpi with that rev. of kernel got me
> hello_world program running over openib stack.
>
> However, most MPI_* functions, such as MPI_Isend() and MPI_Barrier(),
> are not working. For each one of them, I get the same error:
>
> [hostname:11992] *** An error occurred in MPI_Isend
> [hostname:11992] *** on communicator MPI_COMM_WORLD
> [hostname:11992] *** MPI_ERR_INTERN: internal error
> [hostname:11992] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> [hostname:11998] *** An error occurred in MPI_Barrier
> [hostname:11998] *** on communicator MPI_COMM_WORLD
> [hostname:11998] *** MPI_ERR_INTERN: internal error
> [hostname:11998] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> [hostname:01916] *** An error occurred in MPI_Send
> [hostname:01916] *** on communicator MPI_COMM_WORLD
> [hostname:01916] *** MPI_ERR_INTERN: internal error
> [hostname:01916] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> This is not just happening over the network, but also locally. I am
> inclined to think that I am missing some compilation flags or
> something. I have tried this with the openmpi-1.1a4 version as well,
> but kept getting the same errors.
>
> Questions of the day:
> 1- Does anyone know why I might be getting these errors?

This generally means that there was no btl available to move data
between nodes. So I think you still have some issues with your
network setup (unfortunately, I'm not able to help here. George asked
for some debugging information that would be most helpful to us --
you might want to try getting that data with your current setup).
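
One thing that might help narrow it down is to check that your build
actually contains the openib BTL and then force Open MPI to use it, so
that a failure to bring up the InfiniBand ports is reported up front
rather than hidden. Something along these lines should work (the
verbosity parameter and level are just my guess at a useful setting,
and ./hello_world stands in for your own test program):

   ompi_info | grep btl
   mpirun -np 2 --mca btl openib,self --mca btl_base_verbose 30 ./hello_world

The first command lists the BTL components that were compiled into your
installation; the second restricts the run to openib (plus self for
loopback), so you should see an error from the openib component itself
if it can't open the HCA.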

> 2- I couldn't find any "free" debuggers for debugging open-mpi
> programs, does anyone know of any? Are there any tricks to use gdb ,
> at least to debug locally running mpi programs?

The simple, dirty trick is to set up X11 forwarding with ssh and run:

   mpirun -np X -d xterm -e gdb <myapp>

You'll get a bunch of xterms open and can debug that way. It's
simple, it's cheap, but it definitely doesn't scale.
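
Another local-only trick, if one xterm per process is too heavyweight,
is to have each rank print its pid and then spin in a loop until you
attach gdb and clear a flag. This is a generic sketch rather than
anything Open MPI provides; the flag name and its placement are made
up:

   #include <stdio.h>
   #include <unistd.h>
   #include <mpi.h>

   int main(int argc, char **argv)
   {
       int rank;
       volatile int wait_for_debugger = 1;  /* from gdb: set var wait_for_debugger = 0 */

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       /* Print each rank's pid, then spin until a debugger attaches
          ("gdb ./myapp <pid>") and clears the flag. */
       printf("rank %d has pid %d\n", rank, (int) getpid());
       fflush(stdout);
       while (wait_for_debugger) {
           sleep(1);
       }

       /* ... rest of the application ... */

       MPI_Finalize();
       return 0;
   }

Once you've attached to the process you care about, set the variable to
0, set your breakpoints, and continue.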

Brian

-- 
   Brian Barrett
   Open MPI developer
   http://www.open-mpi.org/