Open MPI User's Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Date: 2007-01-02 13:09:43


First, make sure that PATH and LD_LIBRARY_PATH are defined in the
section of your .bashrc file that gets parsed for non-interactive
sessions. Run "mpirun -np 1 printenv" and check whether PATH and
LD_LIBRARY_PATH have the values you expect.
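
A minimal sketch of that check in plain shell, in case it helps. The helper name path_contains and the prefix /usr/local/openmpi (taken from the quoted message) are assumptions for illustration, not part of Open MPI. The last lines show that a shell reports status 127 when it cannot find a command, which would be consistent with the "status 127" in the quoted error if the daemon executable is not on the remote PATH.

```shell
# Return success if directory $2 is a component of the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed install prefix, taken from the quoted message.
prefix=/usr/local/openmpi

# Report on the environment this script actually sees.
if path_contains "$PATH" "$prefix/bin"; then
    echo "PATH contains $prefix/bin"
else
    echo "PATH is missing $prefix/bin"
fi

if path_contains "${LD_LIBRARY_PATH:-}" "$prefix/lib"; then
    echo "LD_LIBRARY_PATH contains $prefix/lib"
else
    echo "LD_LIBRARY_PATH is missing $prefix/lib"
fi

# A shell exits with status 127 when a command cannot be found.
missing_status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || missing_status=$?
echo "status for a missing command: $missing_status"
```

The report only describes the session that runs it, so it is most useful when started the same way mpirun starts things on the remote node, i.e. through a non-interactive ssh command rather than from an interactive login.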

For your second question, you should give the path to your prueba.bin
executable. I would run something like
"mpirun --prefix /usr/local/openmpi -np 2 ./prueba.bin".
The reason is that "." is usually not in the PATH.
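
To illustrate the difference between a bare name and an explicit path, here is a small sketch that does not need MPI at all. It creates a stand-in script called prueba.bin in a scratch directory (the script body is hypothetical, it only prints one line) and then looks it up both ways. On a system where "." is not in the PATH, the bare name should come back as "not found".

```shell
# Work in a scratch directory so nothing on the real system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the compiled MPI program (hypothetical; it just prints a line).
printf '#!/bin/sh\necho "hello from prueba.bin"\n' > prueba.bin
chmod +x prueba.bin

# Bare name: resolved only by searching the directories listed in PATH.
if command -v prueba.bin >/dev/null 2>&1; then
    bare="found"
else
    bare="not found"
fi

# Explicit relative path: no PATH search is involved.
explicit=$(./prueba.bin)

echo "bare name:     $bare"
echo "explicit path: $explicit"
```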

   george.

On Jan 2, 2007, at 11:20 AM, jcolmenares_at_[hidden] wrote:

> I installed openmpi 1.1.2 on two 686 boxes running ubuntu 6.10.
> I followed the instructions given in the FAQ. Nevertheless, I get the
> following message:
>
> [bernie-1:05053] ERROR: A daemon on node 192.168.1.113 failed to start as expected.
> [bernie-1:05053] ERROR: There may be more information available from
> [bernie-1:05053] ERROR: the remote shell (see above).
> [bernie-1:05053] ERROR: The daemon exited unexpectedly with status 127.
>
> now, I've been browsing the web, including the mailing lists, and it
> appears that the error should be that I have not declared the
> variables
>
> export PATH="/usr/local/openmpi/bin:${PATH}"
> export LD_LIBRARY_PATH="/usr/local/openmpi/lib:${LD_LIBRARY_PATH}"
>
> on the node, which I have. I have even created all the possible folders
> proposed in the FAQ for remote logins, although I'm using bash.
>
> If I do an ssh user_at_remote_node, I can connect without being asked for a
> password, and if I type mpif90, I get "gfortran: no input files", which
> should mean that PATH and LD_LIBRARY_PATH are indeed being updated on
> the remote login.
>
> But, if I do:
>
> bash$ mpirun --prefix /usr/local/openmpi -np 2 prueba.bin
>
> the result is:
>
> --------------------------------------------------------------------------
> Failed to find the following executable:
>
> Host: bernie-3
> Executable: prueba.bin
>
> Cannot continue.
> --------------------------------------------------------------------------
> mpirun noticed that job rank 0 with PID 0 on node "192.168.1.113" exited
> on signal 4.
>
> I've been looking around, but have not been able to find out what
> signal 4 means.
>
> Just in case: I was running an example program which runs fine on my
> university cluster. Nevertheless, I decided to run an even simpler one,
> which I include, since the error may be there (I definitely hope not!).
>
> program test
>
> use mpi
>
> implicit none
>
> integer :: myid,sizze,ierr
>
> call MPI_INIT(ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD,sizze,ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD,myid,ierr)
>
> print *,"I'm using ",sizze," processors"
> print *,"of wich I'm the number ",myid
>
> call MPI_FINALIZE(ierr)
>
> end program test
>
>
> This is the first time I have installed and used any parallel programming
> program or library, and I'm doing it as a personal project for a graduate
> course, so any help will be greatly appreciated!
>
> Best regards
>
> Jose Colmenares
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users