There was a bug in the code. With that fixed I now get the output below, which is correct, but how do I get rid of all these ABI, CMA, etc. messages?
$ mpiexec -n 4 ./a.out
librdmacm: couldn't read ABI version.
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
librdmacm: assuming: 4
CMA: unable to get RDMA device list
CMA: unable to get RDMA device list
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
--------------------------------------------------------------------------
[[6110,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: elzbieta
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
rank= 1 Results: 5.0000000 6.0000000 7.0000000 8.0000000
rank= 2 Results: 9.0000000 10.000000 11.000000 12.000000
rank= 0 Results: 1.0000000 2.0000000 3.0000000 4.0000000
rank= 3 Results: 13.000000 14.000000 15.000000 16.000000
[elzbieta:02559] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[elzbieta:02559] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
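A possible way to quiet this output, offered as a sketch rather than a confirmed fix: the librdmacm and CMA lines, and the openib banner, all appear while Open MPI probes for OpenFabrics hardware that this notebook does not have. Excluding the openib BTL through the "btl" MCA parameter is a commonly suggested approach, but whether it also silences the librdmacm lines on this Fedora build is an assumption the thread does not confirm. As a blunt fallback that assumes nothing about Open MPI, the lines can be filtered out of the stream. The helper name filter_rdma_noise is hypothetical.

```shell
# Hypothetical helper (not part of Open MPI): drop the RDMA probe chatter
# from a stream and pass every other line through unchanged.
filter_rdma_noise() {
    grep -v -e '^librdmacm: ' -e '^CMA: '
}

# Assumed usage (requires Open MPI on PATH, so it is left commented out):
#   mpiexec --mca btl ^openib -n 4 ./a.out 2>&1 | filter_rdma_noise
```

The filter only hides the symptoms. If excluding the openib component works on a given install, that is the cleaner route because the probe is then skipped rather than masked.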
On Sat, Sep 15, 2012 at 3:34 PM, John Chludzinski <john.chludzinski@gmail.com> wrote:
BTW, here is the example code:
program scatter
  include 'mpif.h'

  integer, parameter :: SIZE=4
  integer :: numtasks, rank, sendcount, recvcount, source, ierr
  real :: sendbuf(SIZE,SIZE), recvbuf(SIZE)

  ! Fortran stores this array in column major order, so the
  ! scatter will actually scatter columns, not rows.
  data sendbuf /1.0, 2.0, 3.0, 4.0, &
                5.0, 6.0, 7.0, 8.0, &
                9.0, 10.0, 11.0, 12.0, &
                13.0, 14.0, 15.0, 16.0 /

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)

  if (numtasks .eq. SIZE) then
    ! Rank 1 is the root; every rank receives one column of SIZE reals.
    source = 1
    sendcount = SIZE
    recvcount = SIZE
    call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf, &
                     recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
    print *, 'rank= ',rank,' Results: ',recvbuf
  else
    print *, 'Must specify',SIZE,' processors. Terminating.'
  endif

  call MPI_FINALIZE(ierr)
end program
On Sat, Sep 15, 2012 at 3:02 PM, John Chludzinski <john.chludzinski@gmail.com> wrote:
# export LD_LIBRARY_PATH
# mpiexec -n 1 printenv | grep PATH
LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
WINDOWPATH=1
# mpiexec -n 4 ./a.out
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
--------------------------------------------------------------------------
[[3598,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: elzbieta
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
librdmacm: couldn't read ABI version.
CMA: unable to get RDMA device list
librdmacm: assuming: 4
CMA: unable to get RDMA device list
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
[elzbieta:4145] *** An error occurred in MPI_Scatter
[elzbieta:4145] *** on communicator MPI_COMM_WORLD
[elzbieta:4145] *** MPI_ERR_TYPE: invalid datatype
[elzbieta:4145] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpiexec has exited due to process rank 1 with PID 4145 on
node elzbieta exiting improperly. There are two reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
On Sat, Sep 15, 2012 at 2:24 PM, Ralph Castain <rhc@open-mpi.org> wrote:
Ah - note that there is no LD_LIBRARY_PATH in the environment. That's the problem.
On Sep 15, 2012, at 11:19 AM, John Chludzinski <john.chludzinski@gmail.com> wrote:
$ which mpiexec
/usr/lib/openmpi/bin/mpiexec
# mpiexec -n 1 printenv | grep PATH
PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
WINDOWPATH=1
On Sat, Sep 15, 2012 at 1:11 PM, Ralph Castain <rhc@open-mpi.org> wrote:
Couple of things worth checking:
1. Verify that you executed the "mpiexec" you think you did - a simple "which mpiexec" should suffice.
2. Verify that your environment is correct with "mpiexec -n 1 printenv | grep PATH". Sometimes LD_LIBRARY_PATH doesn't carry over like you think it should.
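The second check matters because a plain VAR=value assignment in a POSIX shell sets a shell variable only. It does not reach child processes until the variable is exported. The sketch below shows the difference using a placeholder variable, DEMO_LIB_PATH (a made-up name, chosen so the real LD_LIBRARY_PATH is left alone). It is an illustration of shell behavior, not a diagnosis of any particular install.

```shell
# Start from a clean slate in case the placeholder already exists.
unset DEMO_LIB_PATH

# Plain assignment: visible in this shell, absent from child processes.
DEMO_LIB_PATH=/usr/lib/openmpi/lib/
before=$(sh -c 'printf "%s" "${DEMO_LIB_PATH:-<unset>}"')

# After export, child processes inherit the variable.
export DEMO_LIB_PATH
after=$(sh -c 'printf "%s" "${DEMO_LIB_PATH:-<unset>}"')

echo "before export: $before"
echo "after export:  $after"
```

The first line printed reports <unset> and the second reports the path. Filtering the child's environment for PATH, as suggested above, exposes the same gap for LD_LIBRARY_PATH.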
On Sep 15, 2012, at 10:00 AM, John Chludzinski <john.chludzinski@gmail.com> wrote:
I installed OpenMPI (I have a simple dual core AMD notebook with Fedora 16) via:
# yum install openmpi
# yum install openmpi-devel
# mpirun --version
mpirun (Open MPI) 1.5.4
I added:
$ PATH=/usr/lib/openmpi/bin/:$PATH
$ LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
Then:
$ mpif90 ex1.f95
$ mpiexec -n 4 ./a.out
./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot open shared object file: No such file or directory
./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot open shared object file: No such file or directory
./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot open shared object file: No such file or directory
./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
mpiexec noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
$ ls -l /usr/lib/openmpi/lib/
total 6788
lrwxrwxrwx. 1 root root 25 Sep 15 12:25 libmca_common_sm.so -> libmca_common_sm.so.2.0.0
lrwxrwxrwx. 1 root root 25 Sep 14 16:14 libmca_common_sm.so.2 -> libmca_common_sm.so.2.0.0
-rwxr-xr-x. 1 root root 8492 Jan 20 2012 libmca_common_sm.so.2.0.0
lrwxrwxrwx. 1 root root 19 Sep 15 12:25 libmpi_cxx.so -> libmpi_cxx.so.1.0.1
lrwxrwxrwx. 1 root root 19 Sep 14 16:14 libmpi_cxx.so.1 -> libmpi_cxx.so.1.0.1
-rwxr-xr-x. 1 root root 87604 Jan 20 2012 libmpi_cxx.so.1.0.1
lrwxrwxrwx. 1 root root 19 Sep 15 12:25 libmpi_f77.so -> libmpi_f77.so.1.0.2
lrwxrwxrwx. 1 root root 19 Sep 14 16:14 libmpi_f77.so.1 -> libmpi_f77.so.1.0.2
-rwxr-xr-x. 1 root root 179912 Jan 20 2012 libmpi_f77.so.1.0.2
lrwxrwxrwx. 1 root root 19 Sep 15 12:25 libmpi_f90.so -> libmpi_f90.so.1.1.0
lrwxrwxrwx. 1 root root 19 Sep 14 16:14 libmpi_f90.so.1 -> libmpi_f90.so.1.1.0
-rwxr-xr-x. 1 root root 10364 Jan 20 2012 libmpi_f90.so.1.1.0
lrwxrwxrwx. 1 root root 15 Sep 15 12:25 libmpi.so -> libmpi.so.1.0.2
lrwxrwxrwx. 1 root root 15 Sep 14 16:14 libmpi.so.1 -> libmpi.so.1.0.2
-rwxr-xr-x. 1 root root 1383444 Jan 20 2012 libmpi.so.1.0.2
lrwxrwxrwx. 1 root root 21 Sep 15 12:25 libompitrace.so -> libompitrace.so.0.0.0
lrwxrwxrwx. 1 root root 21 Sep 14 16:14 libompitrace.so.0 -> libompitrace.so.0.0.0
-rwxr-xr-x. 1 root root 13572 Jan 20 2012 libompitrace.so.0.0.0
lrwxrwxrwx. 1 root root 20 Sep 15 12:25 libopen-pal.so -> libopen-pal.so.3.0.0
lrwxrwxrwx. 1 root root 20 Sep 14 16:14 libopen-pal.so.3 -> libopen-pal.so.3.0.0
-rwxr-xr-x. 1 root root 386324 Jan 20 2012 libopen-pal.so.3.0.0
lrwxrwxrwx. 1 root root 20 Sep 15 12:25 libopen-rte.so -> libopen-rte.so.3.0.0
lrwxrwxrwx. 1 root root 20 Sep 14 16:14 libopen-rte.so.3 -> libopen-rte.so.3.0.0
-rwxr-xr-x. 1 root root 790052 Jan 20 2012 libopen-rte.so.3.0.0
-rw-r--r--. 1 root root 301520 Jan 20 2012 libotf.a
lrwxrwxrwx. 1 root root 15 Sep 15 12:25 libotf.so -> libotf.so.0.0.1
lrwxrwxrwx. 1 root root 15 Sep 14 16:14 libotf.so.0 -> libotf.so.0.0.1
-rwxr-xr-x. 1 root root 206384 Jan 20 2012 libotf.so.0.0.1
-rw-r--r--. 1 root root 337970 Jan 20 2012 libvt.a
-rw-r--r--. 1 root root 591070 Jan 20 2012 libvt-hyb.a
lrwxrwxrwx. 1 root root 18 Sep 15 12:25 libvt-hyb.so -> libvt-hyb.so.0.0.0
lrwxrwxrwx. 1 root root 18 Sep 14 16:14 libvt-hyb.so.0 -> libvt-hyb.so.0.0.0
-rwxr-xr-x. 1 root root 428844 Jan 20 2012 libvt-hyb.so.0.0.0
-rw-r--r--. 1 root root 541004 Jan 20 2012 libvt-mpi.a
lrwxrwxrwx. 1 root root 18 Sep 15 12:25 libvt-mpi.so -> libvt-mpi.so.0.0.0
lrwxrwxrwx. 1 root root 18 Sep 14 16:14 libvt-mpi.so.0 -> libvt-mpi.so.0.0.0
-rwxr-xr-x. 1 root root 396352 Jan 20 2012 libvt-mpi.so.0.0.0
-rw-r--r--. 1 root root 372352 Jan 20 2012 libvt-mt.a
lrwxrwxrwx. 1 root root 17 Sep 15 12:25 libvt-mt.so -> libvt-mt.so.0.0.0
lrwxrwxrwx. 1 root root 17 Sep 14 16:14 libvt-mt.so.0 -> libvt-mt.so.0.0.0
-rwxr-xr-x. 1 root root 266104 Jan 20 2012 libvt-mt.so.0.0.0
-rw-r--r--. 1 root root 60390 Jan 20 2012 libvt-pomp.a
lrwxrwxrwx. 1 root root 14 Sep 15 12:25 libvt.so -> libvt.so.0.0.0
lrwxrwxrwx. 1 root root 14 Sep 14 16:14 libvt.so.0 -> libvt.so.0.0.0
-rwxr-xr-x. 1 root root 242604 Jan 20 2012 libvt.so.0.0.0
-rwxr-xr-x. 1 root root 303591 Jan 20 2012 mpi.mod
drwxr-xr-x. 2 root root 4096 Sep 14 16:14 openmpi
The file (actually, a link) it claims it can't find: libmpi_f90.so.1, is clearly there. And LD_LIBRARY_PATH=/usr/lib/openmpi/lib/.
What's the problem?
---John
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users