I am getting a curious error on a simple communications test. I have altered the standard MVAPICH osu_latency test to accept receives from any source, and I get the following error:
[d013.sc.net:15455] *** An error occurred in MPI_Recv
[d013.sc.net:15455] *** on communicator MPI_COMM_WORLD
[d013.sc.net:15455] *** MPI_ERR_TRUNCATE: message truncated
[d013.sc.net:15455] *** MPI_ERRORS_ARE_FATAL (goodbye)
The code change was:
MPI_Recv(r_buf, size, MPI_CHAR, MPI_ANY_SOURCE, 1, MPI_COMM_WORLD, &reqstat);
The command line I ran was:
> mpirun -np 2 ./osu_latency
I run this on two types of host machine configuration: one that has Infinipath HCAs installed and one that does not. In both cases I run in shared memory mode, i.e., two processes on the same node. I have verified that on the host with Infinipath I am actually running the Open MPI mpirun, not the MPI that comes with the HCA.
I have built Open MPI with PSM support from a fairly recent svn pull and run the same binaries on both host machines. The configuration was as follows:
> $ ../configure --prefix=/opt/wkspace/openmpi-1.3 CC=gcc CXX=g++
> --disable-mpi-f77 --enable-debug --enable-memchecker
> --with-psm=/usr/include --with-valgrind=/opt/wkspace/valgrind-3.3.0/
> mpirun --version
mpirun (Open MPI) 1.4a1r18908
The error appears only on the host that does not have Infinipath installed. I have combed through the MCA parameters to see if there is a setting I am missing, but I cannot see anything obvious.
Any input would be appreciated. Thanks. Tom