Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
From: Ralph H Castain (rhc_at_[hidden])
Date: 2007-12-14 08:34:09


Hi Qiang

This error message usually indicates that you have more than one Open MPI
installation around, and that the backend nodes are picking up a different
version than mpirun is using. Check to make sure that you have a consistent
version across all the nodes.
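
One way to run that check is sketched below. It is not an official tool: the
function names collect_versions and versions_consistent are made up for
illustration, and it assumes the machinefile lists one host per line (host
name first), that passwordless ssh to each host works, and that ompi_info is
on the PATH a non-interactive ssh session sees.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Open MPI.

# Print "host version" for every host named in the machinefile given as $1.
# The version is taken from the "Open MPI:" line of ompi_info's output;
# a host where ssh or ompi_info fails is reported as "unknown".
collect_versions() {
    for host in $(awk '!/^#/ && NF { print $1 }' "$1" | sort -u); do
        ver=$(ssh "$host" ompi_info 2>/dev/null |
              awk -F': *' '/Open MPI:/ { print $2; exit }')
        echo "$host ${ver:-unknown}"
    done
}

# Read "host version" lines on stdin; succeed only if exactly one
# distinct version string was seen.
versions_consistent() {
    awk '{ seen[$2] = 1 }
         END { n = 0; for (v in seen) n++; exit (n == 1 ? 0 : 1) }'
}
```

With the machinefile from the original post, running "collect_versions m8"
lists what each node resolves, and piping that output into
versions_consistent returns non-zero if the nodes disagree. Comparing the
result with "ompi_info" on the node where mpirun is started covers the head
node as well.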

I also noticed you were building with --enable-threads. As you've probably
seen on our discussion lists, Open MPI isn't really thread safe yet. I don't
think that is the problem here, but I wanted to be sure you were aware of the
potential for problems.

Ralph

On 12/13/07 5:31 PM, "Qiang Xu" <Qiang.Xu_at_[hidden]> wrote:

> I installed OpenMPI-1.2.4 on our cluster.
> Here is the compute node info:
>
> [qiang_at_compute-0-1 ~]$ uname -a
> Linux compute-0-1.local 2.6.9-42.0.2.ELsmp #1 SMP Wed Aug 23 00:17:26 CDT 2006
> i686 i686 i386 GNU/Linux
> [qiang_at_compute-0-1 bin]$ gcc -v
> Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs
> Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
> --infodir=/usr/share/info --enable-shared --enable-threads=posix
> --disable-checking --with-system-zlib --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
> Thread model: posix
> gcc version 3.4.6 20060404 (Red Hat 3.4.6-8)
>
> Then I compiled the NAS benchmarks. I got some warnings, but the build went through.
> [qiang_at_compute-0-1 NPB2.3-MPI]$ make suite
> make[1]: Entering directory `/home/qiang/NPB2.3/NPB2.3-MPI'
> =========================================
> = NAS Parallel Benchmarks 2.3 =
> = MPI/F77/C =
> =========================================
>
> cd MG; make NPROCS=16 CLASS=B
> make[2]: Entering directory `/home/qiang/NPB2.3/NPB2.3-MPI/MG'
> make[3]: Entering directory `/home/qiang/NPB2.3/NPB2.3-MPI/sys'
> cc -g -o setparams setparams.c
> make[3]: Leaving directory `/home/qiang/NPB2.3/NPB2.3-MPI/sys'
> ../sys/setparams mg 16 B
> make.def modified. Rebuilding npbparams.h just in case
> rm -f npbparams.h
> ../sys/setparams mg 16 B
> mpif77 -c -I~/MyMPI/include mg.f
> mg.f: In subroutine `zran3':
> mg.f:1001: warning:
> call mpi_allreduce(rnmu,ss,1,dp_type,
> 1
> mg.f:2115: (continued):
> call mpi_allreduce(jg(0,i,1), jg_temp,4,MPI_INTEGER,
> 2
> Argument #1 of `mpi_allreduce' is one type at (2) but is some other type at
> (1) [info -f g77 M GLOBALS]
> mg.f:1001: warning:
> call mpi_allreduce(rnmu,ss,1,dp_type,
> 1
> mg.f:2115: (continued):
> call mpi_allreduce(jg(0,i,1), jg_temp,4,MPI_INTEGER,
> 2
> Argument #2 of `mpi_allreduce' is one type at (2) but is some other type at
> (1) [info -f g77 M GLOBALS]
> mg.f:1001: warning:
> call mpi_allreduce(rnmu,ss,1,dp_type,
> 1
> mg.f:2139: (continued):
> call mpi_allreduce(jg(0,i,0), jg_temp,4,MPI_INTEGER,
> 2
> Argument #1 of `mpi_allreduce' is one type at (2) but is some other type at
> (1) [info -f g77 M GLOBALS]
> mg.f:1001: warning:
> call mpi_allreduce(rnmu,ss,1,dp_type,
> 1
> mg.f:2139: (continued):
> call mpi_allreduce(jg(0,i,0), jg_temp,4,MPI_INTEGER,
> 2
> Argument #2 of `mpi_allreduce' is one type at (2) but is some other type at
> (1) [info -f g77 M GLOBALS]
> cd ../common; mpif77 -c -I~/MyMPI/include print_results.f
> cd ../common; mpif77 -c -I~/MyMPI/include randdp.f
> cd ../common; mpif77 -c -I~/MyMPI/include timers.f
> mpif77 -o ../bin/mg.B.16 mg.o ../common/print_results.o ../common/randdp.o
> ../common/timers.o -L~/MyMPI/lib -lmpi_f77
> make[2]: Leaving directory `/home/qiang/NPB2.3/NPB2.3-MPI/MG'
> make[1]: Leaving directory `/home/qiang/NPB2.3/NPB2.3-MPI'
> make[1]: Entering directory `/home/qiang/NPB2.3/NPB2.3-MPI'
> But when I tried to run it, I got the following error messages:
> [qiang_at_compute-0-1 bin]$ mpirun -machinefile m8 -n 16 mg.C.16
> [compute-0-1.local:11144] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate
> space in file dss/dss_unpack.c at line 90
> [compute-0-1.local:11144] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate
> space in file gpr_replica_cmd_processor.c at line 361
> I found some info on the mailing list, but it doesn't help in my case.
> Could anyone give me some advice? Or do I have to upgrade the GNU compiler?
>
> Thanks.
>
> Qiang