Open MPI User's Mailing List Archives

From: Yvan Fournier (yvan.fournier_at_[hidden])
Date: 2006-07-10 17:27:53


Hello,

I just retried reproducing the datatype bug on a SUSE Linux 10.1 system
(32-bit Pentium-M). This time I even get a segmentation fault at some
point. I attach the logfile for the test case compiled in debug mode,
run once directly, then again under valgrind, as well as my ompi_info
output.

I have also encountered the bug on the "parent" case (similar, but
more complex) on my work machine (dual Xeon under Debian Sarge),
but I'll check this simpler test on it just in case.

Best regards,

        Yvan Fournier

On Sun, 2006-07-09 at 12:00 -0400, users-request_at_[hidden] wrote:
>
> Today's Topics:
>
> 1. Re: Datatype bug regression from Open MPI 1.0.2 to Open MPI
> 1.1 (George Bosilca)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 8 Jul 2006 13:47:05 -0400 (Eastern Daylight Time)
> From: George Bosilca <bosilca_at_[hidden]>
> Subject: Re: [OMPI users] Datatype bug regression from Open MPI 1.0.2
> to Open MPI 1.1
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <Pine.WNT.4.64.0607081344080.2944_at_bosilca>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> Yvan,
>
> I'm unable to replicate this one with the latest Open MPI trunk version.
> As there is no difference between the trunk and the latest 1.1 version in
> the datatype code, I think the bug cannot be reproduced using 1.1 either. I
> compiled the test twice, once using the indexed datatype and once without,
> and the output is exactly the same. I ran it on my Apple G5 desktop as
> well as on a cluster of AMD64 machines, over shared memory and TCP. Can you
> please recheck that your error is coming from the indexed type?
>
> Thanks,
> george.
>
>
> On Sat, 1 Jul 2006, Yvan Fournier wrote:
>
> > Hello,
> >
> > I had encountered a bug in Open MPI 1.0.1 using indexed datatypes
> > with MPI_Recv (which seems to be of the "off by one" sort), which
> > was corrected in Open MPI 1.0.2.
> >
> > It seems to have resurfaced in Open MPI 1.1 (I encountered it using
> > different data and did not recognize it immediately, but it seems
> > it can be reproduced using the same simplified test I had sent
> > the first time, which I re-attach here just in case).
> >
> > Here is a summary of the case:
> >
> > ------------------
> >
> > Each processor reads a file ("data_p0" or "data_p1") giving a list of
> > global element ids. Some elements (vertices from a partitioned mesh)
> > may belong to both processors, so their ids may appear on both
> > processors: we have 7178 global vertices, 3654 and 3688 of them being
> > known by ranks 0 and 1 respectively.
> >
> > In this simplified version, we assign coordinates {x, y, z} to each
> > vertex equal to its global id number for rank 1, and the negative of
> > that for rank 0 (assigning the same values to x, y, and z). After
> > finishing the "ordered gather", rank 0 prints the global id and
> > coordinates of each vertex.
> >
> > Lines should print (for example) as:
> > 6456 ; 6456.00000 6456.00000 6456.00000
> > 6457 ; -6457.00000 -6457.00000 -6457.00000
> > depending on whether a vertex belongs only to rank 0 (negative
> > coordinates) or belongs to rank 1 (positive coordinates).
> >
> > With the OMPI 1.0.1 bug (observed on Suse Linux 10.0 with gcc 4.0 and on
> > Debian sarge with gcc 3.4), we have for example for the last vertices:
> > 7176 ; 7175.00000 7175.00000 7176.00000
> > 7177 ; 7176.00000 7176.00000 7177.00000
> > seeming to indicate an "off by one" type bug in datatype handling
> >
> > Not using an indexed datatype (i.e. not defining USE_INDEXED_DATATYPE
> > in the gather_test.c file), the bug disappears.
> >
> > ------------------
> >
> > Best regards,
> >
> > Yvan Fournier
> >
> >
>
> "We must accept finite disappointment, but we must never lose infinite
> hope."
> Martin Luther King
>
>
>
>

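The consistency rule described in the quoted summary (each printed line
carries a global id followed by x, y and z, all equal to either +id or -id)
can be checked mechanically. The following is only a minimal sketch under
that assumption: check_vertex_line is a hypothetical helper, not part of
gather_test.c, and it assumes the "<id> ; <x> <y> <z>" line format shown in
the summary.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not taken from gather_test.c): classify one output
 * line of the assumed form "<id> ; <x> <y> <z>".
 *
 * Returns  1 if x, y and z are all equal to +id or all equal to -id,
 *          0 if the line parses but the values are inconsistent
 *            (the pattern seen with the "off by one" bug),
 *         -1 if the line cannot be parsed. */
int check_vertex_line(const char *line)
{
    long id;
    double x, y, z;

    assert(line != NULL);

    if (sscanf(line, "%ld ; %lf %lf %lf", &id, &x, &y, &z) != 4)
        return -1;

    /* The sign of x tells us which rank should have supplied the vertex:
     * negative for rank 0 only, positive for rank 1. */
    double expected = (x < 0.0) ? -(double)id : (double)id;

    /* Exact comparison is acceptable here because the coordinates are
     * whole numbers, which a double represents exactly in this range. */
    return x == expected && y == expected && z == expected;
}
```

Applied to every line printed by rank 0, a return value of 0 flags lines
such as "7176 ; 7175.00000 7175.00000 7176.00000", while lines such as
"6457 ; -6457.00000 -6457.00000 -6457.00000" return 1.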

                Open MPI: 1.1
   Open MPI SVN revision: r10477
                Open RTE: 1.1
   Open RTE SVN revision: r10477
                    OPAL: 1.1
       OPAL SVN revision: r10477
                  Prefix: /home/saturne/opt/openmpi-1.1.0/arch/Linux
 Configured architecture: i686-pc-linux-gnu
           Configured by: saturne
           Configured on: Tue Jun 27 21:48:56 CEST 2006
          Configure host: newhon
                Built by: saturne
                Built on: mar jun 27 22:08:15 CEST 2006
              Built host: newhon
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: no
      Fortran90 bindings: no
 Fortran90 bindings size: na
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: none
  Fortran77 compiler abs: none
      Fortran90 compiler: none
  Fortran90 compiler abs: none
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: no
     Fortran90 profiling: no
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
              MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
           MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
           MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
               MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
           MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
           MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
                MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
                  MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
               MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
                 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
                 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
              MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
                 MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
                MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
                 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
                 MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
                 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
                 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
                 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
                 MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
                  MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
                  MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
                 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                 MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
                 MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
                 MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
                 MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
               MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
                MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
                MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
                 MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
                 MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
                 MCA sds: slurm (MCA v1.0, API v1.0, Component v1.1)