
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] bug in MPI_ACCUMULATE for window offsets > 2**31 - 1 bytes? openmpi v1.2.5
From: Tim Prins (tprins_at_[hidden])
Date: 2008-02-07 11:42:55


Hi Stefan,

I was able to verify the problem. It turns out this is a problem with other
one-sided operations as well. Attached is a simple test case I wrote in C
using MPI_Put that also fails.

The problem is that the target count and displacement are both sent as
signed 32-bit integers. The receiver then multiplies them together and
adds the result to the window base. However, the multiplication itself is
done in signed 32-bit arithmetic, so it overflows before the result is
added to the 64-bit pointer. This, of course, results in a bad address.
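
As a rough standalone illustration of that overflow (this is not code from
the Open MPI tree; the variable names here are made up):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t target_disp = 300000000;   /* displacement in window units */
    int32_t disp_unit   = 8;           /* e.g. sizeof(double)          */

    /* 300,000,000 * 8 = 2,400,000,000 does not fit in a signed 32-bit
     * integer, so the product overflows (formally undefined behaviour,
     * in practice it wraps) before it is ever widened to 64 bits. */
    int64_t wrong = target_disp * disp_unit;

    /* Widening one operand first keeps the whole product in 64 bits,
     * which is essentially what the attached patch does with its
     * (unsigned long) cast. */
    int64_t right = (int64_t) target_disp * disp_unit;

    printf("32-bit product: %lld\n", (long long) wrong);
    printf("64-bit product: %lld\n", (long long) right);
    return 0;
}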

I have attached a patch against a recent development version that fixes
this for me. I am also copying Brian Barrett, who did all the work on
the onesided code.

Brian: if possible, please take a look at the attached patch and test case.

Thanks for the report!

Tim Prins

Stefan Knecht wrote:
>
> Hi all,
>
> I encounter a problem with the routine MPI_ACCUMULATE when trying to sum up
> MPI_REAL8's on a large memory window with a large offset.
> My program, running on a single processor (x86_64 architecture), crashes
> with an error message like:
>
> [node14:16236] *** Process received signal ***
> [node14:16236] Signal: Segmentation fault (11)
> [node14:16236] Signal code: Address not mapped (1)
> [node14:16236] Failing at address: 0x2aaa32b16000
> [node14:16236] [ 0] /lib64/libpthread.so.0 [0x32e080de00]
> [node14:16236] [ 1]
> /home/stefan/bin/openmpi-1.2.5/lib/libmpi.so.0(ompi_mpi_op_sum_double+0x10)
> [0x2aaaaaf15530]
> [node14:16236] [ 2]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_process_op+0x2d7)
>
> [0x2aaab1a47257]
> [node14:16236] [ 3]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so
> [0x2aaab1a45432]
> [node14:16236] [ 4]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_passive_unlock+0x93)
>
> [0x2aaab1a48243]
> [node14:16236] [ 5]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so
> [0x2aaab1a43436]
> [node14:16236] [ 6]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_progress+0xff)
>
> [0x2aaab1a42e0f]
> [node14:16236] [ 7]
> /home/stefan/bin/openmpi-1.2.5/lib/libopen-pal.so.0(opal_progress+0x4a)
> [0x2aaaab3dfa0a]
> [node14:16236] [ 8]
> /home/stefan/bin/openmpi-1.2.5/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_module_unlock+0x2a9)
>
> [0x2aaab1a48629]
> [node14:16236] [ 9]
> /home/stefan/bin/openmpi-1.2.5/lib/libmpi.so.0(PMPI_Win_unlock+0xe1)
> [0x2aaaaaf4a291]
> [node14:16236] [10]
> /home/stefan/bin/openmpi-1.2.5/lib/libmpi_f77.so.0(mpi_win_unlock_+0x25)
> [0x2aaaaacdd8c5]
> [node14:16236] [11] /home/stefan/calc/mpi2_test/a.out(MAIN__+0x809)
> [0x401851]
> [node14:16236] [12] /home/stefan/calc/mpi2_test/a.out(main+0xe) [0x401bbe]
> [node14:16236] [13] /lib64/libc.so.6(__libc_start_main+0xf4) [0x32dfc1dab4]
> [node14:16236] [14] /home/stefan/calc/mpi2_test/a.out [0x400f99]
> [node14:16236] *** End of error message ***
> mpirun noticed that job rank 0 with PID 16236 on node node14 exited on
> signal 11 (Segmentation fault).
>
>
> The relevant part of my FORTRAN source code reads as:
>
> ~ program accumulate_test
> ~ IMPLICIT REAL*8 (A-H,O-Z)
> ~ include 'mpif.h'
> ~ INTEGER(KIND=MPI_OFFSET_KIND) MX_SIZE_M
> C dummy size parameter
> ~ PARAMETER (MX_SIZE_M = 1 000 000)
> ~ INTEGER MPIerr, MYID, NPROC
> ~ INTEGER ITARGET, MY_X_WIN, JCOUNT, JCOUNT_T
> ~ INTEGER(KIND=MPI_ADDRESS_KIND) MEM_X, MEM_Y
> ~ INTEGER(KIND=MPI_ADDRESS_KIND) IDISPL_WIN
> ~ INTEGER(KIND=MPI_ADDRESS_KIND) PTR1, PTR2
> ~ INTEGER(KIND=MPI_INTEGER_KIND) ISIZE_REAL8
> ~ INTEGER*8 NELEMENT_X, NELEMENT_Y
> ~ POINTER (PTR1, XMAT(MX_SIZE_M))
> ~ POINTER (PTR2, YMAT(MX_SIZE_M))
> C
> ~ CALL MPI_INIT( MPIerr )
> ~ CALL MPI_COMM_RANK( MPI_COMM_WORLD, MYID, MPIerr)
> ~ CALL MPI_COMM_SIZE( MPI_COMM_WORLD, NPROC, MPIerr)
> C
> ~ NELEMENT_X = 400 000 000
> ~ NELEMENT_Y = 10 000
> C
> ~ CALL MPI_TYPE_EXTENT(MPI_REAL8, ISIZE_REAL8, MPIerr)
> ~ MEM_X = NELEMENT_X * ISIZE_REAL8
> ~ MEM_Y = NELEMENT_Y * ISIZE_REAL8
> C
> C allocate memory
> C
> ~ CALL MPI_ALLOC_MEM( MEM_X, MPI_INFO_NULL, PTR1, MPIerr)
> ~ CALL MPI_ALLOC_MEM( MEM_Y, MPI_INFO_NULL, PTR2, MPIerr)
> C
> C fill vectors with 0.0D0 and 1.0D0
> C
> ~ CALL DZERO(XMAT,NELEMENT_X)
> ~ CALL DONE(YMAT,NELEMENT_Y)
> C
> C open memory window
> C
> ~ CALL MPI_WIN_CREATE( XMAT, MEM_X, ISIZE_REAL8,
> ~ & MPI_INFO_NULL, MPI_COMM_WORLD,
> ~ & MY_X_WIN, MPIerr )
> C lock window (MPI_LOCK_SHARED mode)
> C select target ==> if itarget == myid: no 1-sided communication
> C
> ~ ITARGET = MYID
> ~ CALL MPI_WIN_LOCK( MPI_LOCK_SHARED, ITARGET, MPI_MODE_NOCHECK,
> ~ & MY_X_WIN, MPIerr)
> C
> C transfer data to target ITARGET
> C
> ~ JCOUNT_T = 10 000
> ~ JCOUNT = JCOUNT_T
> C set displacement in memory window
> ~ IDISPL_WIN = 300 000 000
> C
> ~ CALL MPI_ACCUMULATE( YMAT, JCOUNT, MPI_REAL8, ITARGET, IDISPL_WIN,
> ~ & JCOUNT_T, MPI_REAL8, MPI_SUM, MY_X_WIN, MPIerr)
> C
> C unlock
> C
> ~ CALL MPI_WIN_UNLOCK( ITARGET, MY_X_WIN, MPIerr)
> ...
>
> The complete source code (accumulate_test.F) is attached to this e-mail,
> as well as the config.log of my Open MPI installation.
>
> The program fails only(!) for values of IDISPL_WIN > 268 435 455. For all
> lower offset values it finishes normally.
>
> Therefore, I assume that an INTEGER*4 overflow occurs during the internal
> multiplication (in MPI_ACCUMULATE) of IDISPL_WIN with the window scaling
> factor ISIZE_REAL8 (== 8 bytes), even though IDISPL_WIN is declared with
> KIND=MPI_ADDRESS_KIND (INTEGER*8). Might that be the reason?
>
> Running the same program with MPI_GET instead of MPI_ACCUMULATE at the
> same offsets works without any problem.
>
> Thanks in advance for any help,
>
> stefan
>
> --
> -------------------------------------------
> Dipl. Chem. Stefan Knecht
> Institute for Theoretical and
> Computational Chemistry
> Heinrich-Heine University Düsseldorf
> Universitätsstraße 1
> Building 26.32 Room 03.33
> 40225 Düsseldorf
>
> phone: +49-(0)211-81-11439
> e-mail: stefan_at_[hidden]
> http://www.theochem.uni-duesseldorf.de/users/stefan
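
(For reference, the cutoff Stefan observes matches a signed 32-bit overflow
of displacement * disp_unit exactly:

    268 435 455 * 8 = 2 147 483 640  <  2**31 - 1 = 2 147 483 647   (fits)
    268 435 456 * 8 = 2 147 483 648  >  2**31 - 1                   (overflows)

so any IDISPL_WIN above 268 435 455 pushes the byte displacement past the
signed 32-bit range.)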


#include <stdlib.h>
#include <mpi.h>

/* 400 million doubles (~3.2 GB) in the window; the put targets an element
 * displacement of 300 million, i.e. a byte displacement of 2.4 GB, which
 * does not fit in a signed 32-bit integer. */
#define X_NELMTS 400000000
#define Y_NELMTS 10000
#define OFFSET 300000000
#define TYPE MPI_DOUBLE

int main(int argc, char **argv) {
    int rank;
    void * x, *y;
    MPI_Win win;
    MPI_Aint extent;
    
    MPI_Init(NULL, NULL);
    MPI_Type_extent(TYPE, &extent);
    x = malloc(X_NELMTS * extent);
    y = malloc(Y_NELMTS * extent);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win_create(x, X_NELMTS, extent, MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_lock(MPI_LOCK_SHARED, rank, MPI_MODE_NOCHECK, win);

    /* Target byte displacement is OFFSET * extent = 2.4 GB, beyond 2**31 - 1 */
    MPI_Put(y, Y_NELMTS, TYPE, rank, OFFSET, Y_NELMTS, TYPE, win);

    MPI_Win_unlock(rank, win);

    MPI_Win_free(&win);

    free(x);
    free(y);

    MPI_Finalize();
    return 0;
}
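
Assuming the test case is saved as put_test.c (the filename is arbitrary),
it can be compiled and run on a single process with something like:

mpicc put_test.c -o put_test
mpirun -np 1 ./put_test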

Index: ompi/mca/osc/rdma/osc_rdma_data_move.c
===================================================================
--- ompi/mca/osc/rdma/osc_rdma_data_move.c (revision 17358)
+++ ompi/mca/osc/rdma/osc_rdma_data_move.c (working copy)
@@ -183,7 +183,7 @@
             descriptor->des_dst_cnt = 1;
             descriptor->des_dst[0].seg_addr.lval =
                 module->m_peer_info[target].peer_base +
-                (sendreq->req_target_disp * module->m_win->w_disp_unit);
+                ((unsigned long)sendreq->req_target_disp * module->m_win->w_disp_unit);
             descriptor->des_dst[0].seg_len =
                 sendreq->req_origin_bytes_packed;
             descriptor->des_dst[0].seg_key.key64 =
@@ -213,7 +213,7 @@
             descriptor->des_src_cnt = 1;
             descriptor->des_src[0].seg_addr.lval =
                 module->m_peer_info[target].peer_base +
-                (sendreq->req_target_disp * module->m_win->w_disp_unit);
+                ((unsigned long)sendreq->req_target_disp * module->m_win->w_disp_unit);
             descriptor->des_src[0].seg_len =
                 sendreq->req_origin_bytes_packed;
             descriptor->des_src[0].seg_key.key64 =
@@ -790,7 +790,7 @@
 {
     int ret = OMPI_SUCCESS;
     void *target = (unsigned char*) module->m_win->w_baseptr +
-        (header->hdr_target_disp * module->m_win->w_disp_unit);
+        ((unsigned long)header->hdr_target_disp * module->m_win->w_disp_unit);
     ompi_proc_t *proc = ompi_comm_peer_lookup( module->m_comm, header->hdr_origin );
     struct ompi_datatype_t *datatype =
         ompi_osc_base_datatype_create(proc, inbuf);
@@ -889,7 +889,7 @@
     ompi_osc_rdma_module_t *module = longreq->req_module;
     unsigned char *target_buffer =
         (unsigned char*) module->m_win->w_baseptr +
-        (header->hdr_target_disp * module->m_win->w_disp_unit);
+        ((unsigned long)header->hdr_target_disp * module->m_win->w_disp_unit);
 
     /* lock the window for accumulates */
     OPAL_THREAD_LOCK(&longreq->req_module->m_acc_lock);
@@ -971,7 +971,7 @@
         unsigned char *target_buffer;
 
         target_buffer = (unsigned char*) module->m_win->w_baseptr +
-            (header->hdr_target_disp * module->m_win->w_disp_unit);
+            ((unsigned long)header->hdr_target_disp * module->m_win->w_disp_unit);
 
         /* lock the window for accumulates */
         OPAL_THREAD_LOCK(&module->m_acc_lock);
Index: ompi/mca/osc/pt2pt/osc_pt2pt_data_move.c
===================================================================
--- ompi/mca/osc/pt2pt/osc_pt2pt_data_move.c (revision 17358)
+++ ompi/mca/osc/pt2pt/osc_pt2pt_data_move.c (working copy)
@@ -522,7 +522,7 @@
 {
     int ret = OMPI_SUCCESS;
     void *target = (unsigned char*) module->p2p_win->w_baseptr +
-        (header->hdr_target_disp * module->p2p_win->w_disp_unit);
+        ((unsigned long)header->hdr_target_disp * module->p2p_win->w_disp_unit);
     ompi_proc_t *proc = ompi_comm_peer_lookup( module->p2p_comm, header->hdr_origin );
     struct ompi_datatype_t *datatype =
         ompi_osc_base_datatype_create(proc, &inbuf);
@@ -605,7 +605,7 @@
     void *payload = (void*) (header + 1);
     int ret;
     void *target = (unsigned char*) module->p2p_win->w_baseptr +
-        (header->hdr_target_disp * module->p2p_win->w_disp_unit);
+        ((unsigned long)header->hdr_target_disp * module->p2p_win->w_disp_unit);
 
     /* lock the window for accumulates */
     OPAL_THREAD_LOCK(&longreq->req_module->p2p_acc_lock);
@@ -677,7 +677,7 @@
     struct ompi_datatype_t *datatype =
         ompi_osc_base_datatype_create(proc, &payload);
     void *target = (unsigned char*) module->p2p_win->w_baseptr +
-        (header->hdr_target_disp * module->p2p_win->w_disp_unit);
+        ((unsigned long)header->hdr_target_disp * module->p2p_win->w_disp_unit);
 
     if (NULL == datatype) {
         opal_output(ompi_osc_base_output,