
Subject: Re: [OMPI users] mpirun error in OpenMPI 1.5
From: Nguyen Toan (nguyentoan1508_at_[hidden])
Date: 2010-12-08 15:05:21


Dear Ralph,

Thank you for your reply. I checked the LD_LIBRARY_PATH, recompiled with
the new version, and it worked perfectly.
Thank you again.

Best Regards,
Toan

On Thu, Dec 9, 2010 at 12:30 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> That could mean you didn't recompile the code using the new version of
> OMPI. The 1.4 and 1.5 series are not binary compatible - you have to
> recompile your code.
>
> If you did recompile, you may be getting version confusion on the backend
> nodes - you should check your LD_LIBRARY_PATH and ensure it is pointing to
> the 1.5 series install.
>
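A minimal sketch of the kind of check Ralph describes, assuming passwordless ssh to the backend nodes; the hostnames below are placeholders (rc002 appears in the error output further down) and the install prefix matches the configure line in the original message:

  # Confirm which Open MPI install each backend node actually picks up.
  for host in rc001 rc002; do
      echo "== $host =="
      # A non-interactive ssh shell may not source ~/.bashrc, so what
      # prints here is close to what mpirun's remote daemons will see.
      ssh $host 'echo $LD_LIBRARY_PATH; which mpirun; ompi_info | grep "Open MPI:"'
  done

Every node should report the 1.5 tree under /home/nguyen/opt/openmpi-1.5, not a leftover 1.4 install.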
> On Dec 8, 2010, at 8:02 AM, Nguyen Toan wrote:
>
> > Dear all,
> >
> > I am having a problem running mpirun with OpenMPI 1.5. I compiled
> OpenMPI 1.5 with BLCR 0.8.2 and OFED 1.4.1 as follows:
> >
> > ./configure \
> > --with-ft=cr \
> > --enable-mpi-threads \
> > --with-blcr=/home/nguyen/opt/blcr \
> > --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
> > --prefix=/home/nguyen/opt/openmpi-1.5 \
> > --with-openib \
> > --enable-mpirun-prefix-by-default
> >
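Once configure and make install finish, the resulting install can be sanity-checked with ompi_info. This is a minimal sketch and the exact MCA lines vary by version, but a build made with the flags above would normally show a BLCR checkpoint component and the openib BTL:

  PREFIX=/home/nguyen/opt/openmpi-1.5
  $PREFIX/bin/ompi_info | grep "Open MPI:"   # reported version, should say 1.5
  $PREFIX/bin/ompi_info | grep -i blcr       # checkpoint/restart (crs) component
  $PREFIX/bin/ompi_info | grep -i openib     # InfiniBand byte-transfer layer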
> > mpirun ran the programs under the "openmpi-1.5/examples" folder
> successfully, but it aborted immediately when running an MPI CUDA program
> that had been tested successfully with OpenMPI 1.4.3. Below is the error
> message.
> >
> > Can anyone give me an idea about this error?
> > Thank you.
> >
> > Best Regards,
> > Toan
> > ----------------------
> >
> >
> > [rc002.local:17727] [[56831,1],1] ORTE_ERROR_LOG: Data unpack would read
> past end of buffer in file util/nidmap.c at line 371
> >
> --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_ess_base_build_nidmap failed
> > --> Returned value Data unpack would read past end of buffer (-26)
> instead of ORTE_SUCCESS
> >
> --------------------------------------------------------------------------
> > [rc002.local:17727] [[56831,1],1] ORTE_ERROR_LOG: Data unpack would read
> past end of buffer in file base/ess_base_nidmap.c at line 62
> > [rc002.local:17727] [[56831,1],1] ORTE_ERROR_LOG: Data unpack would read
> past end of buffer in file ess_env_module.c at line 173
> >
> --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_ess_set_name failed
> > --> Returned value Data unpack would read past end of buffer (-26)
> instead of ORTE_SUCCESS
> >
> --------------------------------------------------------------------------
> > [rc002.local:17727] [[56831,1],1] ORTE_ERROR_LOG: Data unpack would read
> past end of buffer in file runtime/orte_init.c at line 132
> >
> --------------------------------------------------------------------------
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or
> environment
> > problems. This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> > ompi_mpi_init: orte_init failed
> > --> Returned "Data unpack would read past end of buffer" (-26) instead
> of "Success" (0)
> >
> --------------------------------------------------------------------------
> > *** An error occurred in MPI_Init
> > *** before MPI was initialized
> > *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> > [rc002.local:17727] Abort before MPI_INIT completed successfully; not
> able to guarantee that all other processes were killed!
> >
> --------------------------------------------------------------------------
> > mpirun has exited due to process rank 1 with PID 17727 on
> > node rc002 exiting improperly. There are two reasons this could occur:
> >
> > 1. this process did not call "init" before exiting, but others in
> > the job did. This can cause a job to hang indefinitely while it waits
> > for all processes to call "init". By rule, if one process calls "init",
> > then ALL processes must call "init" prior to termination.
> >
> > 2. this process called "init", but exited without calling "finalize".
> > By rule, all processes that call "init" MUST call "finalize" prior to
> > exiting or it will be considered an "abnormal termination"
> >
> > This may have caused other processes in the application to be
> > terminated by signals sent by mpirun (as reported here).
> >
> --------------------------------------------------------------------------
> >
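As Ralph's reply advises and Toan's reply at the top of this message confirms, the error went away after rebuilding the application against the 1.5 install. A minimal sketch of such a rebuild, assuming a Makefile-based build; the hostnames and the ./app name are placeholders:

  # Put the 1.5 wrappers and libraries first in the environment.
  export PATH=/home/nguyen/opt/openmpi-1.5/bin:$PATH
  export LD_LIBRARY_PATH=/home/nguyen/opt/openmpi-1.5/lib:$LD_LIBRARY_PATH
  which mpicc        # should resolve inside the 1.5 prefix
  mpicc --showme     # prints the underlying compile/link command
  make clean && make # rebuild the CUDA/MPI application from scratch
  # Launch with the matching mpirun; --prefix points the remote daemons
  # at the same install (also enabled by --enable-mpirun-prefix-by-default).
  mpirun --prefix /home/nguyen/opt/openmpi-1.5 -np 2 --host rc001,rc002 ./app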