Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] r27078 and OMPI build
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-08-21 12:31:27


Looks to me like you just need to add a couple of includes and correct a typo - yes?
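
For illustration, here is a self-contained sketch of the malloc fallback path with the three fixes from the quoted report applied in spirit. It is not the actual patch: `ALIGN_UP` is a local stand-in for `OPAL_ALIGN` (assumed to round up to a power-of-two boundary), `struct lmngr_sketch` and `lmngr_alloc_fallback` are hypothetical names, and `<stdint.h>` stands in for `opal_stdint.h`.

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t; stands in for opal_stdint.h */
#include <stdlib.h>

/* Local stand-in for OPAL_ALIGN (assumption): round x up to a multiple of a,
 * where a is a power of two and t is the integer type used for the math. */
#define ALIGN_UP(x, a, t) (((x) + ((t)(a) - 1)) & ~((t)(a) - 1))

/* Hypothetical, trimmed-down version of the list manager fields involved. */
struct lmngr_sketch {
    void  *alloc_base;       /* pointer returned by malloc, kept for free() */
    void  *base_addr;        /* aligned start of the usable region */
    size_t list_size;        /* number of blocks */
    size_t list_block_size;  /* bytes per block */
    size_t list_alignment;   /* required alignment, power of two */
};

/* Fallback for systems without posix_memalign: over-allocate by one
 * alignment unit, then round the pointer up. Returns 0 on success. */
static int lmngr_alloc_fallback(struct lmngr_sketch *l)
{
    size_t a = l->list_alignment;

    /* The mask trick in ALIGN_UP only works for powers of two. */
    assert(a != 0 && (a & (a - 1)) == 0);

    l->alloc_base = malloc(l->list_size * l->list_block_size + a);
    if (NULL == l->alloc_base) {
        return -1;
    }

    /* Note the field name: list_alignment, not list_align. */
    l->base_addr = (void *)ALIGN_UP((uintptr_t)l->alloc_base, a, uintptr_t);
    return 0;
}
```

The sketch keeps the unaligned pointer in a separate field so it can later be passed to free(). Whether the real component needs that is not something this thread addresses.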

The library issue sounds like something isn't right in the Makefile.am - perhaps the syntax has a typo there as well?

On Aug 21, 2012, at 9:08 AM, "Shamis, Pavel" <shamisp_at_[hidden]> wrote:

> Evgeny,
>
> I don't have access to a Solaris system, but please let me know if there is a way I can help you.
>
> Pavel (Pasha) Shamis
> ---
> Computer Science Research Group
> Computer Science and Math Division
> Oak Ridge National Laboratory
>
> On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:
>
> r27078 (ML collective component) broke some Solaris OMPI builds.
>
> 1) In ompi/mca/coll/ml/coll_ml_lmngr.c
> 199 #ifdef HAVE_POSIX_MEMALIGN
> 200     if((errno = posix_memalign(&lmngr->base_addr,
> 201                     lmngr->list_alignment,
> 202                     lmngr->list_size * lmngr->list_block_size)) != 0) {
> 203         ML_ERROR(("Failed to allocate memory: %s [%d]", errno, strerror(errno)));
> 204         return OMPI_ERROR;
> 205     }
> 206 #else
> 207     lmngr->base_addr =
> 208         malloc(lmngr->list_size * lmngr->list_block_size + lmngr->list_alignment);
> 209     if(NULL == lmngr->base_addr) {
> 210         ML_ERROR(("Failed to allocate memory: %s [%d]", errno, strerror(errno)));
> 211         return OMPI_ERROR;
> 212     }
> 213
> 214     lmngr->base_addr = (void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
> 215                     lmngr->list_align, uintptr_t);
> 216 #endif
>
> The "#else" code path has multiple problems, specifically in the
> statement on lines 214-215:
> - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
> - uintptr_t needs to be defined (e.g., #include "opal_stdint.h")
> - list_align should be list_alignment
>
> I could fix, but need help with...
>
> 2) http://www.open-mpi.org/mtt/index.php?do_redir=2089 Somehow,
> coll_ml is getting pulled into libmpi.so. E.g., this doesn't look right:
>
> % nm ompi/.libs/libmpi.so | grep mca_coll_ml
> [13161] | 2556704| 172|FUNC |LOCL |0 |11 |mca_coll_ml_alloc_op_prog_single_frag_dag
> [13171] | 2555488| 344|FUNC |LOCL |0 |11 |mca_coll_ml_buffer_recycling
> [13173] | 2555392| 92|FUNC |LOCL |0 |11 |mca_coll_ml_err
> [23992] | 0| 0|FUNC |GLOB |0 |UNDEF |mca_coll_ml_memsync_intra
>
> The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
> stuff shouldn't be in there at all in the first place. This is on one
> Solaris system, while another doesn't see the problem and builds fine.
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel