Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: r26255 has made openib unusable on Solaris platforms
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-04-13 13:34:47


I don't know about "drama", but people did clearly explain to you why this approach was unacceptable. You simply cannot cross-link at the component level. If you need something from the opal/mca/memory framework, you have to get it from the framework level.

Doesn't seem that hard a concept to grasp and follow - failing to do so breaks things for a bunch of people, which is why we don't allow it. So I hope your "configure" approach also takes this into account, or we'll have to revert it again :-(
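
The rule above (go through the framework, never link against a specific component) can be sketched roughly as follows. This is only an illustration under stated assumptions: every `example_*` name is hypothetical and is not the real opal/mca/memory interface. The point is that the caller only ever touches a framework-level entry point, which degrades gracefully when the selected component lacks the feature.

```c
#include <stddef.h>

/* Hypothetical component descriptor; NOT the actual opal/mca/memory API. */
typedef struct {
    const char *name;
    /* NULL when the component cannot adjust malloc alignment. */
    int (*set_alignment)(size_t alignment);
} example_memory_component_t;

/* Whichever component the framework selected at startup (none yet). */
static const example_memory_component_t *example_selected = NULL;

void example_memory_select(const example_memory_component_t *component)
{
    example_selected = component;
}

/* Framework-level entry point: the only symbol a BTL would reference. */
int example_memory_set_alignment(size_t alignment)
{
    if (NULL == example_selected || NULL == example_selected->set_alignment) {
        return -1; /* unsupported here; the caller simply carries on */
    }
    return example_selected->set_alignment(alignment);
}

/* Stand-in for a component that supports the feature. */
static size_t example_linux_alignment = 0;

static int example_linux_set_alignment(size_t alignment)
{
    example_linux_alignment = alignment;
    return 0;
}

const example_memory_component_t example_linux_component = {
    "linux", example_linux_set_alignment
};

/* Stand-in for a component that does not support it. */
const example_memory_component_t example_solaris_component = {
    "solaris", NULL
};
```

With this shape, a build that lacks the Linux component still links, because the caller never names a component symbol directly.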

On Apr 13, 2012, at 11:13 AM, Mike Dubman wrote:

> Too much drama. We will fix it to detect hook availability at the configure stage, which will bring your life back to normal.
>
> The problem is not the Mellanox hardware but the Intel PCI bus implementation, which incurs extra latency if buffers are not aligned.
> The patch is a workaround for this problem and helps non-benchmark code as well.
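
For readers unfamiliar with the alignment question, a minimal sketch of what "aligned" means for a buffer address may help. The helper names are hypothetical and the code assumes the alignment is a power of two (such as 32 or 64); it is not taken from the patch under discussion.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero when ptr sits on an alignment-byte boundary.
 * Assumes alignment is a power of two. */
int example_is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t) ptr & ((uintptr_t) alignment - 1)) == 0;
}

/* Round addr up to the next alignment-byte boundary.
 * Assumes alignment is a power of two. */
uintptr_t example_align_up(uintptr_t addr, size_t alignment)
{
    uintptr_t mask = (uintptr_t) alignment - 1;
    return (addr + mask) & ~mask;
}
```

Note that aligning the start of an allocation says nothing about a send that begins at an offset into it: an address 8 bytes past a 64-byte boundary fails `example_is_aligned(ptr, 64)`.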
>
> On Fri, Apr 13, 2012 at 7:06 PM, Barrett, Brian W <bwbarre_at_[hidden]> wrote:
> r26255 is awful as a patch. It doesn't work on any non-Linux platform,
> which is unpleasant. But worse, what does it possibly accomplish? In
> codes other than benchmarks, there's no advantage to aligning the pointer
> to 32 or 64 byte boundaries, as the malloced buffer very rarely is exactly
> what is sent. So you've done a whole lot of work, screwed with the memory
> allocator (which always bites OMPI in the butt), and accomplished nothing
> useful. Mellanox should fix the hardware, not make everyone's life
> miserable with crappy workarounds.
>
> MEMORY_LINUX_PTMALLOC2 is the wrong define for what they want. They
> should check for __malloc_hook and only use that code if __malloc_hook is
> found.
>
> Brian
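
The suggestion to use the hook code only when `__malloc_hook` is actually found could look roughly like this. It is a sketch under assumptions: `EXAMPLE_HAVE_MALLOC_HOOK` is a hypothetical macro that a configure-time probe would define to 1, and the hook installation itself is deliberately left out.

```c
/* Assumption: configure defines EXAMPLE_HAVE_MALLOC_HOOK to 1 only when its
 * probe found __malloc_hook. The macro name is hypothetical. Default to 0 so
 * platforms without the hook still compile and link. */
#ifndef EXAMPLE_HAVE_MALLOC_HOOK
#define EXAMPLE_HAVE_MALLOC_HOOK 0
#endif

/* Returns 1 if hook-based aligned allocation was enabled, 0 otherwise. */
int example_enable_aligned_malloc(void)
{
#if EXAMPLE_HAVE_MALLOC_HOOK
    /* The code that saves and replaces __malloc_hook would live here. */
    return 1;
#else
    /* No hook on this platform: leave the allocator untouched. */
    return 0;
#endif
}

/* Human-readable summary, e.g. for a verbose startup message. */
const char *example_malloc_hook_status(void)
{
    return EXAMPLE_HAVE_MALLOC_HOOK ? "malloc hooks available"
                                    : "malloc hooks not available";
}
```

Keying on the presence of the symbol itself, rather than on which allocator component was built, keeps the guard correct on any platform where the probe fails.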
>
> On 4/13/12 9:32 AM, "TERRY DONTJE" <terry.dontje_at_[hidden]> wrote:
>
> >
> > I am thinking MEMORY_LINUX_PTMALLOC2 is probably the right define to
> > key off of, but ifdef'ing out the lines that access the Linux memory
> > module is really going to look gross. One other idea I have is to
> > create a dummy __malloc_hook in the Solaris memory module, but might
> > there be other OSes that could run into the same problem? Or what
> > happens if PTMALLOC2 is not used (does that happen)?
> >
> > --td
> >
> > On 4/13/2012 10:45 AM, TERRY DONTJE wrote:
> >
> > r26255 is forcing the use of __malloc_hook, which is implemented in
> > opal/mca/memory/linux. However, that is not compiled into the library
> > when built on Solaris, thus causing a "referenced symbol not found"
> > error when libmpi tries to load the openib btl.
> >
> > I am looking at how to fix this now, but if someone has a good idea
> > how to detect when __malloc_hook is used (or not), I'd be
> > interested in hearing it.
> >
> > --
> > Terry D. Dontje | Principal Software Engineer
> > Developer Tools Engineering | +1.781.442.2631
> > Oracle - Performance Technologies
> > 95 Network Drive, Burlington, MA 01803
> > Email terry.dontje_at_[hidden]
> >
> >_______________________________________________
> >devel mailing list
> >devel_at_[hidden]
> >http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories