Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Replacing poll()
From: Alex Margolin (alex.margolin_at_[hidden])
Date: 2012-03-03 18:18:44

I've figured out that what I really need is to write my own BTL component,
rather than trying to manipulate the existing TCP one. I've started
writing it using the 1.5.5rc3 tarball and some PDFs from 2006 I found on
the website (anything else I can look at? TCP is much more complicated
than what I'm writing). I think I'm getting the hang of it, but I still
have some questions about terminology for the component implementation:

The basic data structures for routing fragments are components, modules,
interfaces and endpoints, right? So, if I have 3 nodes, each with 2
interfaces (each having one constant IP), and I'm running 2 processes
total, I'll have... 1 component, 2 modules, 4 interfaces (2 per module)
and 4 addresses?
What about "links" (as in the "num_of_links" component struct member) - what
does it count?

ompi_modex_send - Is it supposed to share the addresses of all the
running processes before they start? Suppose I assume one NIC per
machine. Can I just send an array of mca_btl_tcp_addr_t, and have every
process find the one belonging to it by some index (its rank?)? I
saw the ompi_modex_recv() call in _proc.c and it seems that every proc
instance reads the entire sent buffer anyway.
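
To make the "array indexed by rank" idea concrete, here is a minimal
sketch that is independent of Open MPI. It does not call ompi_modex_send
or ompi_modex_recv and says nothing about how they actually distribute
data. demo_addr_t, demo_pack_addrs and demo_lookup_addr are hypothetical
names made up for illustration, and demo_addr_t is not mca_btl_tcp_addr_t.
It only shows packing fixed-size address records into one flat buffer and
picking one out by index, with the bounds checks that would be needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-process address record (NOT mca_btl_tcp_addr_t). */
typedef struct {
    uint32_t ipv4;  /* IPv4 address */
    uint16_t port;  /* listening port */
    uint16_t pad;   /* explicit padding keeps the record size fixed */
} demo_addr_t;

/* Copy n records into buf. Returns bytes written, or 0 if buf is too small. */
size_t demo_pack_addrs(const demo_addr_t *addrs, size_t n,
                       void *buf, size_t cap)
{
    size_t need = n * sizeof(demo_addr_t);
    assert(addrs != NULL && buf != NULL);
    if (need > cap) {
        return 0;
    }
    memcpy(buf, addrs, need);
    return need;
}

/* Fetch the record at position index (e.g. a rank) from a received buffer.
 * Returns 0 on success, -1 if the buffer is malformed or index is out of
 * range. */
int demo_lookup_addr(const void *buf, size_t len, size_t index,
                     demo_addr_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len % sizeof(demo_addr_t) != 0) {
        return -1;  /* not a whole number of records */
    }
    if (index >= len / sizeof(demo_addr_t)) {
        return -1;  /* no record for this index */
    }
    memcpy(out, (const char *)buf + index * sizeof(demo_addr_t),
           sizeof(*out));
    return 0;
}
```

A scheme like this only works if every process agrees on the record
layout and on which index belongs to which process, which is exactly the
part I'm unsure about.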

Sorry for flooding you all with questions, I hope I'm not way off here.
I hope I'll finish writing something by the end of next week (I'm
working on this after hours, not full time), with the purpose of
submitting it as a contribution to open-mpi.

Appreciate your help so far,

On 03/02/2012 09:26 PM, Jeffrey Squyres wrote:
> Give your BTL a progress function. It'll get called quite frequently.
> Look at the "progress" section in btl.h. Progress threads don't work yet, but the btl_progress function will get called by the PML quite frequently. It's how BTLs like openib progress their outstanding message passing.
> On Mar 2, 2012, at 2:22 PM, Alex Margolin wrote:
>> On 03/02/2012 04:33 PM, Jeffrey Squyres wrote:
>>> Note that the OMPI 1.4.x series is about to be retired. If you're doing new stuff, I'd advise you to be working with the Open MPI SVN trunk. In the trunk, we've changed how we build libevent, so if you're adding to it, you probably want to be working there for max forward-compatibility.
>>> That being said:
>>>> I know trying to replace poll() seems like I'm doing something very wrong, but I want to poll on events without a valid linux file descriptor (and existing events, specifically sockets, at the same time), and I see no other way. Obviously, my poll2 calls the linux poll in most cases.
>>> What exactly are you trying to do? OMPI has some internal hooks for non-fd-or-event-based progress. Indeed, libevent is typically called with fairly low frequency (e.g., if you're running with OpenFabrics or some other high-speed/not-fd-based networking interconnect).
>> I'm trying to create a new btl module. I've written an adapter from my library to TCP, so I've implemented socket/connect/accept/send/recv... now I've taken the TCP BTL module and cloned it - replacing the relevant calls with mine. My only problem is with poll, which is not in the MCA (at least in 1.4.x).
>> I've implemented poll() and select() but it's not that good, because my events are not based on valid Linux file descriptors, but I can poll all my events at the same time (but not in conjunction with real FDs, unfortunately).
>> Can you give me some pointers as to where to look in the MPI (1.5?) source code to implement it properly?
>> Thanks,
>> Alex
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]