Thanks a lot. You and Adrian have cleared up a lot of concepts for me, and it's time for me to develop a functional framework. I will get back to you guys when I am done with the framework, or when I run into more problems and/or conceptual issues.

The mca_btl_tcp_addr_t issue was resolved, as you correctly pointed out. I didn't go into the details, but I think I must have corrupted the code somewhere. A fresh tar, configure, and "make all install" did the trick.
 
Best Regards,
Muhammad Atif


----- Original Message ----
From: Jeff Squyres <jsquyres@cisco.com>
To: Open MPI Developers <devel@open-mpi.org>
Sent: Saturday, January 19, 2008 11:54:09 AM
Subject: Re: [OMPI devel] btl tcp port to xensocket

On Jan 17, 2008, at 7:08 PM, Muhammad Atif wrote:

> Thanks again. Nope, at the moment I am doing the lame stuff, i.e. 
> simply changing the tcp code. So I have not created another btl 
> component. I know it's not the recommended thing, but I just wanted to 
> try before committing.

That makes perfect sense.  Ok, so you're not running into a component 
name collision within the modex; that's good.

> Apart from xensocket-specific stuff, all I have done inside the  
> btl/tcp code is change the structure
>
>  struct mca_btl_tcp_addr_t {
>      struct in_addr addr_inet;    /**< IPv4 address in network byte order */
>      in_port_t      addr_port;    /**< listen port */
>      unsigned short addr_inuse;   /**< local meaning only */
>      int            xs_domU_ref;  /**< xs: domU memory reference */
>  };
>
> I wanted this structure to be passed on to all peers through 
> component exchange (modex send/recv).  This way I have the normal 
> socket listen port, its address, and the xensocket memory reference (it's 
> not complete, as it is missing some other info, but let's stick to 
> basic stuff).

Sounds reasonable.

> The second question is regarding btl tcp recv. I have seen a couple 
> of emails with some explanation specific to that particular user, but  
> they do not seem to answer this question (ref to previous email).

> Second question is regarding the receive part of Open MPI. In my
> understanding, once the Recv API is called, control goes through the PML
> layer and everything initializes there. However, I am unable to pin
> down the layer/file/function where the receive socket polling
> is done. There are callbacks, but where or how exactly does Open MPI
> know that a message has in fact arrived? Any pointer will do :)

All file descriptor processing is handled by libevent down in opal. 
libevent is a third party library that we imported into Open MPI (and 
modified a bit) that handles generic fd issues.  For example, we 
register fd's with libevent and tell libevent that we want callbacks 
when the fd is ready for reading or writing (depending on the context).

libevent's event loop is invoked by opal_progress(), which is called 
in lots of places.  Hence, the tcp btl can be called back whenever 
opal_progress() is invoked, because opal_progress() will invoke 
libevent, and if any socket fd's that the tcp btl registered are 
ready for reading, or if there are pending writes on some 
socket fd's and those fd's are ready for writing, their callbacks will  
be invoked.
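
To make the register-then-progress pattern concrete, here is a minimal
sketch that uses plain POSIX poll() in place of libevent.  Every toy_*
name is hypothetical and exists only for illustration; this is not the
Open MPI or libevent API, just the same idea in miniature: register an
fd with a callback, then let a non-blocking progress call fire the
callback once the fd is ready.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-ins for the registration the tcp btl does through
 * libevent; none of these names exist in Open MPI or libevent. */
#define TOY_MAX_EVENTS 8

typedef void (*toy_cb_fn)(int fd, void *ctx);

struct toy_event {
    int       fd;      /* descriptor being watched */
    short     events;  /* POLLIN or POLLOUT */
    toy_cb_fn cb;      /* invoked when the fd is ready */
    void     *ctx;     /* opaque pointer handed back to cb */
};

static struct toy_event toy_events[TOY_MAX_EVENTS];
static int toy_nevents = 0;

/* Register interest in an fd.  Returns 0 on success, -1 if the table is full. */
static int toy_event_add(int fd, short events, toy_cb_fn cb, void *ctx)
{
    if (toy_nevents >= TOY_MAX_EVENTS) {
        return -1;
    }
    toy_events[toy_nevents].fd = fd;
    toy_events[toy_nevents].events = events;
    toy_events[toy_nevents].cb = cb;
    toy_events[toy_nevents].ctx = ctx;
    toy_nevents++;
    return 0;
}

/* One non-blocking pass over all registered fds, in the spirit of
 * opal_progress().  Returns the number of callbacks fired. */
static int toy_progress(void)
{
    struct pollfd pfds[TOY_MAX_EVENTS];
    int i, fired = 0;

    for (i = 0; i < toy_nevents; i++) {
        pfds[i].fd = toy_events[i].fd;
        pfds[i].events = toy_events[i].events;
        pfds[i].revents = 0;
    }
    /* Timeout 0: never block, just report what is ready right now. */
    if (poll(pfds, (nfds_t)toy_nevents, 0) <= 0) {
        return 0;
    }
    for (i = 0; i < toy_nevents; i++) {
        if (pfds[i].revents & toy_events[i].events) {
            toy_events[i].cb(toy_events[i].fd, toy_events[i].ctx);
            fired++;
        }
    }
    return fired;
}

/* Example read callback: copy up to 63 available bytes into the
 * caller's buffer (ctx must point to at least 64 bytes). */
static void toy_recv_cb(int fd, void *ctx)
{
    char *buf = ctx;
    ssize_t n = read(fd, buf, 63);
    buf[n > 0 ? n : 0] = '\0';
}
```

With a pipe standing in for a socket, toy_event_add(fds[0], POLLIN,
toy_recv_cb, buf) followed by toy_progress() fires nothing until the
other end has been written to; the next toy_progress() call then runs
the callback.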

Make sense?

> PS: I would love if you do some explanation of modex recv as well. ;)
> Thanks for all the support you guys are giving.

I think Adrian was referring to how the modex works.  Remember that 
the modex send is just a local memcpy; all the modex data is then 
glommed up into a single network send communication later.  After 
that, each process gets a big network message with *everyone's* modex data, which  
is then split up and categorized by component and sender.  The modex 
receive is then another memcpy.
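
The flow above can be sketched in a single process.  This is only a toy
model under stated assumptions: the toy_* names are hypothetical, the
"network" is a plain array copy, and the real modex additionally keys
the data by component, which is left out here.

```c
#include <stddef.h>
#include <string.h>

#define TOY_NPEERS   3
#define TOY_BLOB_MAX 64

/* Hypothetical per-peer blob; not an Open MPI type. */
struct toy_blob {
    size_t        len;
    unsigned char data[TOY_BLOB_MAX];
};

static struct toy_blob toy_staged[TOY_NPEERS];             /* filled by "send" */
static struct toy_blob toy_table[TOY_NPEERS][TOY_NPEERS];  /* [receiver][sender] */

/* "Send": purely local, just a memcpy into the staging area.
 * Returns 0 on success, -1 if the blob is too large. */
static int toy_modex_send(int me, const void *buf, size_t len)
{
    if (len > TOY_BLOB_MAX) {
        return -1;
    }
    memcpy(toy_staged[me].data, buf, len);
    toy_staged[me].len = len;
    return 0;
}

/* Stand-in for the one big exchange: afterwards every receiver holds
 * every sender's blob, indexed by sender. */
static void toy_modex_exchange(void)
{
    int r, s;
    for (r = 0; r < TOY_NPEERS; r++) {
        for (s = 0; s < TOY_NPEERS; s++) {
            toy_table[r][s] = toy_staged[s];
        }
    }
}

/* "Receive": another memcpy, out of the local table.  Copies at most
 * max bytes and returns the size the sender actually published, so the
 * caller can compare it with the size it expects. */
static size_t toy_modex_recv(int me, int peer, void *buf, size_t max)
{
    const struct toy_blob *b = &toy_table[me][peer];
    size_t n = b->len < max ? b->len : max;
    memcpy(buf, b->data, n);
    return b->len;
}
```

Note that nothing reaches a peer until the exchange step has run, and
that the receiver learns the published size as a plain byte count.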

So as to why you're still getting sizeof(mca_btl_tcp_addr_t)==8 in the  
tcp modex receiver, the only thing I can think of is that you somehow 
didn't recompile properly.  Did you try making clean in the tcp btl 
dir and then a "make all" to ensure that everything recompiled 
properly with your modified struct in btl_tcp_addr.h?  Normally, the 
build system should take care of such dependencies, but...
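
As a rough illustration of why a stale object file shows up as a size
mismatch, the sketch below compares the two layouts.  The toy_* names
are hypothetical, the "old" layout is assumed from the discussion
(the quoted struct minus the new field), and toy_size_ok() is an
invented check, not a function from the tcp btl.  On common platforms
the old layout is typically 8 bytes and the new one 12, but that
depends on the ABI.

```c
#include <netinet/in.h>
#include <stddef.h>

/* Assumed layout before the change (the quoted struct minus the new field). */
struct toy_addr_old {
    struct in_addr addr_inet;   /* IPv4 address in network byte order */
    in_port_t      addr_port;   /* listen port */
    unsigned short addr_inuse;  /* local meaning only */
};

/* Layout with the extra xensocket field from the quoted struct. */
struct toy_addr_new {
    struct in_addr addr_inet;
    in_port_t      addr_port;
    unsigned short addr_inuse;
    int            xs_domU_ref; /* xs: domU memory reference */
};

/* Hypothetical sanity check: returns 1 if a received blob is a whole
 * number of entries of the expected size, 0 otherwise. */
static int toy_size_ok(size_t received, size_t entry_size)
{
    return received != 0 && received % entry_size == 0;
}
```

If the sender was built with the new layout but the receiving code was
compiled against the old header (or the other way around), a check
like toy_size_ok(received, sizeof(struct toy_addr_new)) fails, which
is one way such a stale build can surface.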

--
Jeff Squyres
Cisco Systems

_______________________________________________
devel mailing list
devel@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


