Open MPI Development Mailing List Archives

From: pooja_at_[hidden]
Date: 2007-04-03 16:57:53


Hi,
I need to find out when the underlying network is free. That means I don't
need to go into the details of how MPI_Send is implemented.

What I want to know is when MPI_Send starts, or rather when MPI is not
using the underlying network.

I need to find the timing for:
1) When the application issues the send command
2) When MPI actually issues the send command
3) When the BTL performs the actual transfer (send)
4) When the send completes
5) Who the receiver was
etc. This was an example for MPI_Send.
I need the same information for MPI_Isend, broadcast, etc.

I guess this can be done using PMPI.
But PMPI can do it during the profiling stage, while I want all this data at
run time, so that I can improve the performance of the system by using that
idle time.
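
For reference, a PMPI wrapper is linked into the application and runs inside it, so whatever it records is available while the job is running, not only afterwards. Below is a minimal sketch in C, not a definitive implementation. The names trace_event, trace_record, trace_last, trace_now and the WITH_MPI macro are made up for this example; only MPI_Send, PMPI_Send and PMPI_Type_size are real MPI calls. It covers items 1, 4 and 5 of the list above as seen from the application. Items 2 and 3 happen inside Open MPI, are not visible through PMPI, and would need instrumentation in the library itself.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

#ifdef WITH_MPI          /* hypothetical macro: build with mpicc -DWITH_MPI */
#include <mpi.h>
#endif

/* One record per intercepted call (hypothetical layout). */
typedef struct {
    const char *call;    /* name of the MPI call, e.g. "MPI_Send" */
    int peer;            /* destination rank (who the receiver was) */
    size_t bytes;        /* payload size in bytes */
    double t_start;      /* seconds, when the application issued the call */
    double t_end;        /* seconds, when the call returned */
} trace_event;

#define TRACE_CAPACITY 1024
static trace_event trace_log[TRACE_CAPACITY];
static size_t trace_count = 0;      /* total events recorded so far */

/* Monotonic timestamp in seconds. */
double trace_now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Append one event to an in-memory ring buffer; the oldest entry is
 * overwritten once the buffer is full. Not thread-safe. */
void trace_record(const char *call, int peer, size_t bytes,
                  double t_start, double t_end) {
    assert(t_end >= t_start);
    trace_event *e = &trace_log[trace_count % TRACE_CAPACITY];
    e->call = call;
    e->peer = peer;
    e->bytes = bytes;
    e->t_start = t_start;
    e->t_end = t_end;
    trace_count++;
}

/* Most recent event, or NULL if nothing has been recorded yet.
 * The application can call this at any point during the run. */
const trace_event *trace_last(void) {
    if (trace_count == 0) return NULL;
    return &trace_log[(trace_count - 1) % TRACE_CAPACITY];
}

#ifdef WITH_MPI
/* Interposed MPI_Send: log, forward to the real implementation, log again.
 * The const qualifier on buf follows the MPI-3 prototype; older MPI-2
 * headers declare it as plain void *. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm) {
    int type_size = 0;
    PMPI_Type_size(datatype, &type_size);
    double t0 = trace_now();
    int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
    double t1 = trace_now();
    trace_record("MPI_Send", dest, (size_t)count * (size_t)type_size, t0, t1);
    return rc;
}
#endif
```

The same pattern would apply to MPI_Isend, MPI_Bcast and the other calls. Keeping the log in memory instead of printing from the wrapper is a deliberate choice here, to limit how much the tracing itself perturbs the timing.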

Well, the I/O used is Lustre (it's ROMIO).
What I mean by I/O nodes is the nodes that do input and output processing,
i.e. they write to Lustre, and the compute nodes just transfer data to the
I/O nodes to be written to Lustre. The compute nodes do not have memory at
all, so whenever they have something to write it gets transferred to an I/O
node, and then the I/O node does the read and write.

So when MPI_Send is not issued, the network (InfiniBand interconnect)
can be used for some other transfer.
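
One rough way to approximate "no send is issued right now" from the application side is to count the sends that have started but not yet completed. Below is a minimal sketch in C under that assumption; send_begin, send_end, network_looks_idle and idle_duration are hypothetical names, not part of MPI or Open MPI. The idea is that a PMPI wrapper for MPI_Isend would call send_begin(), and the wrappers for MPI_Wait or MPI_Test would call send_end() when a send request completes, passing a timestamp such as the value of MPI_Wtime(). This is only a hint: a returning MPI_Send means the buffer can be reused, not necessarily that the data has left the network, and the counter sees nothing of traffic from other processes or from Lustre.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of sends started but not yet completed, as seen by this process.
 * Not thread-safe; a threaded application would need atomics or a lock. */
static int sends_in_flight = 0;
static double idle_since = 0.0;   /* time the last outstanding send finished */

/* Call when a send is issued (e.g. from an MPI_Isend wrapper). */
void send_begin(void) {
    sends_in_flight++;
}

/* Call when a send completes (e.g. from an MPI_Wait wrapper).
 * 'now' is a timestamp in seconds supplied by the caller. */
void send_end(double now) {
    assert(sends_in_flight > 0);
    sends_in_flight--;
    if (sends_in_flight == 0) {
        idle_since = now;
    }
}

/* True when this process has no send outstanding at the MPI API level. */
bool network_looks_idle(void) {
    return sends_in_flight == 0;
}

/* Seconds since the last outstanding send completed, or 0.0 while busy. */
double idle_duration(double now) {
    return network_looks_idle() ? now - idle_since : 0.0;
}
```

Other code in the same process could poll network_looks_idle() before starting a background transfer and back off as soon as it returns false.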

Can anyone help me with how to go about tracing this at run time?

Please help,
Pooja

> On Apr 3, 2007, at 9:07 AM, pooja_at_[hidden] wrote:
>
>> Actually I am working on a course project in which I am running a
>> huge, computationally intensive code.
>> I am running this code on a cluster.
>> Now my work is to find out when the processes send control messages
>> (e.g. compute process to I/O process indicating I/O data is ready)
>
> By "I/O", do you mean stdin/stdout/stderr, or other file I/O?
>
> If you mean stdin/stdout/stderr, this is handled by the IOF (I/O
> Forwarding) framework/components in Open MPI. It's somewhat
> complicated, system-level code involving logically multiplexing data
> sent across pipes to sockets (i.e., local process(es) to remote
> process(es)).
>
> If you mean MPI-2 file I/O, you want to look at the ROMIO package; it
> handles all the MPI-2 API for I/O.
>
> Or do you mean "I/O" such as normal MPI messages (such as those
> generated by MPI_SEND and MPI_RECV)? FWIW, we normally refer to
> these as MPI messages, not really "I/O" (we typically reserve the
> term "I/O" for file IO and/or stdin/stdout/stderr).
>
> Which do you mean?
>
>> and when they send actual data (e.g. I/O nodes fetching the actual
>> data that is to be transferred).
>
> This seems to imply that you're talking about parallel/network
> filesystems. I have to admit that I'm now quite confused about what
> you're asking for. :-)
>
>> And I have to log the timing and duration in another file.
>
> If you need to log the timing and duration of MPI calls, this is
> pretty easy to do with the PMPI interface -- you can intercept all
> MPI calls, log whatever information you want to log, invoke the
> underlying MPI function to do the real work, and then log the duration.
>
>> For this I need to know the states of Open MPI (control messages),
>> so that I can simply put print statements in the Open MPI code and
>> find out how it works.
>
> I would [strongly] advise using a debugger. Printf statements will
> only take you so far, and can be quite confusing in a parallel
> scenario -- especially when they can alter the timing of the system
> (i.e., Heisenberg kinds of effects).
>
>> For this reason I was asking about the state changes, or at least
>> the way to find them out.
>
> I'm still not clear on what state changes you're asking about.
>
> From this e-mail and your prior e-mails, it *seems* like you're
> asking about how data gets from MPI_SEND in one process to MPI_RECV
> in another process. Is that right?
>
> If so, I would not characterize the code that does this as a state
> machine in the traditional sense. Sure, as a computer program, it
> technically *is* a state machine that changes states according to
> assembly instructions, registers, etc., but we did not use generic
> state machine abstractions throughout the code base. In many places,
> there's simply a linear sequence of events -- not a re-entrant state
> machine.
>
> So if you're asking how a user message gets from MPI_SEND in one
> process to MPI_RECV in another, we can describe that (it's a very
> complicated answer that depends on many factors, actually -- it is
> *not* a straightforward answer, not only because OMPI deals with many
> device/network types, but also because there can be many variables
> decided at run time that determine how a message is sent from a
> process to a peer).
>
> So before we go any further -- can you, as precisely as possible,
> describe exactly what information you're looking for?
>
>> Also my professor asked me to look into the BTL transport layer to be
>> used with the MPI API.
>
> I described that in a prior e-mail.
>
> --
> Jeff Squyres
> Cisco Systems