Thanks a lot Jeff.

PIN is a dynamic binary instrumentation tool from Intel. It runs on top of the binary on the MPI node. When it is given function calls to instrument, it inserts hooks before/after those calls in the binary of the program being instrumented, and you can insert your own functions there.

I am doing some memory address profiling on benchmarks running on MPI, and I am using PIN to get the load/store addresses. I also need to know which loads/stores come from actual socket communication and which come from shared memory optimizations. So I need to know which functions make that decision, and where, so that I can instrument the appropriate MPI library function (the actual low-level function, not API calls like MPI_Send/MPI_Recv) in PIN. Hence I guess I am actually zooming down to a 1000 ft view :)
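
As a rough sketch of that classification step: assume the instrumentation writes a flat event trace of function enter/exit events and load/store addresses. The trace format, the function names `sm_send_fn` and `tcp_send_fn`, and the labels are all hypothetical placeholders, not anything defined by PIN or Open MPI. Each access is tagged with the innermost instrumented function that was active when it happened.

```python
# Hypothetical trace format: (kind, value) tuples.
#   ("enter", name) / ("exit", name) mark instrumented function boundaries
#   ("load", addr)  / ("store", addr) are the memory accesses to classify
# The function names below are placeholders, not real Open MPI symbols.
LABELS = {
    "sm_send_fn": "shared_memory",
    "tcp_send_fn": "socket",
}


def classify_accesses(trace, labels):
    """Tag each load/store with the label of the innermost labelled
    function on the call stack, or "other" if none is active."""
    stack = []
    tagged = []
    for kind, value in trace:
        if kind == "enter":
            stack.append(value)
        elif kind == "exit":
            # Only pop when the exit matches the innermost entry.
            if stack and stack[-1] == value:
                stack.pop()
        elif kind in ("load", "store"):
            label = "other"
            for fn in reversed(stack):
                if fn in labels:
                    label = labels[fn]
                    break
            tagged.append((kind, value, label))
    return tagged


trace = [
    ("load", 0x10),
    ("enter", "sm_send_fn"),
    ("store", 0x20),
    ("exit", "sm_send_fn"),
    ("enter", "tcp_send_fn"),
    ("load", 0x30),
    ("exit", "tcp_send_fn"),
]
for entry in classify_accesses(trace, LABELS):
    print(entry)
```

The real symbol names to put in the label table are exactly what still has to be found in the library source.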

Any suggestions are welcome. I will go into the ob1 directory and hunt around to see exactly how it's being done.

Regards,
Shamik

On Tue, Nov 22, 2011 at 9:04 AM, Jeff Squyres <jsquyres@cisco.com> wrote:
On Nov 22, 2011, at 1:09 AM, Shamik Ganguly wrote:

> I want to trace when the MPI library prevents an MPI_Send from going to the socket and makes it access shared memory because the target node is on the same chip (CMP). I want to use PIN to trace this. Can you please give me some pointers about which functions make this decision so that I can instrument the appropriate library calls in PIN?

What's PIN?

The decision is made in the ob1 PML plugin.  Way back during MPI_INIT, each MPI process creates lists of BTLs to use to contact each MPI process peer.

When a process is on the same *node* (e.g., a single server) -- not just the same processor socket/chip -- the shared memory BTL is given preference over all other BTLs by use of a priority mechanism.  Hence, the "sm" BTL is put at the front of the BML lists (BML = BTL multiplexing layer -- it's essentially just list management for BTLs).

Later, when MPI_SEND comes through, it uses the already-setup BML lists to determine which BTL(s) to use to send a message.

That's the 50,000 foot view.
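
A toy model of that flow, for illustration only: this is not Open MPI code, and the `Btl` class, the priority numbers, and the helper names are assumptions made up for the sketch. Per-peer lists are built once (as at MPI_INIT), sorted by priority, with "sm" only eligible for peers on the same node; a send then just takes the front of the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Btl:
    name: str
    priority: int  # illustrative value only; higher is preferred


def build_bml_lists(my_node, peers, btls):
    """Build one priority-ordered BTL list per peer (done once, at init).

    peers maps peer rank -> node name. The "sm" BTL is only usable
    when the peer lives on the same node as this process.
    """
    lists = {}
    for rank, node in peers.items():
        usable = [b for b in btls if b.name != "sm" or node == my_node]
        lists[rank] = sorted(usable, key=lambda b: b.priority, reverse=True)
    return lists


def pick_btl(bml_lists, rank):
    """At send time, use the already-built list: front entry wins."""
    return bml_lists[rank][0].name


btls = [Btl("tcp", 10), Btl("sm", 100)]
peers = {1: "nodeA", 2: "nodeB"}
bml_lists = build_bml_lists("nodeA", peers, btls)

print(pick_btl(bml_lists, 1))  # peer on the same node
print(pick_btl(bml_lists, 2))  # peer on a different node
```

In this model no decision is made per message; the choice falls out of how the lists were ordered at startup.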

--
Jeff Squyres
jsquyres@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Shamik Ganguly
2nd year, MS (CSE-Hardware), University of Michigan, Ann Arbor
B.Tech.(E&ECE), IITKGP (2008)