Hi folks,
At SC10 this year an interesting tool was presented in a
student paper, "FlowChecker: Detecting Bugs in MPI Libraries
via Message Flow Checking".
http://sc10.supercomputing.org/schedule/event_detail.php?evid=pap352
Basically they instrument a program, derive "intentions"
from your MPI calls and the MPI standard, and trace the
data flow (including things like memcpy) and messages. Then
offline you run a correlator which compares what was meant
to happen with what actually did and tries to root-cause the
fault.
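To make the idea concrete, here is a toy sketch of the "compare
intended vs. observed" step. It is not taken from the paper or from
FlowChecker itself; the Transfer record and the correlate() function
are made-up names for illustration, and matching on (src, dst, tag)
is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical record of one message: who sent what to whom."""
    src: int        # sending rank
    dst: int        # receiving rank
    tag: int        # message tag
    payload: bytes  # data that should arrive / did arrive


def correlate(intended, observed):
    """Return (kind, transfer) pairs where the two flows disagree.

    Assumes each (src, dst, tag) triple occurs at most once per run.
    """
    problems = []
    seen = {(t.src, t.dst, t.tag): t for t in observed}
    for want in intended:
        got = seen.pop((want.src, want.dst, want.tag), None)
        if got is None:
            problems.append(("lost", want))
        elif got.payload != want.payload:
            problems.append(("corrupted", want))
    # Anything left over was observed but never intended.
    problems.extend(("unexpected", t) for t in seen.values())
    return problems


# Intentions as they might be derived from the MPI calls.
intended = [Transfer(0, 1, 7, b"abc"), Transfer(1, 2, 9, b"xyz")]
# What a trace of the run might have recorded.
observed = [Transfer(0, 1, 7, b"abX")]

report = correlate(intended, observed)
for kind, transfer in report:
    print(kind, transfer)
```

A real correlator would also have to follow the data through
library-internal copies to find the root cause; this only flags that
a mismatch exists.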
They claim to have taken 5 random closed bugs from 3 different
MPI implementations (including 3 from Open MPI) and been able
to detect all 5 and root-cause 4 of them (the one they missed
was a data type issue).
The PDF of their paper is here:
http://www.cse.ohio-state.edu/~chenzhe/sc10-flowchecker.pdf
I've emailed them to see if the code is going to be available,
as it could be quite a handy tool to have when trying to track
down issues like the one Sébastien posted about.
cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/