There are tools that let you see the "message queues" of a process;
this might help you identify why the messages you are waiting on never
complete. One such tool is linked in my signature; you could also look
into TotalView or DDT.
Since you are seeing random hangs and crashes, I would also suggest
running your code under Valgrind.
On Sun, 2009-09-27 at 02:05 +0800, guosong wrote:
> Yes, I know there must be a bug, but I do not know where or why.
> The strange thing is that sometimes it works, but at other times there
> is a segmentation fault. When it does not work, some process must be
> sitting there waiting for a message. There are many iterations in my
> program (it uses a loop). The "bug" appears after a few iterations,
> which means the communication worked for the first few iterations.
> I am quite confused now.
Ashley Pittman, Bath, UK.
Padb - An open source job inspection tool for parallel computing