Thanks for your suggestions. I had already checked which threads reach the Finalize() call, and all of them do. The Finalize() call is also not inside a conditional. That seems to suggest a prior communication was left unfinished, but based on the documentation I have read, I would expect Finalize() to raise an error or exception in that situation. It seems significant that the software performed as expected under the previous OS and OpenMPI versions (although the older OpenMPI version is only slightly older than the one in use now), but I don't yet know what the differences are.
Is there any other information I could provide that might be useful?
From: Hazelrig, Chris CTR (US)
Sent: Tue 8/13/2013 1:51 PM
Subject: Finalize() does not return
I am using OpenMPI 1.4.3-1.1.el6 on RedHawk Linux 6.0.1 (Glacier) / RedHat Enterprise Linux Workstation Release 6.1 (Santiago). I am currently working through some issues I encountered after upgrading from RedHawk 5.2 / RHEL 5.2 and OpenMPI 1.4.3-1 (openmpi-gcc_1.4.3-1). Since the upgrades, my software does not return from the call to the Finalize() routine: all threads enter Finalize() and never return. I wrote a simple test program to simplify troubleshooting, and there Finalize() works as expected, i.e., all threads return from the call. This suggests the problem is in my code. I have searched the man pages and user forums to no avail.
Has anyone else encountered this problem? What could cause such behavior? I wondered whether some prior communication was left unfinished, but I believe I have verified that is not the case. Also, my understanding is that Finalize() would raise an error or exception in such a situation rather than just sit there, but I could be wrong.
I am not sure what additional information the community may need to help troubleshoot this, but I will be happy to provide whatever is needed.