On Fri, Mar 5, 2010 at 12:03 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
> This type of failure is usually due to prelink'ing being left enabled on one
> or more of the systems. This has come up multiple times on the Open MPI
> list, but is actually a problem between BLCR and the Linux kernel. BLCR has
> a FAQ entry on this that you will want to check out:
> If that does not work, then we can look into other causes.
I also suggest checkpointing and restarting an app with BLCR directly,
without Open MPI in the picture. That is, take any simple app, run it
with cr_run, checkpoint it with cr_checkpoint, then restart it with
cr_restart. Make sure the blcr kernel module is loaded too. That way
you can tell whether the problem is related to Open MPI or not.
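
Something along these lines should do as a quick check. It is only a
rough sketch: "sleep 600" stands in for the simple app, blcr_smoke_test
is just a name I made up for the helper, and it assumes the BLCR tools
are on your PATH and that cr_checkpoint writes its default context.<pid>
file into the current directory.

```shell
#!/bin/sh
# Sketch only: exercise BLCR by itself, with no MPI involved.
# blcr_smoke_test is a hypothetical helper name, not part of BLCR.

blcr_smoke_test() {
    # 1. The BLCR command-line tools must be installed and on PATH.
    for tool in cr_run cr_checkpoint cr_restart; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing BLCR tool: $tool" >&2
            return 2
        fi
    done

    # 2. The blcr kernel module must be loaded.
    if ! lsmod 2>/dev/null | grep -q '^blcr'; then
        echo "blcr kernel module does not appear to be loaded" >&2
        return 3
    fi

    # 3. Start a trivial stand-in app under BLCR control.
    cr_run sleep 600 &
    pid=$!
    sleep 1

    # 4. Checkpoint it. With no file option, cr_checkpoint is assumed
    #    to write context.<pid> in the current directory.
    if ! cr_checkpoint "$pid"; then
        echo "cr_checkpoint failed" >&2
        kill "$pid" 2>/dev/null
        return 4
    fi

    # 5. Stop the original so the restart can reuse its PID.
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null

    # 6. Restart from the context file and see whether it stays up.
    cr_restart "context.$pid" &
    rpid=$!
    sleep 1
    if kill -0 "$rpid" 2>/dev/null; then
        echo "BLCR checkpoint/restart worked"
        rc=0
    else
        echo "cr_restart failed" >&2
        rc=5
    fi

    # Clean up the restarted process and the context file.
    kill "$pid" "$rpid" 2>/dev/null
    rm -f "context.$pid"
    return "$rc"
}
```

Source the file and call blcr_smoke_test on each node. A nonzero return
code tells you which step failed (2 = tools missing, 3 = module not
loaded, 4 = checkpoint failed, 5 = restart failed). If this fails on a
node, the problem is in the BLCR setup rather than in Open MPI.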