Could your problem be related to the MCA parameter "contamination" problem, where a child MPI process inherits MCA environment variables from its parent process? That problem may still exist.

Back in 2007 I was implementing a program that solves two large interrelated systems of equations (more than 200,000,000 equations) using PCG iteration. The program iterates on the first system until a certain degree of convergence is reached; then the master node executes a shell script that starts the parallel solver on the second system. That solver also iterates to a certain degree of convergence, and some parameters from solving the second system are stored in files. Once the second system has been solved, the stored parameters are used in the solver for the first system. The nodes are synchronized via calls to MPI_BARRIER both before and after the master node makes the system call.

The program was hanging when the master node executed the shell script.

 

I found that this was because MCA environment variables were inherited from the parent process. I solved the problem by adding the following to the script that starts the second MPI program:

# Unset every OMPI_MCA_* variable inherited from the parent MPI process
for i in $(env | grep '^OMPI_MCA_' | sed 's/=/ /' | awk '{print $1}')
  do
    unset "$i"
  done
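
For anyone who wants to try the idea outside a real MPI job, here is a minimal sketch of the same cleanup wrapped in a function. The function name clear_ompi_mca_env and the variables set in the demonstration are made up for illustration; only the OMPI_MCA_ prefix comes from the problem described above. It assumes a POSIX-style shell and that the cleanup runs in the script that launches the second MPI program, before the launch command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): unset every environment
# variable whose name starts with OMPI_MCA_.
clear_ompi_mca_env() {
    # sed prints only the variable name of lines that look like OMPI_MCA_xxx=value
    for name in $(env | sed -n 's/^\(OMPI_MCA_[A-Za-z0-9_]*\)=.*/\1/p'); do
        unset "$name"
    done
}

# Demonstration with illustrative values standing in for inherited variables
export OMPI_MCA_btl="tcp,self"
export OMPI_MCA_example_param=1   # made-up name, for illustration only
export UNRELATED_VAR=keep

clear_ompi_mca_env

# Count what is left; '|| true' because grep exits non-zero when nothing matches
remaining=$(env | grep -c '^OMPI_MCA_' || true)
echo "OMPI_MCA variables remaining: $remaining"

# The second MPI program would be started here, now with a clean environment.
```

Because the loop runs in the current shell rather than in a pipeline subshell, the unset calls take effect for everything started afterwards in the same script, while unrelated variables are left alone.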

 


Regards

Per Madsen
Seniorforsker / Senior scientist

 
 
AARHUS UNIVERSITET / UNIVERSITY OF AARHUS
Det Jordbrugsvidenskabelige Fakultet / Faculty of Agricultural Sciences
Inst. for Genetik og Bioteknologi / Dept. of Genetics and Biotechnology
Blichers Allé 20, P.O. BOX 50
DK-8830 Tjele
 
Tel: +45 8999 1900
Direct: +45 8999 1216
Mobile: +45
E-mail: Per.Madsen@agrsci.dk
Web: www.agrsci.dk
