That's strange - I run on slurm frequently and never have this problem, and my default hostfile is present and empty. Do you have anything in your default mca param file that might be telling us to use the hostfile?

The only way I can find to get that behavior is if your default mca param file includes the orte_default_hostfile value. In that case, you are telling us to use the default hostfile, and so we will enforce it.
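
If it helps, here is a rough sketch of how you could check for that setting. It assumes the usual param file locations (etc/openmpi-mca-params.conf under your install prefix, and $HOME/.openmpi/mca-params.conf); the function name and the OMPI_PREFIX variable are just placeholders for illustration, not anything Open MPI provides.

```shell
#!/bin/sh
# Report any uncommented orte_default_hostfile setting in the given
# MCA param files. Prints the file name followed by the matching
# line(s); returns 0 if at least one setting was found, 1 otherwise.
find_default_hostfile_setting() {
    found=1
    for f in "$@"; do
        # Silently skip files that do not exist or are unreadable.
        [ -r "$f" ] || continue
        # Lines starting with '#' do not match, so comments are ignored.
        matches=$(grep -E '^[[:space:]]*orte_default_hostfile[[:space:]]*=' "$f") || continue
        printf '%s:\n%s\n' "$f" "$matches"
        found=0
    done
    return $found
}

# Example call (OMPI_PREFIX is a placeholder for your install prefix):
#   find_default_hostfile_setting \
#       "$OMPI_PREFIX/etc/openmpi-mca-params.conf" \
#       "$HOME/.openmpi/mca-params.conf"
#
# The same param can also come from the environment:
#   env | grep OMPI_MCA_orte_default_hostfile
```

If either check turns something up, that setting is what tells us to enforce the default hostfile.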

On Feb 27, 2012, at 5:57 AM, wrote:

Hi all,

I have had problems with the openmpi-default-hostfile since the following patch landed on the trunk:

changeset:   19874:088fc6c84a9f
user:        rhc
date:        Wed Feb 01 17:40:44 2012 +0000
summary:     In accordance with prior releases, we are supposed to default to looking at the openmpi-default-hostfile as a default hostfile. Restore that behavior, but ignore the file if it is empty. Allow the user to ignore any MCA param setting pointing to a default hostfile by setting the param to "none" (via cmd line or whatever) - this allows them to override a setting in the system default MCA param file.

According to the summary of this patch, the openmpi-default-hostfile is ignored if it is empty.
But, when I run my jobs with slurm + mpirun, I get the following message:
No nodes are available for this job, either due to a failure to
allocate nodes to the job, or allocated nodes being marked
as unavailable (e.g., down, rebooting, or a process attempting
to be relocated to another node when none are available).

I am able to run my job if:
 - either I put my node(s) in the file etc/openmpi-default-hostfile
 - or I use "-mca orte_default_hostfile none" on the mpirun command line
 - or I "export OMPI_MCA_orte_default_hostfile=none" in my environment

It appears that an empty openmpi-default-hostfile is not ignored, so the patch seems to be incomplete.

Or am I misunderstanding something?

Pascal Devèze