Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI users] OMPI looking for PBS file?
From: John R. Cary (cary_at_[hidden])
Date: 2010-03-14 16:34:05


I have a script that launches a bunch of runs on some compute nodes of
a cluster. Once I get through the queue, I query PBS for my machine
file, then I copy that to a local file 'nodes' which I use for mpiexec:

mpiexec -machinefile /home/research/cary/projects/vpall/vptests/nodes \
    -np 6 /home/research/cary/projects/vpall/builds/vorpal/par/vorpal/vorpal \
    -i bathtubAntenna.in -dim 2 -o bathtubAntenna2p -n 100 -d 100

but this fails with

[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 153
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 87
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../orte/mca/ras/base/ras_base_allocate.c at line 133
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../orte/mca/plm/base/plm_base_launch_support.c at line 72
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/plm/tm/plm_tm_module.c at line 167
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

The relevant code snippet is

     /* setup the full path to the PBS file */
     filename = opal_os_path(false, mca_ras_tm_component.nodefile_dir,
                             pbs_jobid, NULL);
     fp = fopen(filename, "r");
     if (NULL == fp) {
         ORTE_ERROR_LOG(ORTE_ERR_FILE_OPEN_FAILURE);
         free(filename);
         return ORTE_ERR_FILE_OPEN_FAILURE;
     }

This looks like it might be trying to open my PBS node file instead
of the machine file I gave on the command line. I really don't know;
does anyone have any ideas here?
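
For illustration, the pattern in the snippet can be reproduced standalone. This is only a minimal sketch, not Open MPI code: join_path is a hypothetical stand-in for opal_os_path (assumed here to simply join its arguments into a path), and open_nodefile is a hypothetical wrapper around the fopen check. It shows that the open fails whenever no file named after the job ID exists in the configured directory, whatever -machinefile says.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for opal_os_path(): joins dir and name with '/'.
 * Assumption: the real helper does something equivalent. */
char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);
    size_t len = strlen(dir) + 1 + strlen(name) + 1;
    char *path = malloc(len);
    if (path != NULL) {
        snprintf(path, len, "%s/%s", dir, name);
    }
    return path;
}

/* Hypothetical wrapper: returns 0 if <dir>/<jobid> can be opened for
 * reading, -1 otherwise. On failure it reports the full path that was tried. */
int open_nodefile(const char *dir, const char *jobid)
{
    char *filename = join_path(dir, jobid);
    if (filename == NULL) {
        return -1;
    }
    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", filename, strerror(errno));
        free(filename);
        return -1;
    }
    fclose(fp);
    free(filename);
    return 0;
}
```

Calling open_nodefile with a directory that does not contain a file named after the job ID returns -1 and prints the path that was tried, which is the piece of information missing from the ORTE_ERROR_LOG lines above.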

Thx....John Cary