Sorry, this mail slipped by me.
The most common reason I have seen for this is that the TM support in
Open MPI is not being used to launch the MPI processes on your
allocated nodes.
I do not have a TM system to test with, but I *believe* that TM will
replicate your entire environment (including $PBS_JOBID) out on the
back-end nodes before starting the job.
Are you seeing cases where this is not happening?
More below.
On Jan 5, 2008, at 3:48 AM, Prakash Velayutham wrote:
> Hi,
>
> I am trying to start a simple MPI code below using Open MPI 1.2.4 and
> Torque 2.2.1.
>
> prakash_at_bmi-opt2-04:~/thesis/CS/Samples/changejob> cat pbs.c
> #include <stdio.h>
> #include "mpi.h"
>
> int gdb_var;
>
> void main(argc, argv)
> int argc;
> char **argv;
> {
> int rank, size, ret;
> gdb_var = 0;
> char *jobid;
> ret = MPI_Init(&argc, &argv);
> if (ret != 0) printf("ERROR with MPI initialization\n");
> ret = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> if (ret != 0) printf("ERROR with MPI ranking\n");
> ret = MPI_Comm_size(MPI_COMM_WORLD, &size);
> if (ret != 0) printf("ERROR with MPI sizes\n");
> if (0 == rank) {
> printf("Host %d ready to attach\n",rank);
> fflush(stdout);
> while (0 == gdb_var) sleep(5);
> jobid = getenv("PBS_JOBID");
> printf("Job id is %s\n", *jobid);
>
I don't think you should be dereferencing jobid here; printf's %s expects the char * itself.
>
> if (!jobid)
> error("PBS_JOBID not set in environment. Code must be
> run from a\n"
> " PBS script, perhaps interactively using \"qsub -I
> \"");
> }
> MPI_Finalize();
> }
>
main() is supposed to return an int. ;-)
>
> prakash_at_bmi-opt2-04:~/thesis/CS/Samples/changejob> mpiexec -np 4 --prefix /usr/local/openmpi-1.2.4 ./pbs
> prakash_at_bmi-opt2-04:~/thesis/CS/Samples/changejob>
>
Hmm. This output doesn't seem to match the code above...?
>
> As shown above, for some reason, PBS_JOBID is not getting set in the
> MPI's environment, even though it is available at the shell level.
>
> prakash_at_bmi-opt2-04:~/thesis/CS/Samples/changejob> echo $PBS_JOBID
> 18.fructose.cchmc.org
>
> Any ideas why?
>
> Thanks,
> Prakash
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
Cisco Systems