Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bug? openMPI interpretation of SLURM environment variables
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-24 14:34:06


Haven't seen that before on any of our machines.

Could you do "printenv | grep SLURM" after the salloc and send the results?

What version of SLURM is this?

Please run "mpirun --display-allocation hostname" and send the results.
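
Something like the following, run inside the salloc shell, should capture all of
that in one go (just a sketch of the commands):

    # inside the allocation from "salloc -n3"
    printenv | grep SLURM                  # SLURM_NTASKS, SLURM_NNODES, SLURM_TASKS_PER_NODE, ...
    srun hostname                          # what SLURM itself launches
    mpirun --display-allocation hostname   # the allocation as Open MPI sees it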

Thanks
Ralph

On Mon, Aug 24, 2009 at 11:30 AM, <matthew.piehl_at_[hidden]> wrote:

> Hello,
>
> I seem to have run into an interesting problem with Open MPI. After
> allocating 3 processors and confirming that the 3 processors are
> allocated, mpirun on a simple mpitest program runs on 4 processors. We
> have 2 processors per node, and I can repeat this case with any odd
> number of nodes; Open MPI seems to take any remaining processors on the
> box. We are running Open MPI v1.3.3. Here is an example of what happens:
>
> node64-test ~>salloc -n3
> salloc: Granted job allocation 825
>
> node64-test ~>srun hostname
> node64-28.xxxx.xxxx.xxxx.xxxx
> node64-28.xxxx.xxxx.xxxx.xxxx
> node64-29.xxxx.xxxx.xxxx.xxxx
>
> node64-test ~>MX_RCACHE=0 LD_LIBRARY_PATH="/hurd/mpi/openmpi/lib:/usr/local/mx/lib" mpirun mpi_pgms/mpitest
> MPI domain size: 4
> I am rank 000 - node64-28.xxxx.xxxx.xxxx.xxxx
> I am rank 003 - node64-29.xxxx.xxxx.xxxx.xxxx
> I am rank 001 - node64-28.xxxx.xxxx.xxxx.xxxx
> I am rank 002 - node64-29.xxxx.xxxx.xxxx.xxxx
>
>
>
> For those who may be curious, here is the program:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
>
> extern int main(int argc, char *argv[]);
>
> extern int main(int argc, char *argv[])
> {
>     auto int rank,
>              size,
>              namelen;
>
>     MPI_Status status;
>
>     static char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     if ( rank == 0 )
>     {
>         MPI_Get_processor_name(processor_name, &namelen);
>         fprintf(stdout, "My name is: %s\n", processor_name);
>         fprintf(stdout, "Cluster size is: %d\n", size);
>     }
>     else
>     {
>         MPI_Get_processor_name(processor_name, &namelen);
>         fprintf(stdout, "My name is: %s\n", processor_name);
>     }
>
>     MPI_Finalize();
>     return(0);
> }
>
>
> I'm curious whether this is a bug in the way Open MPI interprets SLURM's
> environment variables. If you have any ideas or need any more
> information, let me know.
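>
> For what it's worth, these are the SLURM variables I'd expect to matter
> here, in case the values from my allocation are useful (commands only,
> just a sketch):
>
>     # run inside the "salloc -n3" shell
>     printenv | grep -E 'SLURM_(NNODES|NTASKS|NODELIST|TASKS_PER_NODE|JOB_CPUS_PER_NODE)'
>     # SLURM_NTASKS should be 3 for "salloc -n3"; the *_PER_NODE variables
>     # describe how those tasks/CPUs are spread across the two nodes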
>
>
> Thanks.
> Matt
>