
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Bug? openMPI interpretation of SLURM environment variables
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-24 14:34:06


Haven't seen that before on any of our machines.

Could you do "printenv | grep SLURM" after the salloc and send the results?

What version of SLURM is this?

Please run "mpirun --display-allocation hostname" and send the results.

Thanks
Ralph
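
For background on why the diagnostics above matter: Open MPI's SLURM support
builds its node/slot map from environment variables that salloc sets, such as
SLURM_NODELIST and SLURM_TASKS_PER_NODE (or SLURM_JOB_CPUS_PER_NODE), which
use SLURM's compressed "count(xrepeats)" notation. The sketch below is
illustrative only, not Open MPI's actual parser; it shows how a value such as
"2(x2)" expands to 4 slots while "2,1" expands to 3, which would account for
the gap between the 3 tasks requested and the domain size of 4 reported in
the message below. The example values are hypothetical.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only -- not Open MPI's actual parser.  Expands SLURM's
 * compressed "tasks(xnodes)" list notation into a total slot count,
 * e.g. "2(x2)" -> 4, "2,1" -> 3. */
static int count_slots(const char *spec)
{
    int total = 0;
    char *copy = strdup(spec);           /* strtok modifies its argument */
    char *tok;

    for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        int tasks = atoi(tok);           /* leading per-node task count  */
        int reps  = 1;
        char *rep = strstr(tok, "(x");   /* optional "(xN)" repeat count */
        if (rep != NULL)
            reps = atoi(rep + 2);
        total += tasks * reps;
    }
    free(copy);
    return total;
}

int main(void)
{
    /* Hypothetical values for a 3-task allocation on nodes with 2 CPUs. */
    printf("2(x2) -> %d slots\n", count_slots("2(x2)"));   /* prints 4 */
    printf("2,1   -> %d slots\n", count_slots("2,1"));     /* prints 3 */
    return 0;
}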

On Mon, Aug 24, 2009 at 11:30 AM, <matthew.piehl_at_[hidden]> wrote:

> Hello,
>
> I seem to have run into an interesting problem with openMPI. After
> allocating 3 processors and confirming that the 3 processors are
> allocated, mpirun on a simple mpitest program runs on 4 processors. We
> have 2 processors per node. I can repeat this with any odd number of
> processors requested; openMPI seems to take any remaining processors on
> the box. We are running openMPI v1.3.3. Here is an example of what happens:
>
> node64-test ~>salloc -n3
> salloc: Granted job allocation 825
>
> node64-test ~>srun hostname
> node64-28.xxxx.xxxx.xxxx.xxxx
> node64-28.xxxx.xxxx.xxxx.xxxx
> node64-29.xxxx.xxxx.xxxx.xxxx
>
> node64-test ~>MX_RCACHE=0
> LD_LIBRARY_PATH="/hurd/mpi/openmpi/lib:/usr/local/mx/lib" mpirun
> mpi_pgms/mpitest
> MPI domain size: 4
> I am rank 000 - node64-28.xxxx.xxxx.xxxx.xxxx
> I am rank 003 - node64-29.xxxx.xxxx.xxxx.xxxx
> I am rank 001 - node64-28.xxxx.xxxx.xxxx.xxxx
> I am rank 002 - node64-29.xxxx.xxxx.xxxx.xxxx
>
>
>
> For those who may be curious here is the program:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
>
> extern int main(int argc, char *argv[]);
>
> extern int main(int argc, char *argv[])
> {
>     auto int rank,
>              size,
>              namelen;
>
>     MPI_Status status;
>
>     static char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     if ( rank == 0 )
>     {
>         MPI_Get_processor_name(processor_name, &namelen);
>         fprintf(stdout,"My name is: %s\n",processor_name);
>         fprintf(stdout,"Cluster size is: %d\n", size);
>     }
>     else
>     {
>         MPI_Get_processor_name(processor_name, &namelen);
>         fprintf(stdout,"My name is: %s\n",processor_name);
>     }
>
>     MPI_Finalize();
>     return(0);
> }
>
>
> I'm curious whether this is a bug in the way openMPI interprets SLURM
> environment variables. If you have any ideas or need any more information,
> let me know.
>
>
> Thanks.
> Matt
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
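
One hedged aside on the question raised in the message above: if the goal is
simply to launch exactly the number of tasks that were requested, regardless
of how Open MPI derives the slot count from the environment, the task count
can be passed explicitly, e.g. "mpirun -np 3 mpi_pgms/mpitest" or, assuming
salloc exports SLURM_NTASKS (it normally does when -n is given),
"mpirun -np $SLURM_NTASKS mpi_pgms/mpitest". That sidesteps the question of
whether the slot count Open MPI reads from the SLURM variables matches the
allocation.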