Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] [warn] Epoll ADD(1) on fd 0 failed
From: Mike Dubman (miked_at_[hidden])
Date: 2014-06-10 15:34:36


btw, the output comes from OMPI's libevent and not from slurm itself (sorry
about the confusion, and thanks to Yossi for catching this)

opal/mca/event/libevent2021/libevent/epoll.c:
event_warn("Epoll %s(%d) on fd %d failed. Old events were %d; read change was %d (%s); write change was %d (%s)",
opal/mca/event/libevent2021/libevent/epoll.c:
event_debug(("Epoll %s(%d) on fd %d okay. [old events were %d; read change was %d; write change was %d]",
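
For anyone who wants to reproduce the "Operation not permitted" part outside of Open MPI, here is a minimal sketch (not the libevent code itself). It assumes Linux and Python 3, and it assumes the job's stdout is a regular file (as with "sbatch -o ...") and its stdin is /dev/null; neither is confirmed in this thread. The helper names try_epoll_add and demo are made up for illustration. Linux epoll rejects descriptors that do not support polling, such as regular files, with EPERM, which matches the errno text in the warning. The event masks used here, EPOLLIN (1) and EPOLLOUT (4), also appear to line up with the "ADD(1)" and "ADD(4)" in the reported output.

```python
import errno
import os
import select
import tempfile


def try_epoll_add(fd, eventmask):
    """Try to register fd with a fresh epoll instance.

    Returns None on success, or the errno name (e.g. "EPERM") on failure.
    """
    ep = select.epoll()
    try:
        ep.register(fd, eventmask)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        ep.close()


def demo():
    """Register three kinds of descriptors and collect the outcome of each."""
    results = {}

    # A regular file, standing in for the target of "sbatch -o hello_oshmem.out".
    with tempfile.TemporaryFile() as regular:
        results["regular file"] = try_epoll_add(regular.fileno(), select.EPOLLOUT)

    # /dev/null, assumed here to be what the batch job gets as stdin.
    devnull = os.open(os.devnull, os.O_RDONLY)
    try:
        results["/dev/null"] = try_epoll_add(devnull, select.EPOLLIN)
    finally:
        os.close(devnull)

    # A pipe supports polling, so registration is expected to succeed.
    read_end, write_end = os.pipe()
    try:
        results["pipe"] = try_epoll_add(read_end, select.EPOLLIN)
    finally:
        os.close(read_end)
        os.close(write_end)

    return results


if __name__ == "__main__":
    for name, err in demo().items():
        print(f"{name}: {'ok' if err is None else err}")
```

If the regular file fails with EPERM while the pipe registers fine, that is consistent with the warning being a harmless side effect of the batch redirection rather than a problem in the application.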

On Fri, Jun 6, 2014 at 3:38 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Possible - honestly don't know
>
> On Jun 6, 2014, at 12:16 AM, Timur Ismagilov <tismagilov_at_[hidden]> wrote:
>
> Sometimes, after termination of the program, launched with the command
> "sbatch ... -o myprogram.out .....", no file "myprogram.out" is being
> produced. Could this be due to the above mentioned problem?
>
>
> Thu, 5 Jun 2014 07:45:01 -0700 from Ralph Castain <rhc_at_[hidden]>:
>
> FWIW: support for the --resv-ports option was deprecated and removed on
> the OMPI side a long time ago.
>
> I'm not familiar enough with "oshrun" to know if it is doing anything
> unusual - I believe it is just a renaming of our usual "mpirun". I suspect
> this is some interaction with sbatch, but I'll take a look. I haven't seen
> that warning. Mike indicated he thought it is due to both slurm and OMPI
> trying to control stdin/stdout, in which case it shouldn't be happening, but
> you can safely ignore it.
>
>
> On Jun 5, 2014, at 3:04 AM, Timur Ismagilov <tismagilov_at_[hidden]> wrote:
>
> I use cmd line
>
> $sbatch -p test --exclusive -N 2 -o hello_oshmem.out -e hello_oshmem.err
> shrun_mxm3.0 ./hello_oshmem
>
> where script shrun_mxm3.0:
>
> $cat shrun_mxm3.0
>
> #!/bin/sh
>
> #srun --resv-ports "$@"
> #exit $?
>
> [ x"$TMPDIR" = x"" ] && TMPDIR=/tmp
> HOSTFILE=${TMPDIR}/hostfile.${SLURM_JOB_ID}
> srun hostname -s|sort|uniq -c|awk '{print $2" slots="$1}' > $HOSTFILE ||
> { rm -f $HOSTFILE; exit 255; }
>
> LD_PRELOAD=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/lib/libmxm.so
> oshrun -x LD_PRELOAD -x MXM_SHM_KCOPY_MODE=off --hostfile $HOSTFILE "$@"
>
> rc=$?
> rm -f $HOSTFILE
>
> exit $rc
>
> I configured openmpi using
>
> ./configure CC=icc CXX=icpc F77=ifort FC=ifort
> --prefix=/mnt/data/users/dm2/vol3/semenov/_scratch/openmpi-1.8.1_mxm-3.0
> --with-mxm=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/
> --with-slurm --with-platform=contrib/platform/mellanox/optimized
>
>
> Fri, 30 May 2014 07:09:54 -0700 from Ralph Castain <rhc_at_[hidden]>:
>
> Can you pass along the cmd line that generated that output, and how OMPI
> was configured?
>
> On May 30, 2014, at 5:11 AM, Тимур Исмагилов <tismagilov_at_[hidden]> wrote:
>
> Hello!
>
> I am using Open MPI v1.8.1 and slurm 2.5.6.
>
> I got these messages when I tried to run the example program (hello_oshmem.cpp):
>
> [warn] Epoll ADD(1) on fd 0 failed. Old events were 0; read change was 1
> (add); write change was 0 (none): Operation not permitted
> [warn] Epoll ADD(4) on fd 1 failed. Old events were 0; read change was 0
> (none); write change was 1 (add): Operation not permitted
> Hello, world, I am 0 of 2
> Hello, world, I am 1 of 2
>
> What do these warnings mean?
>
> I launch this job using sbatch and mpirun with a hostfile (generated with:
> $srun hostname -s|sort|uniq -c|awk '{print $2" slots="$1}' > $HOSTFILE)
>
> Regards,
> Timur
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users