Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problem using openmpi with DMTCP
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-09-29 09:20:56


If you're integrating a new checkpoint/restart system inside Open MPI,
you probably want to re-send this mail to the devel list to get the
attention of the right people who can help you.

On Sep 28, 2009, at 11:55 AM, Kritiraj Sajadah wrote:

> Dear All,
> I am trying to integrate DMTCP with Open MPI. If I run a plain C
> application, it works fine. But when I launch the program with
> mpirun, DMTCP checkpoints the application but reports errors when
> restarting it.
>
> #############
> [31007] WARNING at connection.cpp:303 in restore;
> REASON='JWARNING((_sockDomain == AF_INET || _sockDomain == AF_UNIX )
> && _sockType == SOCK_STREAM) failed'
> id() = 2ab3f248-30933-4ac0d75a(99007)
> _sockDomain = 10
> _sockType = 1
> _sockProtocol = 0
> Message: socket type not yet [fully] supported
> [31007] WARNING at connection.cpp:303 in restore;
> REASON='JWARNING((_sockDomain == AF_INET || _sockDomain == AF_UNIX )
> && _sockType == SOCK_STREAM) failed'
> id() = 2ab3f248-30943-4ac0d75c(99007)
> _sockDomain = 10
> _sockType = 1
> _sockProtocol = 0
> Message: socket type not yet [fully] supported
> [31013] WARNING at connection.cpp:87 in restartDup2;
> REASON='JWARNING(_real_dup2 ( oldFd, fd ) == fd) failed'
> oldFd = 537
> fd = 1
> (strerror((*__errno_location ()))) = Bad file descriptor
> [31013] WARNING at connectionmanager.cpp:627 in closeAll;
> REASON='JWARNING(_real_close ( i->second ) ==0) failed'
> i->second = 537
> (strerror((*__errno_location ()))) = Bad file descriptor
> [31015] WARNING at connectionmanager.cpp:627 in closeAll;
> REASON='JWARNING(_real_close ( i->second ) ==0) failed'
> i->second = 537
> (strerror((*__errno_location ()))) = Bad file descriptor
> [31017] WARNING at connectionmanager.cpp:627 in closeAll;
> REASON='JWARNING(_real_close ( i->second ) ==0) failed'
> i->second = 537
> (strerror((*__errno_location ()))) = Bad file descriptor
> [31007] WARNING at connectionmanager.cpp:627 in closeAll;
> REASON='JWARNING(_real_close ( i->second ) ==0) failed'
> i->second = 537
> (strerror((*__errno_location ()))) = Bad file descriptor
> MTCP: mtcp_restart_nolibc: mapping current version of
> /usr/lib/gconv/gconv-modules.cache into memory;
> _not_ file as it existed at time of checkpoint.
> Change mtcp_restart_nolibc.c:634 and re-compile, if you want
> different behavior.
> [31015] ERROR at connection.cpp:372 in restoreOptions;
> REASON='JASSERT(ret == 0) failed'
> (strerror((*__errno_location ()))) = Invalid argument
> fds[0] = 6
> opt->first = 26
> opt->second.size() = 4
> Message: restoring setsockopt failed
> Terminating...
> #############################################################
>
> Any suggestions are very welcome.
>
> regards,
>
> Raj
>
>
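
A note on reading the first two warnings in the quoted log: they print
_sockDomain = 10 and _sockType = 1 next to the condition that failed.
On Linux, address family 10 is AF_INET6 and socket type 1 is SOCK_STREAM,
so one plausible reading is that the sockets DMTCP could not restore were
IPv6 stream sockets. That is an interpretation of the log, not something
confirmed in this thread. The sketch below decodes the numbers and
re-evaluates the quoted condition. The lookup tables hold Linux values,
and the function name describe is hypothetical.

```python
# Linux numeric values for the constants named in the JWARNING condition.
# Assumption: the log came from a Linux host; other platforms may number
# these constants differently.
LINUX_AF = {1: "AF_UNIX", 2: "AF_INET", 10: "AF_INET6"}
LINUX_SOCK = {1: "SOCK_STREAM", 2: "SOCK_DGRAM"}


def describe(sock_domain, sock_type):
    """Return (domain name, type name, whether the quoted condition holds)."""
    domain = LINUX_AF.get(sock_domain, f"unknown({sock_domain})")
    kind = LINUX_SOCK.get(sock_type, f"unknown({sock_type})")
    # Mirrors the condition printed in the log:
    # (_sockDomain == AF_INET || _sockDomain == AF_UNIX) && _sockType == SOCK_STREAM
    passes = domain in ("AF_INET", "AF_UNIX") and kind == "SOCK_STREAM"
    return domain, kind, passes


if __name__ == "__main__":
    # Values taken from the quoted warning.
    print(describe(10, 1))
```

If that reading is right, the restart warnings concern sockets opened
with an IPv6 address family rather than anything specific to the
application being checkpointed.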

-- 
Jeff Squyres
jsquyres_at_[hidden]