Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Migrate OpenMPI to the VxWorks
From: Jing Zhang (iam.chilli_at_[hidden])
Date: 2010-03-22 22:06:41


Hi Ralph,

Thank you for your prompt and useful help. I will try out what you have
posted to see whether the port can succeed.

Regards,

Jing Zhang

2010/3/18 Ralph Castain <rhc_at_[hidden]>

> Hi Jing
>
> Someone else took a look at this off-list a few years ago. It was mostly a
> problem with the build system (some flags are different) and header file
> names. I don't believe the port was ever completed though.
>
> I have appended the results of that conversation - the last message
> contained a list of the issues. You would need to update that to the trunk
> of course as the code has changed considerably since that discussion took
> place. Brian Barrett subsequently created a first-cut at fixing some of
> these, but that appears to have been lost in the years since it was done -
> and wouldn't really be current anyway.
>
> I would be happy to assist as I can.
> Ralph
>
> 1. configure issues with "checking prefix for global symbol labels"
>
>
> 1a. VxWorks assembler (CCAS=asppc) generates a.out by default (vs. the
> conftest.o that we need subsequently)
>
>
> There is this fragment to determine the way to assemble conftest.s:
>
> if test "$CC" = "$CCAS" ; then
>     ompi_assemble="$CCAS $CCASFLAGS -c conftest.s >conftest.out 2>&1"
> else
>     ompi_assemble="$CCAS $CCASFLAGS conftest.s >conftest.out 2>&1"
> fi
>
>
> The subsequent link fails because conftest.o does not exist:
>
>
> ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>
>
> To work around the problem, I did not set CCAS. This gives me the first
> invocation, which passes the -c argument to CC=ccppc and so generates
> conftest.o output.
>
>
>
> 1b. linker fails because LDFLAGS are not passed
>
>
> The same linker command line caused problems because only $CFLAGS were
> passed to the linker:
>
> ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>
>
> In my environment, I set CC/CFLAGS/LDFLAGS as follows:
>
> CC=ccppc
>
> CFLAGS='-ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align
> -mregnames -fno-builtin -fexceptions'
>
> LDFLAGS='-mrtp -msoft-float -Wl,--start-group -Wl,--end-group
> -L/amd/raptor/root/opt/WindRiver/vxworks-6.3/target/usr/lib/ppc/PPC32/sfcommon'
>
>
> The linker flags are not passed because the ompi_link command line uses
> $CFLAGS but not $LDFLAGS. Without LDFLAGS, the link fails:
>
> [xp-kcain1:build_vxworks] ccppc -ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align -mregnames -fno-builtin -fexceptions -o hello hello.c
> /amd/raptor/root/opt/WindRiver/gnu/3.4.4-vxworks-6.3/x86-linux2/bin/../lib/gcc/powerpc-wrs-vxworks/3.4.4/../../../../powerpc-wrs-vxworks/bin/ld: cannot find -lc_internal
> collect2: ld returned 1 exit status
>
>
>
> 2. OPAL atomics asm.c:
>
> int versus int32_t (refer to email with Brian Barrett)
>
>
> 3. OPAL event/event.c: sys/time.h and timercmp() macros not defined by
> VxWorks
>
> Refer to the workaround in event.c using #ifdef MCS_VXWORKS.
>
>
> 4. OPAL event/event.c: pipe() syscall not found
>
> workaround:
>
> #ifdef HAVE_UNISTD_H
> #include <unistd.h>
> #ifdef MCS_VXWORKS
> #include <ioLib.h> /* for pipe() */
> #endif
> #endif
>
>
> 5. OPAL event/signal.c
>
> static sig_atomic_t opal_evsigcaught[NSIG];
>
> NSIG is not defined, but _NSIGS is.
>
> In Linux, NSIG is defined with -D__USE_MISC
>
> So I added this code fragment to signal.c:
>
> /* VxWorks signal.h defines _NSIGS, not NSIG */
> #ifdef MCS_VXWORKS
> #define NSIG (_NSIGS+1)
> #endif
>
>
>
> 6. OPAL event/signal.c: no socketpair()
>
> workaround: use pipe():
>
> #ifdef HAVE_UNISTD_H
> #include <unistd.h>
> #ifdef MCS_VXWORKS
> #include <ioLib.h> /* for pipe() */
> #endif
> #endif
>
> and later in void opal_evsignal_init(sigset_t *evsigmask)
>
> #ifdef MCS_VXWORKS
> if (pipe(ev_signal_pair) == -1)
>     event_err(1, "%s: pipe", __func__);
> #else
> if (socketpair(AF_UNIX, SOCK_STREAM, 0, ev_signal_pair) == -1)
>     event_err(1, "%s: socketpair", __func__);
> #endif
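A minimal sketch of how the pipe()/socketpair() choice in item 6 could be wrapped in one helper. The name compat_signal_pair is hypothetical and is not part of Open MPI or libevent; the VxWorks branch assumes, as the notes above do, that ioLib.h declares pipe().

```c
#include <unistd.h>
#ifdef MCS_VXWORKS
#include <ioLib.h>        /* declares pipe() on VxWorks, per the notes above */
#else
#include <sys/types.h>
#include <sys/socket.h>
#endif

/* Hypothetical helper: fills fds[2] with a connected descriptor pair.
 * Returns 0 on success, -1 on failure (errno is set by the underlying call). */
static int compat_signal_pair(int fds[2])
{
#ifdef MCS_VXWORKS
    /* No socketpair() here, so fall back to a one-way pipe:
     * fds[0] is the read end and fds[1] is the write end. */
    return pipe(fds);
#else
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
#endif
}
```

One thing to watch: a pipe is one-directional, so the signal handler has to write to fds[1] and the event loop has to read from fds[0]. Code that relies on the pair being bidirectional, as a socketpair is, would need adjusting.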
>
>
> 7. OPAL util/basename.c: #if HAVE_DIRNAME problem
>
> ../../../opal/util/basename.c:23:5: warning: "HAVE_DIRNAME" is not defined
> ../../../opal/util/basename.c: In function `opal_dirname':
>
> problem: HAVE_DIRNAME is not defined in opal_config.h so the #if
> HAVE_DIRNAME will fail at preprocessor/compile time
>
> workaround:
>
> change #if HAVE_DIRNAME to #if defined(HAVE_DIRNAME)
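A small sketch of how the two preprocessor tests in item 7 differ. In standard C an undefined identifier in #if evaluates to 0, and GCC reports it only under -Wundef. The two forms disagree when a macro is defined as 0. The macro names below are hypothetical stand-ins, not Open MPI configure outputs.

```c
/* HAVE_FEATURE_A stands in for a macro that configure left undefined;
 * HAVE_FEATURE_B stands in for one that configure defined as 0. */
#define HAVE_FEATURE_B 0

/* Value test: false for 0 (and for an undefined name, which reads as 0). */
#if HAVE_FEATURE_B
#define VALUE_TEST_B 1
#else
#define VALUE_TEST_B 0
#endif

/* Existence test: true for any definition, including "0". */
#if defined(HAVE_FEATURE_B)
#define DEFINED_TEST_B 1
#else
#define DEFINED_TEST_B 0
#endif

/* Combined form: never evaluates an undefined name, still honors a 0 value. */
#if defined(HAVE_FEATURE_A) && HAVE_FEATURE_A
#define SAFE_TEST_A 1
#else
#define SAFE_TEST_A 0
#endif
```

So switching to #if defined(...) silences the warning, but it would also enable the guarded code for a macro that configure set to 0. The combined form avoids both problems.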
>
>
>
> 8. OPAL util/basename.c: strncpy_s and _strdup
>
> ../../../opal/util/basename.c: In function `opal_dirname':
> ../../../opal/util/basename.c:153: error: implicit declaration of function `strncpy_s'
> ../../../opal/util/basename.c:160: error: implicit declaration of function `_strdup'
>
> #ifdef MCS_VXWORKS
> strncpy( ret, filename, p - filename);
> #else
> strncpy_s( ret, (p - filename + 1), filename, p - filename );
> #endif
>
> #ifdef MCS_VXWORKS
> return strdup(".");
> #else
> return _strdup(".");
> #endif
>
>
>
>
> 9. opal/util/if.c: socket() prototype not found in vxworks headers
>
> #ifdef HAVE_SYS_SOCKET_H
> #include <sys/socket.h>
> #ifdef MCS_VXWORKS
> #include <sockLib.h>
> #endif
> #endif
>
>
> 10. opal/util/if.c: ioctl()
>
> #ifdef HAVE_SYS_IOCTL_H
> #include <sys/ioctl.h>
> #ifdef MCS_VXWORKS
> #include <ioLib.h>
> #endif
> #endif
>
>
> 11. opal/util/os_path.c: MAXPATHLEN change to PATH_MAX
>
> #ifdef MCS_VXWORKS
> if (total_length > PATH_MAX) { /* path length is too long - reject it */
>     return(NULL);
> #else
> if (total_length > MAXPATHLEN) { /* path length is too long - reject it */
>     return(NULL);
> #endif
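An alternative to duplicating the if statement in items 11 and 13 would be to define MAXPATHLEN once when the platform lacks it. This is only a sketch. It assumes PATH_MAX comes from limits.h on the target, which the notes above do not confirm, and the helper name path_length_ok and the 1024 fallback are made up for illustration.

```c
#include <limits.h>   /* PATH_MAX, where the platform provides it */
#include <stddef.h>   /* size_t */

/* Define MAXPATHLEN once if the platform headers did not. */
#ifndef MAXPATHLEN
#  ifdef PATH_MAX
#    define MAXPATHLEN PATH_MAX
#  else
#    define MAXPATHLEN 1024   /* assumed fallback, not a documented VxWorks value */
#  endif
#endif

/* Hypothetical helper mirroring the length check in os_path.c:
 * returns 1 if a path of total_length bytes is acceptable, 0 otherwise. */
static int path_length_ok(size_t total_length)
{
    return total_length <= (size_t)MAXPATHLEN;
}
```

With that in a shared header, os_path.c and output.c could keep their original MAXPATHLEN checks unchanged.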
>
>
>
> 12. opal/util/output.c: gethostname()
>
> include <hostLib.h>
>
>
> 13. opal/util/output.c: MAXPATHLEN
>
> same fix as os_path.c above
>
>
> 14. opal/util/output.c: closelog/openlog/syslog
>
> manually turned off HAVE_SYSLOG_H in opal_config.h
>
> then got a patch from Jeff Squyres that avoids syslog
>
>
> 15. opal/util/opal_pty.c
>
> complains about mismatched prototype of opal_openpty() between this
> source file and opal_pty.h
>
> workaround: manually edit build_vxworks_ppc/opal/include/opal_config.h,
> use the following line (change 1 to 0):
>
> #define OMPI_ENABLE_PTY_SUPPORT 0
>
>
> 16. opal/util/stacktrace.c
>
> FPE_FLTINV not present in signal.h
>
> workaround: edit opal_config.h to turn off
> OMPI_WANT_PRETTY_PRINT_STACKTRACE (this can be explicitly configured out
> but I don't want to reconfigure because I hacked #15 above)
>
>
> 17. opal/mca/base/mca_base_open.c
>
> gethostname() -- same as opal/util/output.c, must include hostLib.h
>
>
> 18. opal_progress.c
>
> from opal/event/event.h (that I modified earlier)
> cannot find #include <sys/_timeradd.h>
> It is in opal/event/compat/sys
>
> workaround: change event.h to include the definitions that are present
> in _timeradd.h instead of including it.
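For items 3 and 18, the missing timer macros could be supplied as guarded fallbacks along these lines. This is a sketch modeled on the conventional BSD definitions, not the actual contents of _timeradd.h. On VxWorks, struct timeval would have to come from whichever header the target provides; the notes above do not name it, so it is left out here.

```c
#ifndef MCS_VXWORKS
#include <sys/time.h>   /* struct timeval on POSIX hosts */
#endif

/* Compare two timevals with a relational operator, e.g. timercmp(a, b, <). */
#ifndef timercmp
#define timercmp(a, b, cmp)                      \
    (((a)->tv_sec == (b)->tv_sec) ?              \
     ((a)->tv_usec cmp (b)->tv_usec) :           \
     ((a)->tv_sec cmp (b)->tv_sec))
#endif

/* result = a + b, carrying microseconds into seconds. */
#ifndef timeradd
#define timeradd(a, b, result)                           \
    do {                                                 \
        (result)->tv_sec = (a)->tv_sec + (b)->tv_sec;    \
        (result)->tv_usec = (a)->tv_usec + (b)->tv_usec; \
        if ((result)->tv_usec >= 1000000) {              \
            (result)->tv_sec++;                          \
            (result)->tv_usec -= 1000000;                \
        }                                                \
    } while (0)
#endif

/* result = a - b, borrowing from seconds when microseconds go negative. */
#ifndef timersub
#define timersub(a, b, result)                           \
    do {                                                 \
        (result)->tv_sec = (a)->tv_sec - (b)->tv_sec;    \
        (result)->tv_usec = (a)->tv_usec - (b)->tv_usec; \
        if ((result)->tv_usec < 0) {                     \
            (result)->tv_sec--;                          \
            (result)->tv_usec += 1000000;                \
        }                                                \
    } while (0)
#endif
```

The #ifndef guards mean a platform that already defines the macros keeps its own versions.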
>
>
> 19. Link errors for opal_wrapper
>
> strcasecmp
> strncasecmp
>
> I rolled my own in mca_base_open.c (temporary fix, since we may come
> across this problem elsewhere in the code).
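A rolled-own replacement for item 19 might look roughly like the following. This is not the code that went into mca_base_open.c. The compat_ names are hypothetical, chosen so they do not clash with a libc that already provides strcasecmp and strncasecmp.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of at most n characters.
 * Returns <0, 0, or >0, like strncmp. */
static int compat_strncasecmp(const char *s1, const char *s2, size_t n)
{
    while (n-- > 0) {
        int c1 = tolower((unsigned char)*s1++);
        int c2 = tolower((unsigned char)*s2++);
        if (c1 != c2) {
            return c1 - c2;
        }
        if (c1 == '\0') {
            return 0;       /* both strings ended together */
        }
    }
    return 0;               /* first n characters matched */
}

/* Case-insensitive comparison of two NUL-terminated strings. */
static int compat_strcasecmp(const char *s1, const char *s2)
{
    for (;;) {
        int c1 = tolower((unsigned char)*s1++);
        int c2 = tolower((unsigned char)*s2++);
        if (c1 != c2) {
            return c1 - c2;
        }
        if (c1 == '\0') {
            return 0;
        }
    }
}
```

Putting these in one shared compatibility file rather than in mca_base_open.c would cover other callers if the same link error shows up elsewhere.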
>
>
> 20. dss_internal.h uses a type 'uint'
>
> Not sure if it's depending on something in the headers, or something it
> defined on its own.
>
> I changed it to be just like the header I found somewhere under Linux
> /usr/include:
>
> #ifdef MCS_VXWORKS
> typedef unsigned int uint;
> #endif
>
>
> 21. struct iovec definition needed
>
> orte/mca/iof/base/iof_base_fragment.h:45: warning: array type has incomplete element type
>
> #ifdef MCS_VXWORKS
> #include <net/uio.h>
> #endif
>
> not sure if this is right, or if I should include something like
> <netBufLib.h> or <ioLib.h>
>
>
>
> 22. iof_base_setup.c
>
> struct termios not understood
> can only find termios.h header in 'diab' area and I'm not using that
> compiler.
>
> a variable usepty is set to 0 already when OMPI_ENABLE_PTY_SUPPORT is 0.
> So, why are we compiling this fragment of code at all? I hacked the file
> so that the struct termios code will not get compiled.
>
>
> 23. oob_base_send/recv.c, oob_base_send/recv_nb.c: struct iovec not known.
>
> #ifdef MCS_VXWORKS
> #include <net/uio.h>
> #endif
>
>
> 24. orte/mca/rmgr/base/rmgr_base_check_context.c:58: error:
> `MAXHOSTNAMELEN' undeclared (first use in this function)
>
> #ifdef MCS_VXWORKS
> #define MAXHOSTNAMELEN 64
> #endif
>
>
> 25. orte/mca/rmgr/base/rmgr_base_check_context.c:58: gethostname()
>
> #ifdef MCS_VXWORKS
> #include <hostLib.h>
> #endif
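The gethostname() and MAXHOSTNAMELEN fixes (items 12, 17, 24 and 25) recur in several files, so they could live in one small helper. This is a sketch only. compat_hostname is a hypothetical name, and the VxWorks branch assumes that unistd.h exists and that hostLib.h declares gethostname(), as the notes above suggest.

```c
#include <stddef.h>      /* size_t */
#include <unistd.h>      /* gethostname() on POSIX hosts */
#ifdef MCS_VXWORKS
#include <hostLib.h>     /* gethostname() on VxWorks, per the notes above */
#endif

#ifndef MAXHOSTNAMELEN
#define MAXHOSTNAMELEN 64   /* value used in the workaround above */
#endif

/* Hypothetical helper: copies the host name into buf and always
 * NUL-terminates it. Returns 0 on success, -1 on failure. */
static int compat_hostname(char *buf, size_t len)
{
    if (len == 0 || gethostname(buf, len) != 0) {
        return -1;
    }
    buf[len - 1] = '\0';    /* gethostname() may not terminate on truncation */
    return 0;
}
```

A caller would declare char name[MAXHOSTNAMELEN + 1] and pass sizeof name as the length.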
>
>
> 26. orte/mca/iof/proxy/iof_proxy.h:135: warning: array type has incomplete element type
> ../../../../../orte/mca/iof/proxy/iof_proxy.h:135: error: field `proxy_iov' has incomplete type
>
> #ifdef MCS_VXWORKS
> #include <net/uio.h>
> #endif
>
>
> 27. /orte/mca/iof/svc/iof_svc.h:147: warning: array type has incomplete element type
> ../../../../../orte/mca/iof/svc/iof_svc.h:147: error: field `svc_iov' has incomplete type
>
> #ifdef MCS_VXWORKS
> #include <net/uio.h>
> #endif
>
>
> 28. ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: warning: array type has incomplete element type
> ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: error: field `msg_iov' has incomplete type
> ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h: In function `mca_oob_tcp_msg_iov_alloc':
> ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:196: error: invalid application of `sizeof' to incomplete type `iovec'
>
>
>
> 29. ../../../../../orte/mca/oob/tcp/oob_tcp.c:344: error: implicit declaration of function `accept'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_create_listen':
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:383: error: implicit declaration of function `socket'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:399: error: implicit declaration of function `bind'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:407: error: implicit declaration of function `getsockname'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:415: error: implicit declaration of function `listen'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_listen_thread':
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:459: error: implicit declaration of function `bzero'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_recv_probe':
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:696: error: implicit declaration of function `send'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_recv_handler':
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:795: error: implicit declaration of function `recv'
> ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_init':
> ../../../../../orte/mca/oob/tcp/oob_tcp.c:1087: error: implicit declaration of function `usleep'
>
> This gets rid of most of them (except bzero and usleep):
>
> #ifdef MCS_VXWORKS
> #include <sockLib.h>
> #endif
>
>
> Trying to reconfigure the package so CFLAGS will not include -pedantic.
> This is because $WIND_HOME/vxworks-6.3/target/h/string.h has protos for
> bzero, but only when #if _EXTENSION_WRS is true. So turning off
> -ansi/-pedantic gets this? In my dreams?
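If changing the compiler flags does not expose bzero and usleep, both can be expressed with more widely available calls. This sketch assumes the target provides the POSIX nanosleep() in time.h, which the notes above do not confirm. The compat_ names are hypothetical.

```c
#include <string.h>   /* memset */
#include <time.h>     /* nanosleep, struct timespec (POSIX; assumed available) */

/* Zero n bytes starting at p, equivalent to bzero(p, n). */
static void compat_bzero(void *p, size_t n)
{
    memset(p, 0, n);
}

/* Sleep for usec microseconds. Returns 0 on success, -1 if interrupted
 * or on error, following nanosleep(). */
static int compat_usleep(unsigned long usec)
{
    struct timespec ts;
    ts.tv_sec = (time_t)(usec / 1000000UL);
    ts.tv_nsec = (long)(usec % 1000000UL) * 1000L;
    return nanosleep(&ts, NULL);
}
```

Both rely only on prototypes that strict -ansi/-pedantic builds normally still see on a POSIX host. Whether the VxWorks headers behave the same way would need checking.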
>
> On Mar 17, 2010, at 9:54 PM, Jing Zhang wrote:
>
> Hello all,
>
>
> I need a version of OpenMPI that runs on VxWorks so that I can add some
> real-time features to it for research. After going through the Open MPI
> website, however, I could not find any indication that it supports VxWorks.
>
>
> Following the thread posted by Ralph Castain,
> http://www.open-mpi.org/community/lists/users/2006/06/1371.php
> I read some papers about OpenRTE, such as "Creating a transparent,
> distributed, and resilient computing environment: the OpenRTE project" and
> "The Open Run-Time Environment (OpenRTE): A Transparent Multi-cluster
> Environment for High-Performance Computing", written by Ralph H. Castain,
> Jeffrey M. Squyres, and others.
>
>
> I now have a basic understanding of OpenRTE. However, there is very little
> documentation describing its implementation, so I do not know where or how
> to begin the migration. Any advice will be appreciated.
>
>
>
>
> Thanks
>
>
> Jing Zhang
>
> _______________________________________________
>
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>

-- 
Jing Zhang